Outline
- Section 1 compares videos generated by $\texttt{T2V-Turbo}\text{ (VC2)}$ and $\text{VCM (VC2)} + \mathcal{R}_\text{img}$.
- Section 2 presents videos corresponding to Figure 1 of the paper.
- Section 3 presents videos corresponding to Figure 4 of the paper.
- Section 4 presents videos corresponding to Figure 6 of the paper: ablation study on the choice of $\mathcal{R}_\text{img}$.
- Section 5 presents videos corresponding to Figures 7 & 8 of the paper: qualitative comparison results for our $\texttt{T2V-Turbo}\text{ (VC2)}$.
- Section 6 presents videos corresponding to Figures 9 & 10 of the paper: qualitative comparison results for our $\texttt{T2V-Turbo}\text{ (MS)}$.
1. Comparing videos generated by $\texttt{T2V-Turbo}$ and $\text{VCM} + \mathcal{R}_\text{img}$
Prompt: A panda standing on a surfboard in the ocean in sunset.
$\qquad\qquad\quad\texttt{T2V-Turbo}\text{ (VC2)}$
$\qquad\qquad\quad\text{VCM}\text{ (VC2) } + \,\mathcal{R}_\text{img}$
Analysis: Left: the panda is indeed standing on the surfboard. Right: the panda is sitting on the surfboard.
Prompt: A raccoon is playing the electronic guitar
$\qquad\qquad\quad\texttt{T2V-Turbo}\text{ (VC2)}$
$\qquad\qquad\quad\text{VCM}\text{ (VC2) } + \,\mathcal{R}_\text{img}$
Analysis: The right video shows a plausible raccoon but fails to depict it playing the electronic guitar.
Prompt: A motorcycle accelerating to gain speed
$\qquad\qquad\quad\texttt{T2V-Turbo}\text{ (VC2)}$
$\qquad\qquad\quad\text{VCM}\text{ (VC2) } + \,\mathcal{R}_\text{img}$
Analysis: The motorcycle on the right is actually moving backward.
Prompt: A squirrel eating a burger
$\qquad\qquad\quad\texttt{T2V-Turbo}\text{ (VC2)}$
$\qquad\qquad\quad\text{VCM}\text{ (VC2) } + \,\mathcal{R}_\text{img}$
Analysis: Unlike in the left video, the squirrel in the right video appears to merely hold the burger, with no eating motion.