Google shows off Lumiere, a space-time diffusion model for realistic AI videos

Jan 25, 2024

Posted by Genevieve Klien in category: robotics/AI

Lumiere, for its part, addresses this gap by using a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, in a single pass through the model, leading to more realistic and coherent motion.

“By deploying both spatial and (importantly) temporal down- and up-sampling and leveraging a pre-trained text-to-image diffusion model, our model learns to directly generate a full-frame-rate, low-resolution video by processing it in multiple space-time scales,” the researchers noted in the paper.
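To make the idea of joint spatial and temporal down- and up-sampling concrete, here is a minimal toy sketch in plain Python. It is not Lumiere's implementation: the actual model uses learned layers inside a U-Net, whereas this example substitutes fixed average pooling and nearest-neighbour repetition. The function names (make_video, downsample, upsample) and the 8x8 frame size are hypothetical choices made for illustration; only the 80-frame clip length comes from the article.

```python
from itertools import product

def make_video(frames, height, width):
    """Build a toy single-channel video as nested lists indexed [t][y][x]."""
    return [[[float(t + y + x) for x in range(width)]
             for y in range(height)]
            for t in range(frames)]

def shape(video):
    """Return (frames, height, width) of a nested-list video."""
    return (len(video), len(video[0]), len(video[0][0]))

def downsample(video, factor=2):
    """Average-pool over time, height and width by the same factor.

    Assumption: fixed pooling stands in for the model's learned layers.
    """
    t_len, h_len, w_len = shape(video)
    offsets = list(product(range(factor), repeat=3))
    return [[[sum(video[t + dt][y + dy][x + dx] for dt, dy, dx in offsets)
              / len(offsets)
              for x in range(0, w_len - factor + 1, factor)]
             for y in range(0, h_len - factor + 1, factor)]
            for t in range(0, t_len - factor + 1, factor)]

def upsample(video, factor=2):
    """Nearest-neighbour up-sampling: repeat each value along all three axes."""
    t_len, h_len, w_len = shape(video)
    return [[[video[t // factor][y // factor][x // factor]
              for x in range(w_len * factor)]
             for y in range(h_len * factor)]
            for t in range(t_len * factor)]

# 80 frames matches the clip length quoted in the article; 8x8 is arbitrary.
video = make_video(frames=80, height=8, width=8)
coarse = downsample(video)    # one coarser space-time scale
restored = upsample(coarse)   # back to the full frame count and resolution
print(shape(video), shape(coarse), shape(restored))
# prints (80, 8, 8) (40, 4, 4) (80, 8, 8)
```

The point of the sketch is only that the time axis is compressed and expanded together with the two spatial axes, so the whole clip is handled as one volume rather than frame by frame.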

The video model was trained on a dataset of 30 million videos, along with their text captions, and can generate 80 frames at 16 fps, or five seconds of video per clip. The source of this data, however, remains unclear at this stage.
