Just one week after OpenAI released Sora, Google DeepMind has released Veo 2, a video-generation model pushing hard at the current boundaries of AI-powered video creation.

The model is novel in a number of ways: it's designed to generate high-quality videos at up to 4K resolution that can exceed a minute in length, capturing a wide range of cinematic and visual styles.

Key features & example:

  • Creates realistic video at high resolution [up to 4K]
  • Understands a variety of camera shots [drone, wide, close-up, etc.]
  • Better recreates real-world physics & human expression

Prompt: A low-angle shot captures a flock of pink flamingos gracefully wading in a lush, tranquil lagoon. The vibrant pink of their plumage contrasts beautifully with the verdant green of the surrounding vegetation and the crystal-clear turquoise water. Sunlight glints off the water’s surface, creating shimmering reflections that dance on the flamingos’ feathers. The birds’ elegant, curved necks are submerged as they walk through the shallow water, their movements creating gentle ripples that spread across the lagoon. The composition emphasizes the serenity and natural beauty of the scene, highlighting the delicate balance of the ecosystem and the inherent grace of these magnificent birds. The soft, diffused light of early morning bathes the entire scene in a warm, ethereal glow.

[You can explore more prompts & video-generation examples in the official DeepMind release.]

Veo 2 vs Sora; DeepMind vs OpenAI

Veo 2 and OpenAI's Sora are both groundbreaking AI video generation models, each with its own strengths.

While Sora excels in creative storytelling and imaginative scenarios, Veo 2 prioritizes realism and adherence to real-world physics. Veo 2 also offers a higher degree of control over the video generation process, allowing users to specify camera angles, lighting, and other cinematic elements.

In Google's own head-to-head tests, human raters compared 720p, eight-second clips generated from 1,003 prompts in Meta's MovieGenBench dataset. According to Google, the raters preferred Veo 2's output over that of competitors such as OpenAI's Sora Turbo.

Limitations

While Veo 2 has made significant strides, Google acknowledges the ongoing challenges in consistently generating realistic and dynamic videos, especially in complex scenes and motion sequences.

To mitigate potential misuse and ensure transparency, Google is limiting Veo 2's initial rollout to select products like VideoFX, YouTube, and Vertex AI. In 2025, the model's reach will expand to platforms like YouTube Shorts. Every video the model generates will be marked with an invisible SynthID watermark.

Other releases

DeepMind also unveiled an enhanced Imagen 3 model, delivering brighter, better-composed images with richer details and textures. This model also excels in rendering diverse art styles with greater accuracy. It is currently being rolled out globally to ImageFX.

Additionally, Google Labs has introduced a new "Whisk" experiment that leverages the updated Imagen 3 and Gemini's visual understanding capabilities. This experiment allows users to prompt with images, showcasing the advancements in AI-powered image generation.