Google's new move to challenge OpenAI: a major generative AI update launches the Veo 2 video model and the latest version of Imagen 3

Google DeepMind, the flagship AI research laboratory of Google (GOOGL, stock price $196.66, market value $2,407.3 billion), significantly upgraded its AI-driven content generation tools on Monday, launching the Veo 2 video generation model and an enhanced version of the Imagen 3 image model in a challenge to OpenAI's leading position in AI image and video generation. Google stated that these updates are expected to transform creative workflows, giving video and image creators greater realism and more customization.
According to Google, Veo 2 is a video generation tool that can produce high-quality videos across a wide range of subjects and styles. Google stated in its blog that the model excels at realism, capturing details such as human expressions and cinematic effects. Its improved understanding of physics and cinematography lets users generate striking content, including tracking shots and wide-angle compositions.
For example, Veo 2 understands the language of cinematography: users can request a particular style, specify a lens, and suggest cinematic effects. Veo 2 can produce videos at up to 4K resolution and several minutes in length. It is worth noting that this resolution is 4 times that of the OpenAI Sora model, and the video duration is more than 6 times longer.
However, these advantages are still theoretical at present. In Google's experimental video creation tool VideoFX, videos generated by Veo 2 are limited to 720p resolution and 8 seconds in length. (In contrast, Sora's maximum output is a 20-second clip at 1080p.)
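The resolution and duration comparisons above can be checked with a little arithmetic. The sketch below assumes the usual 16:9 pixel dimensions for each label (3840x2160 for 4K, 1920x1080 for 1080p, 1280x720 for 720p); the article itself gives only the labels, so these dimensions are an assumption, and "resolution" is taken to mean total pixel count.

```python
# Assumed 16:9 frame sizes for each resolution label (not stated in the article).
RESOLUTIONS = {
    "4K": (3840, 2160),
    "1080p": (1920, 1080),
    "720p": (1280, 720),
}


def pixel_count(label: str) -> int:
    """Total pixels per frame for a resolution label."""
    width, height = RESOLUTIONS[label]
    return width * height


def pixel_ratio(label_a: str, label_b: str) -> float:
    """How many times more pixels label_a has than label_b."""
    return pixel_count(label_a) / pixel_count(label_b)


if __name__ == "__main__":
    # Claimed ceiling: Veo 2 at 4K versus Sora at 1080p.
    print("4K vs 1080p pixel ratio:", pixel_ratio("4K", "1080p"))
    # Current VideoFX limits: 720p and 8 seconds versus Sora's 1080p and 20 seconds.
    print("720p vs 1080p pixel ratio:", pixel_ratio("720p", "1080p"))
    print("8 s vs 20 s duration ratio:", 8 / 20)
```

Under these assumptions, the 4K ceiling has four times the pixels of 1080p, while the 720p output currently available in VideoFX has fewer than half the pixels of Sora's 1080p and less than half its duration.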
Google stated that although video generation models often "hallucinate" unwanted details such as extra fingers or unexpected objects, Veo 2 produces such errors less often and therefore looks more realistic. In addition, videos generated by Veo 2 carry an invisible SynthID watermark that marks them as AI-generated content, reducing the risk of misuse or misattribution.
DeepMind's Vice President of Product, Eli Collins, told the media that as the model gradually becomes ready for large-scale use, Google will provide Veo 2 through its Vertex AI developer platform.
Developers and creators can currently access the tool through Google Labs, and it is expected to be widely integrated into platforms such as YouTube Shorts by 2025.
Meanwhile, the Imagen 3 model has been enhanced in terms of image composition and detail accuracy, supporting various styles from realistic to abstract, generating richer textures, and responding more faithfully to user prompts.
Currently, Imagen 3 has been launched in over 100 countries through Google Labs' ImageFX tool, allowing global users to experiment with its cutting-edge features.
In addition, Google has launched Whisk, a creative tool that combines Imagen 3 with the visual analysis capabilities of Gemini. Users can input images, generate detailed text descriptions, remix styles, or design personalized works such as digital dolls or enamel badges.
According to Google, Whisk combines the Imagen 3 model with Gemini's visual understanding and descriptive capabilities. The Gemini model automatically generates a detailed text description of each image the user supplies and passes these descriptions to Imagen 3. This process allows users to remix themes, scenes, and styles in interesting new ways.
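The describe-then-generate flow described above can be sketched as a two-stage pipeline. This is only an illustration of the data flow, not Google's implementation: the function `remix` and its `describe` and `generate` parameters are hypothetical placeholders supplied by the caller, and no real Gemini or Imagen API is called.

```python
from typing import Callable, Sequence


def remix(
    images: Sequence[bytes],
    describe: Callable[[bytes], str],
    generate: Callable[[str], bytes],
) -> bytes:
    """Hypothetical two-stage flow: caption each input image, then generate from the captions.

    describe stands in for a vision-language model that writes a description;
    generate stands in for a text-to-image model. Both are caller-supplied stubs.
    """
    # Stage 1: turn every input image into a text description.
    descriptions = [describe(image) for image in images]
    # Stage 2: combine the descriptions into one prompt and generate a new image.
    prompt = " ".join(descriptions)
    return generate(prompt)


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any external service.
    fake_describe = lambda image: f"an image of {image.decode()}"
    fake_generate = lambda prompt: prompt.encode()
    print(remix([b"a cat", b"a watercolor style"], fake_describe, fake_generate))
```

The point of the design is that the image model only ever sees text, so subject, scene, and style inputs can be mixed freely at the prompt stage.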
On December 10 Beijing time, Google announced its new quantum chip, Willow. The chip cracked a key challenge that the field of quantum computing has worked on for the past 30 years, and in just 5 minutes it completed a task that would take today's fastest computers 10²⁵ years. The research results were published in the journal Nature on December 9.
The news was cheered by the quantum information industry and also sent shockwaves through the AI community.
Willow's major breakthroughs fall into two areas. The first is a large increase in performance, that is, computing power: 5 minutes of computation on Willow is equivalent to a task that would take today's fastest computer 10²⁵ years. 10²⁵ years far exceeds the age of the universe (about 13 billion years). Comparing 5 minutes with 10²⁵ years shows how dramatic the leap in computing speed is.
The second is powerful quantum error correction. Willow's significant progress here is that, on a scalable square grid, the error rate falls rapidly as the number of qubits grows (the chip currently has 105 physical qubits). The encoded grid expands from 3x3 to 5x5 and then to 7x7, with each expansion halving the error rate. Moreover, Willow can perform error correction in real time, which makes it possible to scale to much larger qubit counts (such as 1,050) in a short period of time.
Of these two breakthroughs, the error correction capability has attracted more attention from scientists than the performance gain.
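Both breakthroughs can be put into rough numbers. The sketch below is a back-of-the-envelope calculation, not anything from Google's paper: it takes the classical estimate as 10²⁵ years (the figure from Google's announcement), uses the article's figure of about 13 billion years for the age of the universe, and models error correction as an exact halving of the error rate for each grid step (3x3 to 5x5 to 7x7). The starting error rate `base_rate` is a made-up placeholder, not a measured value.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # Julian year expressed in minutes


def speedup_factor(classical_years: float, quantum_minutes: float) -> float:
    """How many times longer the classical run takes than the quantum run."""
    return classical_years * MINUTES_PER_YEAR / quantum_minutes


def universe_ages(years: float, universe_age_years: float = 1.3e10) -> float:
    """Express a duration in multiples of the age of the universe."""
    return years / universe_age_years


def logical_error_rate(grid_size: int, base_rate: float, suppression: float = 2.0) -> float:
    """Error rate for an odd grid_size x grid_size grid, divided by `suppression` per step from 3x3."""
    if grid_size < 3 or grid_size % 2 == 0:
        raise ValueError("grid_size must be an odd integer >= 3")
    steps = (grid_size - 3) // 2  # 3x3 -> 0 steps, 5x5 -> 1, 7x7 -> 2
    return base_rate / suppression**steps


if __name__ == "__main__":
    print(f"speedup factor: {speedup_factor(1e25, 5):.2e}")
    print(f"universe ages:  {universe_ages(1e25):.2e}")
    for size in (3, 5, 7):
        # base_rate=0.008 is a hypothetical starting point for illustration only.
        print(f"{size}x{size} grid: error rate {logical_error_rate(size, base_rate=0.008)}")
```

The halving model also shows why error correction draws the most attention: if the same factor held at every step, each further grid expansion would keep dividing the error rate, so larger chips would become more reliable rather than less.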
Quantum chips are the core of quantum computers. Willow was developed by the Google Quantum AI laboratory, led by Hartmut Neven. Neven stated that Willow is a big step towards a large-scale, self-correcting quantum computer, and that its error correction capabilities and beyond-classical computing power bring us closer to a system that can support commercial applications, from helping discover new drugs, to designing more efficient electric vehicle batteries, to accelerating progress in nuclear fusion and new energy alternatives.
Compiled by Daily Economic News from Google and public information
Disclaimer: The content and data in this article are for reference only and do not constitute investment advice. Please verify them before use. Anyone who acts on this information does so at their own risk.
