Unlocking AI's 3D Narrative: Fei-Fei Li and Google Take the Lead
hughmini
Posted 6 days ago
The 3D track of AIGC has suddenly heated up.
On December 5th, Google DeepMind released its new-generation world model, Genie 2, which can "generate a one-minute playable 3D game world from a single image", prompting netizens to exclaim that "The Matrix is here".
Just two days earlier, World Labs, the company of "AI godmother" Fei-Fei Li, officially announced a "spatial intelligence" model that supports "generating a 3D world from one image".
This is another wave of discussion on world models after Sora. From text to images, and then to videos and interactive 3D worlds, AIGC has made significant leaps overall.
For industry, this gives strong support to creative design work and interactive-experience workflows. World models can provide endlessly diverse and controllable 3D environments for agent training, embodied intelligence training, complex animation production, game production, physics modeling, and other fields.
Some industry insiders also say that progress on world models brings the ultimate goal of AGI (artificial general intelligence) one step closer.
Google broadens the path toward AGI
Genie 2 is Google's second-generation world model. Given a single image, it generates a playable 3D environment that can be controlled with keyboard and mouse input.
The model identifies the character in the image and moves it appropriately in response to keyboard input.
The same starting frame can produce different motion trajectories.
Genie 2 keeps a consistent memory of the scene: parts of the world that move out of view are rendered without distortion when they come back into view.
Notably, Genie 2 can generate new scene content in real time as the view changes, for up to one minute.
The result closely resembles a video game.
"Games play a crucial role in the field of artificial intelligence research. Their captivating nature, unique combination of challenges, and measurable progress make them an ideal environment for safely testing and advancing AI capabilities," Google said. In fact, games have always been important to Google DeepMind and a key way for Google to train agents.
However, the industry has encountered bottlenecks in the training of embodied intelligence.
A sufficiently rich and diverse training environment is necessary for practical progress in embodied intelligence. Reporters from the 21st Century Business Herald learned from insiders in the humanoid robot industry that generalization ability is currently a major pain point for humanoid robots.
Genie 2 is expected to help embodied intelligence solve training bottlenecks.
In terms of interaction, Genie 2 can model object interactions such as popping balloons, opening doors, and shooting explosive barrels.
This makes it much simpler to create diverse interactive scenes. By using Genie 2 to rapidly prototype a variety of interactive experiences, researchers can quickly train and test embodied AI agents in new environments.
For example, different images generated by Imagen 3 can be used to prompt Genie 2 to model how a paper airplane, a dragon, an eagle, or a parachute flies, testing Genie's ability to control different objects.
That is to say, AI agents can obtain almost infinite training scenarios and interaction systems in the world model.
Although this research is still in its early stages, Google researchers believe that Genie 2 is an effective path toward solving the structural problem of training embodied agents safely, unlocking the next wave of capabilities in embodied intelligence, and achieving the breadth and generality required to move towards AGI.
Fei-Fei Li puts spatial intelligence into practice
World Labs is the first startup of renowned AI scholar and Chinese-American scientist Fei-Fei Li, established in January 2024. Within six months of its founding, the company's valuation had exceeded $1 billion.
It is a spatial intelligence company dedicated to building large-scale world models that can perceive, generate, and interact with the 3D world. The plan is to generate virtual 3D spaces in which users can manipulate variables, allowing people to "create their own 3D worlds". World Labs says its software will be useful to a range of practitioners, including artists, designers, developers, and engineers.
On December 3rd, World Labs unveiled version 1.0 of its work.
A 3D world can be generated from a single image, and users can essentially "step into" any image and explore in 3D.
The tool also comes with sliders for adjusting a simulated depth of field and a simulated dolly zoom. It supports adjusting the camera's position and field of view, changing object colors, creating spotlight effects, automatic dynamic effects, and other interactions, enriching the visual experience and giving users a stronger sense of control.
Like Genie 2, World Labs' spatial intelligence model keeps the 3D world consistent, so a scene persists once it has been generated. Users can control and move through the scene in real time and closely inspect its details.
The world model follows the basic physical rules of 3D geometry, combining realism and depth, effectively improving the controllability and consistency of content, and changing the way movies, games, simulators, and other digital representations of the physical world are made.
Jim Fan, Senior Research Scientist at NVIDIA, commented that "GenAI is creating increasingly high-dimensional snapshots of human experiences. Stable Diffusion is a 2D snapshot; Sora is a snapshot of 2D plus the time dimension; and World Labs is a 3D, fully immersive snapshot."
At present, World Labs has opened waiting list applications to the public, and some creators can already integrate this AI tool into their existing workflows.
In film and television production, AI's 3D narrative capability will greatly improve the efficiency and quality of content creation and reduce production costs. Creators can generate virtual scenes and characters more quickly, and use AI-generated 3D worlds to build richer and more diverse story settings, bringing audiences a brand new visual experience.
For example, using World Labs' technology to generate virtual shooting scenes before filming helps directors and cinematographers better plan shots and scene arrangements, improving shooting efficiency and accuracy.
For the gaming industry, 3D generation will open up more possibilities for game development. Developers can use AI to generate more realistic and detailed game scenes and characters, enhancing the immersion of the game.
In the field of education, 3D content generated by large models can create more vivid and intuitive teaching scenarios, enhancing the experience of subjects such as science and history.
Fei-Fei Li believes that "spatial intelligence" is a key part of the AI puzzle. She said in a TED talk in April this year, "Vision becomes insight; insight becomes understanding; understanding drives action. All of this generates intelligence."
The spatial intelligence field represented by Genie 2 and World Labs is an important new direction for the development of AI technology. It breaks through the limitations of traditional AI on a two-dimensional plane, expanding AI's perception and understanding capabilities to three-dimensional space, making it more intuitive and closer to the essence of interaction.
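CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views in this article are those of the author only; this article does not represent the position of CandyLake.com and does not constitute advice. Please treat it with caution.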