OpenAI's Chief Technology Officer reveals: Sora is expected to launch this year, with plans to add audio, plus other key takeaways

OpenAI appears to be warming up for Sora's public launch.
On Wednesday, March 13, local time, Mira Murati, Chief Technology Officer of OpenAI, gave a video interview to The Wall Street Journal.
Murati said in the interview that generating videos with Sora is currently very expensive, that the team is optimizing the technology, and that an official release is expected this year.
She said that Sora, OpenAI's text-to-video model, will be released to the public later this year. OpenAI plans to eventually add audio to make scenes more realistic, and will also let users edit the video content Sora generates.
Murati also answered several questions about what makes Sora distinctive, how its flaws will be corrected, whether audio will be included, and where the training data comes from. Some of her answers were vague, others candid.
How does Sora turn words into magic? Picture this scene: "A mermaid and her crab companion are browsing a smartphone together..."
As a perk of the interview, the host was able to have Sora turn several of her own text prompts into video, and the scene above is a frame from one of the videos Sora produced.

Video screenshot

How did Sora achieve this transformation? Murati said that although explaining the evolution of mermaids might be much easier than explaining the inner workings of diffusion models, in short, the AI model analyzes a large number of videos and learns to recognize objects and actions. Then, when given a text prompt, it sketches out the whole scene and fills in each frame.
When asked what training data OpenAI used for Sora, Murati said, "We used publicly available data and licensed data."
For another video, the host asked Sora to create something better suited to the interview: "Two professional women in their 30s with brown hair sit down for a news interview in a well-lit studio."

Video screenshot

In the result Sora turned in, everything looked strikingly real, from the two women's mouth shapes and hair movement to the details on a leather jacket. Murati noted that Sora took a few minutes to produce this 20-second, 720p clip, but that it cannot yet generate sound.
Murati said, however, that the team plans to add audio eventually.
Murati also said that generating a video with Sora currently costs much more than generating an image with DALL-E, the company's image generator. Before the public release, OpenAI will optimize the technology to reduce its demand for computing power.
On February 16, Beijing time, OpenAI unveiled Sora, its text-to-video model, which stunned viewers and caused a worldwide sensation. Only 14 months had passed since OpenAI launched ChatGPT and ushered in the era of generative AI, and the pace of AI's evolution is astonishing.
In one video generated by Sora, a woman in a black leather jacket and a red dress walks down a neon-lit street. The subject stays coherent and stable, and the video includes multiple shots, cutting slowly from the street scene to close-ups of the woman's facial expressions, as well as the reflection of neon lights on the wet pavement.
According to OpenAI's research on Sora, scaling up video generation models is a highly promising path toward building general-purpose simulators of the physical world, taking artificial intelligence to a new level of understanding and simulating the physical world in motion.
Industry insiders predict that artificial general intelligence (AGI) will arrive earlier than expected and that the gap between industry players will widen. The disruptive impact of text-to-video generation has also raised public concern, and some far-sighted observers have repeatedly warned about the blurring boundary between the real and the virtual. Less controversial is the view that Sora has the potential to accelerate the real-world deployment of AI applications.
At the same time, Sora's debut has raised expectations for the future development of AI. As the technology continues to advance, AI will play a larger role in more fields. Whether in industrial production, education and training, or entertainment and leisure, AI is expected to bring more surprises and possibilities.
On February 16, 360 founder Zhou Hongyi shared his views on Sora in a Weibo post. Zhou believes that Sora's arrival means the timeline for achieving AGI may shrink from 10 years to one or two.
On Sora's biggest advantage, Zhou said that earlier text-to-video software operated on graphic elements in a 2D plane, which amounts to stitching together multiple real images without truly grasping knowledge of the world. In the videos Sora produces, by contrast, the model understands, as a human would, that a tank has tremendous impact force: a tank can crush a car, but a car will never crush a tank. "This time, OpenAI uses its advantage in large language models to enable Sora to achieve two levels: understanding the real world and simulating the world. Only in this way can the generated videos be realistic and simulate the real physical world beyond the limits of 2D."
Zhou said that with large-model technology as a foundation and human knowledge as a guide, powerful tools can be created in many fields. Large models will play a role, for example, in biomedicine and in protein and gene research, as well as in physics, chemistry, and mathematics.
"Once artificial intelligence is connected to cameras and has watched all the movies and all the videos on YouTube and TikTok, its understanding of the world will far exceed what it can learn from text. A picture is worth a thousand words, and this is really not far from AGI. It is not a matter of 10 or 20 years; it may be achieved within one or two years," Zhou remarked.
Source: Daily Economic News, compiled from OpenAI's official website, Daily Economic News, and public information
