
This week, the headline news in the field of artificial intelligence is undoubtedly the product showdown between OpenAI and Google.
OpenAI has a habit of releasing its own products just ahead of competitors' major launches in order to grab the news spotlight, and this week was no exception.
OpenAI had set high public expectations in advance, and on Monday (May 13) the company announced an upgraded version of GPT-4 as scheduled, named GPT-4o (the "o" stands for "omni"). GPT-4o is designed to serve as a personal assistant on phones and tablets, with improved voice interaction, the ability to interpret and reason about photos taken by the device's camera, stronger language translation, and faster response times.
The technological innovation behind GPT-4o is impressive. The model is natively multimodal: it can accept any combination of text, audio, and images as input and generate any combination of them as output in real time. Compared with previous versions, it eliminates the intermediate step of converting the user's speech into text before processing it, which makes the whole pipeline faster.
GPT-4o also reduces the time the model needs to process a given number of tokens (in English text, one token corresponds to roughly three-quarters of a word), which makes it both faster and cheaper to run than OpenAI's previous best model, GPT-4 Turbo.
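As a rough illustration of the token-to-word relationship: OpenAI's own documentation puts the English-text ratio at about three-quarters of a word per token (so 100 tokens ≈ 75 words), though the exact figure varies with the tokenizer and the text. A back-of-the-envelope sketch using that heuristic:

```python
# Rough English-text heuristic from OpenAI's docs: 1 token ≈ 0.75 words.
# The real ratio depends on the tokenizer and the content; this is only an estimate.
WORDS_PER_TOKEN = 0.75

def estimate_words_from_tokens(num_tokens: int) -> float:
    """Estimate how many English words a given token count covers."""
    return num_tokens * WORDS_PER_TOKEN

def estimate_tokens_from_words(num_words: int) -> float:
    """Inverse estimate: roughly how many tokens a word count requires."""
    return num_words / WORDS_PER_TOKEN

print(estimate_words_from_tokens(100))   # → 75.0
print(estimate_tokens_from_words(1500))  # → 2000.0 (a 1,500-word article ≈ 2,000 tokens)
```

Estimates like this are why a larger, cheaper token budget translates directly into longer documents processed per request.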
On Tuesday (May 14), Google answered with a series of big moves aimed squarely at OpenAI.
At its I/O developer conference, Google announced a slate of new artificial intelligence features and upcoming products, including broad upgrades to the Gemini models, the future AI assistant Astra, generative AI in Google Search, and a range of generative AI tools for images, music, and video.
Google announced improvements to the Gemini 1.5 Pro model, expanding its context window from 1 million tokens to 2 million and giving it a more natural voice, better understanding of audio and images, stronger logical reasoning and planning, and better computer code generation.
Google also unveiled Project Astra, an advanced vision-and-dialogue agent built to handle multimodal input such as audio and video. Compared with OpenAI's GPT-4o, which handles static images, Astra can also process video: in a demonstration video it answered prompts such as "what can make sound" and "where are you now" from a live camera feed, though its responses showed noticeable lag. Future versions of Google's AI personal assistant are reportedly being developed on top of Astra.
The AI Assistant's "Spotlight Moment"
The product releases from OpenAI and Google show how much weight technology companies are putting behind AI assistants: the title of "first killer AI application" has become fiercely contested ground among Silicon Valley companies.
Judging from this week's releases, OpenAI's and Google's assistants each have their own strengths: GPT-4o can receive and generate speech directly, skipping the speech-to-text conversion step, while Astra can handle dynamic input such as video, a significant advantage.
The two releases clearly put Silicon Valley's other two giants, Apple and Amazon, at a disadvantage. They will need to upgrade their voice assistants, Siri and Alexa, to match these new competitors, or those products will be in trouble. Based on what is currently known, Amazon's investment in Anthropic gives it access to the powerful Claude model, and Apple has reportedly been in talks with OpenAI to license its technology in the short term.
But will these new AI assistants become the future "killer applications" of artificial intelligence? That question remains open and depends entirely on what happens next.
Judging from current use cases, AI assistants are far from indispensable in daily life: apart from translation, there is little they can reliably do to complete people's work for them.
Analysts suggest this may change once assistants take on more "agent"-like attributes. If one day they can truly understand an individual's preferences, complete tasks accordingly, and handle everyday chores (online shopping, filling out insurance forms, booking holidays, and so on), AI assistants could well become that killer application.
Google says it is developing such products but has given no release timeline; OpenAI keeps hinting that exciting announcements are on the way; and next week Microsoft will hold its Build developer conference.