Google Counters OpenAI's Surprise Launch: Generative AI Search and a "Family Bucket" of Large Models Released
寒香小凡瓤
Posted on 2024-5-15 11:15:39
The day after OpenAI's spring launch event, Google answered with its annual I/O developer conference.
The rivalry was palpable from the moment the event began at 1 a.m. Beijing time on May 15. Google chose to "announce everything" at the conference, releasing or updating more than ten products in succession, including the AI assistant Astra, the text-to-image model Imagen 3, the text-to-video model Veo (positioned against Sora), and the highly anticipated flagship model Gemini.
While OpenAI did not unveil a search engine and instead launched its latest flagship model, GPT-4o, Google, which has long dominated search, not only redesigned search around AI but also launched an AI image recognition assistant at the same time.
Gemini's new voice conversation feature, Live, is positioned directly against OpenAI's GPT-4o. Users can ask about their surroundings in real time through the phone, and the assistant can pick the conversation back up promptly even if it is interrupted.
In addition, Gemini Nano will be added to Google Chrome. Nano is the lightweight version of the Gemini series, designed mainly for mobile devices.
Google also said that Gemma 2.0, another small model, will launch this summer, and that the Gemma family includes the open-source model PaliGemma, which can tag photos and caption images. The Gemma models use the same technology stack as the Gemini models but are smaller, which makes them suitable for deployment in resource-constrained environments.
To a large extent, the AI race is also a race for the smartphone. Sameer Samat, Google's Vice President of Product Management, stated clearly that Google will further optimize the Android operating system with Gemini. These optimizations will appear first on Pixel, Google's own phone.
Gemini was clearly the star of the conference, with multimodality and long context receiving particular emphasis.
Over the past few months, Google has launched Gemini 1.5 Pro, which offers long context in preview and brings a series of improvements in translation, coding, and reasoning. The context length of Gemini 1.5 Pro has now grown from 1 million tokens (the basic unit of text processing) to 2 million tokens, a doubling in three months that shows how eager the company is to flex its muscles.
Gemini has now been around for a year, and this multimodal large model can reason across text, images, video, code, and more. According to Google, 2 billion users and more than 1.5 million developers are using Gemini models, which can be used to debug code, gain new insights, and build next-generation AI applications.
To further demonstrate the model's capabilities, Google gave more detailed walkthroughs for different scenarios such as Search, Photos, and Android.
In search, for example, Gemini has brought a comprehensive AI overhaul. Users can ask new, longer, and more complex questions, and can even search with photos. Google plans to launch "AI Overviews" in search in the United States starting this week, with other countries to follow.
Google demonstrated the "Ask Photos" feature on stage. A user paying at a parking lot who has forgotten their license plate number would normally search their phone's photos by keyword and scroll through a large number of old pictures to find the plate. Now they can simply ask Photos, which identifies the cars that appear most often, works out which one belongs to the user, and returns the license plate number.
Users can also ask Photos when their child learned to swim, or even have it show how the child's swimming has progressed.
Gemini is not only a chatbot but also a personal assistant that can help users handle complex tasks and take action. Gemini 1.5 Pro has also been brought into Google Workspace, Google's cloud productivity suite. Google says Gemini can carry out all the steps a task requires. Taking a product return as an example, the AI can search emails for the receipt, find the corresponding order number, fill out the return form automatically, and arrange a pickup.
Large models are a contest of computing power, and training the most advanced models requires a great deal of it. Over the past six years, the industry's demand for machine learning compute has grown by a factor of one million, or roughly tenfold every year. As a major player in the AI era, Google has also invested heavily in infrastructure.
That night, Google released Trillium, its sixth-generation TPU (an application-specific integrated circuit designed by Google to accelerate machine learning workloads), and called it its most performant and most efficient TPU to date. Compared with the previous-generation TPU v5e, compute performance per chip is 4.7 times higher. Trillium is planned to be available to customers by the end of this year.
Gemini is trained and served entirely on Google's in-house fourth- and fifth-generation TPUs, and other leading AI companies, including Anthropic, have also trained their models on TPUs.
But as Google builds AI features into its products, users will have to make more concessions on the privacy of their personal data. Google has promised not to use users' files on its platform to train Gemini or other AI models.
Google CEO Pichai said that the 121 mentions of "AI" at the event that day were enough to show how much AI matters to Google. Beyond underscoring that importance, however, this anticipated counterattack against OpenAI delivered no major surprises.
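CandyLake.com is an information publishing platform and provides only information storage services.
Disclaimer: The views in this article are solely those of the author and do not represent the position of CandyLake.com. This article does not constitute advice; please treat it with caution.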