
Facing the unprecedented rise of OpenAI, Google has launched a determined counterattack.
On December 6th local time, Google announced the launch of its largest and most capable large language model yet, Gemini, alongside its most powerful TPU (Tensor Processing Unit) system, "Cloud TPU v5p", and an AI supercomputer offering from Google Cloud. The v5p is an updated version of Cloud TPU v5e, which became generally available earlier this year, and Google promises it is significantly faster than the v4 TPU.
Notably, in the MMLU (Massive Multitask Language Understanding) benchmark, Gemini Ultra became the first model to surpass human experts, with a score of 90.0%.
Gemini's Capabilities
According to Interface News on December 7th, Gemini 1.0 is the true GPT-4 competitor that Google has spent a year preparing. It is also the most powerful and flexible large model Google currently offers, and comes in three versions: Gemini Ultra, Gemini Pro, and Gemini Nano.
Ultra is the most capable and handles the most complex multimodal tasks; Pro is somewhat less capable but scales across a wide range of tasks; Nano is efficient enough to run on mobile devices. This gives Gemini broad reach, extending from data centers all the way down to handsets.
Trained on massive amounts of data, Gemini can recognize and understand text, images, audio, and other content, and can answer questions on complex topics. This makes it strong at reasoning tasks in demanding disciplines such as mathematics and physics.
Gemini can generate and understand code in mainstream languages such as Python, Java, C++, and Go. Gemini Ultra performed well in multiple coding benchmarks, including HumanEval, an important industry standard for evaluating performance on coding tasks.
Building on Gemini, Google has also developed a specialized coding model, AlphaCode 2, whose performance improves on the previous generation by at least 50%.
Gemini's multimodal capabilities give it strength in visual understanding, text generation, and related areas — for example, extracting the key points from a novel hundreds of thousands of words long, or finding the most valuable content in a 200-page financial report. This is of great help to researchers and business professionals in finance, technology, and healthcare.
In a publicly released demonstration video, Sundar Pichai showed Gemini's remarkable recognition abilities for video and images. In the video, Gemini transitions seamlessly between image, audio, and video modalities, showing striking potential for new application scenarios and product forms.
Google Demo Video

Judging solely from the demonstration video Google released, every multimodal large model currently on the market appears to be a generation behind Gemini, including Meta's open-source six-modality AI model ImageBind and GPT-4 as of May.
Google's Counterattack

A year ago, when the AI lab OpenAI released the chatbot ChatGPT, Google — which created much of the foundational technology behind the current AI boom — was caught off guard and declared an internal "code red". A year and a week later, Google seems ready to strike back.
According to The Paper, Demis Hassabis, CEO of Google DeepMind and representative of the Gemini team, directly addressed the comparison between GPT-4 and Gemini at a press conference: "We conducted a very thorough analysis of the system and ran benchmark tests. Google ran 32 comprehensive benchmarks comparing the two models, from broad overall tests (such as the multitask language understanding benchmark) to comparing the two models' ability to generate Python code." Hassabis added with a slight smile, "I think we are significantly ahead on 30 of the 32 benchmarks."
As of launch, Gemini powers Bard and the Pixel 8 Pro smartphone, and will soon be integrated into other Google products and services, including Chrome, Search, and advertising.
Google also plans to license Gemini to customers through Google Cloud for use in their own applications. Starting December 13th, developers and enterprise customers can access Gemini Pro through the Gemini API (Application Programming Interface) in Google AI Studio or Google Cloud Vertex AI, and Android developers can build with Gemini Nano.
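As a rough illustration of the API access described above, here is a minimal Python sketch. It assumes the `google-generativeai` SDK, a `GOOGLE_API_KEY` environment variable, and a request-body shape taken from the public v1beta REST API — treat the names as assumptions, not a definitive integration guide.

```python
# Model identifier for the hosted Gemini Pro endpoint (name per the launch announcement).
GEMINI_PRO = "gemini-pro"

def make_generate_payload(text: str) -> dict:
    """Build the JSON body the generateContent REST endpoint expects for a
    plain-text prompt (shape assumed from the v1beta API)."""
    return {"contents": [{"parts": [{"text": text}]}]}

# Sketch of an actual call via the Python SDK (requires
# `pip install google-generativeai` and an API key from Google AI Studio):
#
#   import os
#   import google.generativeai as genai
#   genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
#   model = genai.GenerativeModel(GEMINI_PRO)
#   print(model.generate_content("Summarize MMLU in one sentence.").text)
```

The same payload can be POSTed directly to the REST endpoint with an API key, which is why the body-building helper is kept separate from the SDK call.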
Gemini Ultra is reportedly the first model to surpass human experts on MMLU (Massive Multitask Language Understanding), which combines 57 subjects — including mathematics, physics, history, law, medicine, and ethics — to test world knowledge and problem-solving ability. Google stated in a blog post that the model can understand nuance and reasoning in complex topics.
According to CNBC, Google executives said at a press conference that Gemini Pro outperforms GPT-3.5, but sidestepped the question of how it compares to GPT-4. Asked whether Google plans to charge for access to Bard Advanced, Bard General Manager Sissie Hsiao said Google is focused on creating a good experience and has no monetization details to share yet.
Google's Strongest TPU and AI Supercomputers
Alongside the new model comes a new version of the TPU chip, TPU v5p, aimed at cutting the time needed to train large language models. The TPU is a chip Google designed specifically for neural networks, optimized to accelerate the training and inference of machine learning models; Google launched the first generation in 2016.
According to Google, TPU v5p delivers twice the floating-point performance of TPU v4 and three times the high-bandwidth memory. Using Google's 600 GB/s inter-chip interconnect, 8,960 v5p accelerators can be coupled into a single pod (a cluster of many chips) to train models faster or more accurately. For reference, that pod is 35 times larger than a TPU v5e pod and more than twice the size of a TPU v4 pod.
Google claims TPU v5p is its most powerful TPU to date, delivering 459 teraFLOPS of bfloat16 (16-bit floating-point) performance — 459 trillion floating-point operations per second — or 918 teraOPS of Int8 (8-bit integer) performance, with 95 GB of high-bandwidth memory capable of transferring data at 2.76 TB/s.
All of this, Google says, means TPU v5p can train large language models faster than TPU v4: for example, it trains a GPT-3-scale model (175 billion parameters) 2.8 times faster.
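A quick back-of-envelope check of the figures above — a sketch using only the per-chip numbers quoted in the announcement, plus assumed pod sizes for v5e (256 chips) and v4 (4,096 chips):

```python
# Quoted TPU v5p per-chip peaks and pod size (from the announcement above).
BF16_TFLOPS_PER_CHIP = 459   # bfloat16, teraFLOPS
INT8_TOPS_PER_CHIP = 918     # int8, teraOPS
CHIPS_PER_POD = 8960

# Int8 throughput is exactly double bf16: 8-bit operands pack twice as
# densely into the same multiply units.
assert INT8_TOPS_PER_CHIP == 2 * BF16_TFLOPS_PER_CHIP

# Pod-size comparison behind the "35x v5e, more than 2x v4" claim
# (assumed pod sizes: v5e = 256 chips, v4 = 4096 chips).
assert CHIPS_PER_POD // 256 == 35
assert CHIPS_PER_POD / 4096 > 2

# Aggregate peak bf16 compute of one full pod, converted to exaFLOPS.
pod_exaflops = BF16_TFLOPS_PER_CHIP * CHIPS_PER_POD / 1_000_000
print(f"~{pod_exaflops:.2f} bf16 exaFLOPS per pod")  # ~4.11
```

These are peak dense rates; sustained training throughput depends on model, precision, and interconnect utilization.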
Beyond the new hardware, Google also introduced the concept of an "AI supercomputer". Google Cloud describes it as a supercomputing architecture: an integrated system combining open software, performance-optimized hardware, machine learning frameworks, and flexible consumption models.
Mark Lohmeyer, Vice President of Compute and Machine Learning Infrastructure at Google Cloud, explained in a blog post: "Traditional methods typically tackle demanding AI workloads through piecemeal, component-level enhancements, which can lead to inefficiencies and bottlenecks. In contrast, AI supercomputers adopt system-level co-design to improve the efficiency and productivity of AI training, tuning, and serving." In other words, rather than optimizing each part in isolation, the supercomputer is a system in which every variable — hardware or software — that could degrade performance is controlled and optimized together.