A well-timed strike ahead of Nvidia's earnings report? This unicorn is pushing hard into AI inference, claiming the world's fastest speed without using HBM
海角七号
Posted on 2024-8-28 15:16:08
After the market close on Wednesday local time, Nvidia will release its Q2 report, the last heavyweight earnings release of the season, and global investors are on edge. The day before (August 27 local time), the US AI processor-chip unicorn Cerebras Systems unveiled what it calls the world's fastest AI inference service, built on its own chip-based computing system and claimed to be 10 to 20 times faster than systems built on Nvidia's H100 GPU.
Nvidia GPUs currently dominate both AI training and inference. Since launching its first AI chip in 2019, Cerebras has focused on selling AI chips and computing systems, aiming to challenge Nvidia in the field of AI training.
According to a report by the American technology outlet The Information, OpenAI's revenue is expected to reach $3.4 billion this year, driven largely by its AI inference services. With the inference market that large, Cerebras co-founder and CEO Andrew Feldman said the company needs to claim a place in it as well.
By launching an AI inference service, Cerebras is not only opening up its AI chips and computing systems, but also mounting a broad attack on Nvidia with a second revenue curve based on usage. The goal, Feldman said, is to "steal enough market share from Nvidia to make them angry."
Fast and cheap
Cerebras' AI inference service offers significant advantages in both speed and cost. Measured in output tokens per second, Feldman said, Cerebras' AI inference is 20 times faster than the inference services run by cloud providers such as Microsoft Azure and Amazon AWS.
At the press conference, Feldman ran Cerebras' and Amazon AWS's inference services side by side. Cerebras completed the inference and output almost instantly, at a processing speed of 1,832 tokens per second, while AWS took several seconds to finish at only 93 tokens per second.
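The roughly 20x claim follows directly from the two throughput figures demonstrated above; a quick sanity check:

```python
# Throughput figures from the live demo (tokens per second).
cerebras_tps = 1832
aws_tps = 93

speedup = cerebras_tps / aws_tps
print(f"Speedup: {speedup:.1f}x")  # about 19.7x, i.e. roughly 20x
```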
Faster inference, Feldman said, makes real-time interactive voice responses possible, and allows chaining multiple rounds of results, more external sources, and longer documents to produce more accurate and relevant answers, a qualitative leap for AI inference.
Beyond speed, Cerebras also claims a large cost advantage. Feldman stated that Cerebras' inference service delivers 100 times better price-performance than AWS and others. Running Meta's open-source Llama 3.1 70B large language model, for example, Cerebras charges only 60 cents per million tokens, versus $2.90 per million tokens for the same service from typical cloud providers.
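Note that the listed prices alone differ by only about 4.8x; the 100x figure presumably folds in the roughly 20x throughput advantage as well (price-performance rather than price). A small sketch of the raw cost comparison, using the per-million-token prices quoted above:

```python
# Per-million-token prices quoted in the article (USD).
cerebras_price = 0.60   # Cerebras, Llama 3.1 70B
cloud_price = 2.90      # typical cloud provider, same model

tokens = 1_000_000_000  # cost of one billion output tokens
cerebras_cost = tokens / 1_000_000 * cerebras_price
cloud_cost = tokens / 1_000_000 * cloud_price

print(f"Cerebras: ${cerebras_cost:,.0f}, typical cloud: ${cloud_cost:,.0f}")
print(f"Price ratio: {cloud_price / cerebras_price:.1f}x")
```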
56 times the area of the largest current GPU
Cerebras' inference service is fast and cheap because of the design of its WSE-3 chip, the third-generation processor Cerebras launched in March this year. It is enormous: it takes up nearly an entire 12-inch wafer, is larger than a book, and measures about 462.25 square centimeters, 56 times the area of the largest current GPU.
Unlike Nvidia's designs, the WSE-3 does not use separate high-bandwidth memory (HBM) that must be accessed over an interface. Instead, it embeds memory directly on the chip.
Thanks to its sheer size, the WSE-3 carries up to 44 GB of on-chip memory, nearly 900 times that of the Nvidia H100, with 7,000 times the H100's memory bandwidth.
Memory bandwidth, Feldman stated, is the fundamental factor limiting language-model inference performance. By integrating logic and memory on one giant chip, with massive on-chip memory and extremely high memory bandwidth, Cerebras can process data and generate inference results at a speed GPUs cannot reach.
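Why bandwidth bounds decode speed can be sketched with a rough roofline-style estimate: generating each token requires streaming every model weight through the compute units once, so peak single-user tokens per second is roughly memory bandwidth divided by model size in bytes. The bandwidth and precision figures below are illustrative assumptions, not official vendor specifications:

```python
# Back-of-envelope decode ceiling: each generated token reads all weights once,
# so tokens/s is bounded by (memory bandwidth) / (model size in bytes).
# Illustrative assumptions, not vendor specs.
model_params = 70e9          # Llama 3.1 70B parameter count
bytes_per_param = 2          # assume 16-bit weights
model_bytes = model_params * bytes_per_param

hbm_bw = 3.35e12             # ~3.35 TB/s, an H100-class HBM figure (assumed)
sram_bw = 21e15              # ~21 PB/s, wafer-scale on-chip SRAM (assumed)

for name, bw in [("HBM GPU", hbm_bw), ("wafer-scale SRAM", sram_bw)]:
    # Upper bound per user; real systems batch requests and fall below this.
    print(f"{name}: ~{bw / model_bytes:.0f} tokens/s per user (upper bound)")
```

Under these assumptions the HBM-bound ceiling lands in the tens of tokens per second per user, while on-chip SRAM raises it by orders of magnitude, which is consistent with the demo numbers above.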
Beyond its speed and cost advantages, the WSE-3 chip is equally capable at both AI training and inference, performing well across a wide range of AI tasks.
According to the plan, Cerebras will build AI inference data centers in multiple locations and charge for inference capacity by request volume. Meanwhile, it will also try to sell the WSE-3-based CS-3 computing system to cloud service providers.
CandyLake.com is an information-publishing platform and provides only information-storage services.
Disclaimer: the views in this article are the author's own; they do not represent the position of CandyLake.com and do not constitute advice. Please treat them with caution.
You may also like
- Nvidia debuts its first two new AI hardware models ahead of its third-quarter report
- Over 10,000 Nvidia Blackwell chips delivered; Jensen Huang responds to tariff issues
- Financial analysis: Nvidia's Q4 guidance falls short of the highest expectations; the stock fell more than 5% after the close
- Global Finance: Markets watch Nvidia's results; the three major New York stock indexes fluctuated on the 20th
- Nvidia's Q4 guidance fell short of the highest expectations, and its stock fell more than 5% after hours
- Nvidia's third-quarter revenue reached $35.082 billion
- Nvidia's growth slows and Jensen Huang steps in to reassure the market! Analyst: investors underestimate demand for Blackwell chips
- Nvidia's Q4 guidance falls short of the highest expectations, with the stock dropping over 5% after the close
- Stock price soars 33%! Snowflake outshines Nvidia; analysts: AI software outperforming semiconductors may be the trend
- The three major US stock indexes closed higher, with the Dow Jones Industrial Average up more than 1%; Nvidia's stock hit a new intraday high