Nvidia's B200 Chip: Moore's Law Falters, Multi-Chip Interconnect Takes the Crown
Ty奇葩罗牛山831
Posted on 2024-3-19 21:39:12
In the early morning of March 19th Beijing time, at the NVIDIA GTC (GPU Technology Conference), NVIDIA CEO Jensen Huang announced the successor to the Hopper-architecture chips: the Blackwell-architecture B200. Demand for Nvidia's Hopper-architecture H100 chips and GH200 Grace Hopper superchips remains high, and they supply the computing power for many of the world's most powerful supercomputing centers; the B200 promises a further generational leap in compute.
The Blackwell-architecture B200 is not a traditional single-die GPU. Instead, it consists of two tightly coupled dies which, according to Nvidia, act as one unified CUDA GPU. The two dies are linked by a 10 TB/s NV-HBI (Nvidia High Bandwidth Interface) connection, which lets them behave as a single chip.
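From the programmer's side this dual-die design is meant to be invisible. The sketch below (an assumption-laden illustration using PyTorch, not something specific to B200; the values printed depend on whatever GPU is actually installed) shows the kind of device query where the two dies would still be expected to show up as one CUDA device:

```python
import torch

# On a Blackwell board, the two dies should still enumerate as a single
# CUDA device. This query runs on any CUDA machine; the reported name,
# SM count, and memory depend on the installed GPU.
print("visible CUDA devices:", torch.cuda.device_count())

props = torch.cuda.get_device_properties(0)
print("name:", props.name)
print("SM count:", props.multi_processor_count)
print("memory (GB):", round(props.total_memory / 1e9, 1))
```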
Multi-chip interconnect is the key to the B200's computing power. The GB200, which pairs two B200 GPUs with a single Grace CPU, is claimed to deliver 30 times the performance on large-language-model inference while significantly improving efficiency. Nvidia says that, compared with the H100, the B200 can cut the cost and energy consumption of generative AI by as much as 25 times.
Much of the gain in NVIDIA's AI chip throughput comes from lowering numerical precision. Stepping down from FP64, FP32, FP16, and FP8 to the B200's new FP4 format, the B200's peak theoretical FP4 throughput reaches 20 petaflops. FP4 delivers twice the throughput of FP8: representing each value with 4 bits instead of 8 halves the data moved per value, effectively doubling compute, bandwidth, and the model size that fits in memory. If the B200 is compared with the H100 at the same FP8 precision, it theoretically offers only about 2.5 times the H100's computing power, so a large part of the B200's gain comes from interconnecting the two dies.
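A rough back-of-the-envelope check of these ratios is shown below. It is only a sketch: the 20 petaflop FP4 figure comes from the article, while the roughly 4 petaflop FP8 figure for the H100 is an assumed peak number, and published peaks mix sparse and dense figures, so treat the results as approximate.

```python
# Rough arithmetic behind the quoted speedups (peak figures; sparsity and
# clock assumptions vary by source, so these are ballpark numbers only).
b200_fp4_pflops = 20.0                     # B200 peak FP4 throughput (PFLOPS)
b200_fp8_pflops = b200_fp4_pflops / 2      # halving precision roughly doubles throughput
h100_fp8_pflops = 4.0                      # assumed H100 peak FP8 throughput (PFLOPS)

print(f"B200 FP8 vs H100 FP8: {b200_fp8_pflops / h100_fp8_pflops:.1f}x")  # ~2.5x
print(f"B200 FP4 vs H100 FP8: {b200_fp4_pflops / h100_fp8_pflops:.1f}x")  # ~5x
```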
The Moore's Law of the CPU era (the number of transistors on an integrated circuit doubles roughly every 18 months) has entered its twilight. TSMC's breakthrough 3nm process has not delivered a generational jump in chip performance: the Apple A17 Pro, launched in September 2023 as the first chip on TSMC's 3nm process, improved CPU performance by only about 10%. Developing advanced-node chips is also costly. According to the Far East Research Institute, TSMC's wafer foundry prices in 2023 were roughly 16% (advanced processes) to 34% (mature processes) higher than two years earlier.
Besides Apple, TSMC's other major chip customer is NVIDIA: its most sought-after AI chip, the H100, is built on TSMC's N4 (5nm-class) process and relies on TSMC's advanced CoWoS packaging capacity.
With Moore's Law losing force, Jensen Huang has offered "Huang's Law" instead: GPU efficiency will more than double every two years, and the innovation comes not just from the chip but from the entire stack.
Nvidia is therefore pushing further toward multi-chip interconnect. Since 3nm offers only limited gains, the B200 places two 4nm dies side by side and joins them through a high-speed die-to-die interconnect into one oversized chip with more than 200 billion transistors. At NVIDIA GTC, Huang spent little time on the chip itself and focused instead on the DGX systems built around it.
In multi-chip interconnect, Nvidia's moat is its NVLink and NVSwitch technology. NVLink is a point-to-point high-speed interconnect that links multiple GPUs directly into a high-performance computing cluster or deep learning system. NVLink also introduces unified memory, allowing connected GPUs to pool their memory, a crucial feature for tasks that work on large datasets.
NVSwitch is a high-speed switch chip that connects many NVLink-attached GPUs so they can all communicate with one another, forming a single high-performance computing system.
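To illustrate what peer-to-peer GPU access looks like from software, here is a minimal PyTorch sketch. It assumes a machine with at least two CUDA GPUs; the direct device-to-device copy it performs rides on NVLink when the GPUs are linked that way, and falls back to PCIe otherwise.

```python
import torch

# Minimal peer-to-peer sketch: requires at least two visible CUDA GPUs.
assert torch.cuda.device_count() >= 2, "needs at least two GPUs"

# Check whether GPU 0 can directly address GPU 1's memory.
print("peer access 0 -> 1:", torch.cuda.can_device_access_peer(0, 1))

x = torch.randn(1024, 1024, device="cuda:0")   # tensor resident on GPU 0
y = x.to("cuda:1", non_blocking=True)          # direct copy to GPU 1
torch.cuda.synchronize()
print(y.device)                                # cuda:1
```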
With the support of NVLink Switch, Nvidia links 72 B200s together into the "new generation computing unit", the GB200 NVL72. A single such "computing unit" cabinet delivers up to 720 petaflops of FP8 training compute, approaching an entire H100-era DGX SuperPod supercomputing cluster (about 1,000 petaflops).
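The rack-level figure follows from simple multiplication, as the sketch below shows. The roughly 10 petaflops of FP8 per B200 is an assumption implied by the 20 petaflop FP4 figure quoted earlier, not a number stated in this article.

```python
# How the GB200 NVL72 rack figure falls out of per-GPU numbers.
gpus_per_rack = 72
fp8_pflops_per_b200 = 10.0            # assumed: half of the 20 PFLOPS FP4 peak

rack_fp8_pflops = gpus_per_rack * fp8_pflops_per_b200
print(f"GB200 NVL72 FP8: {rack_fp8_pflops:.0f} PFLOPS")                     # 720 PFLOPS
print(f"vs H100-era DGX SuperPod: {rack_fp8_pflops / 1000:.0%} of 1000 PFLOPS")  # 72%
```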
Nvidia says the new chips will ship later in 2024. Amazon, Dell, Google, Meta, Microsoft, OpenAI, and Tesla all plan to use Blackwell GPUs.
Selling GPUs bundled into full systems also matches how large-model companies buy compute: packaging many GPUs together into a data-center-scale product fits the purchasing patterns of large-model developers and cloud service providers. According to Nvidia's fiscal 2023 report, about 40% of its data center revenue comes from hyperscale data centers and cloud service providers.
As of the US market close on March 18th Eastern Time, Nvidia's stock price was $884.55, for a total market value of about $2.21 trillion.