Nvidia's H200 AI chip debuts: performance nearly doubled, launching in the second quarter of next year
Chu Yifan
Published 2023-11-14 19:14:04
Nineteen months after the debut of its star H100 AI chip, Nvidia has upgraded it and launched the H200. Because AI chips carry high prices and large profits and will play a crucial role in the development of generative AI, the H200's release has drawn attention from both the technology and financial markets.
On November 13 local time, at the SC23 international supercomputing conference, Nvidia unveiled the H200, its new-generation GPU for data centers. The chip adopts the latest HBM3e memory technology and posts unprecedented performance on several popular generative AI models.
In the first three quarters of 2023, the H200's predecessor, the H100, was the hottest hardware in the technology market, generating tens of billions of dollars in revenue, possibly at a margin of 1,000%, for Nvidia. The H200 will go on sale worldwide in the second quarter of 2024 and may become the next best-seller, remaining a key driver of Nvidia's growth.
Since November, against the backdrop of the Federal Reserve pausing its interest rate hikes, Nvidia's stock price has continued to climb. At the November 13 closing price of $486.20 per share, Nvidia's market value stands at $1.2 trillion.
A major performance upgrade for the H200
The H200 is a new GPU and one of the core pieces of hardware for AI work today. Its upgrades reflect two trends: the current AI chip market competes mainly on memory technology, and inference capability is becoming as important as training capability.
The H200 uses HBM3e, a high-bandwidth memory technology introduced in May 2023, when South Korean memory giant SK Hynix announced it first and said it planned mass production in the first half of 2024. On July 26 and October 20, the other two major memory makers, Micron and Samsung, also announced HBM3e products. Even so, the technology remains in the hands of these three giants and is subject to production constraints on the manufacturing side.
Memory technology is becoming a key factor in AI chip upgrades. At SC23, Nvidia vice president Ian Buck put it bluntly: "If you want to handle the large volumes of data and the high-performance computing that generative AI applications demand, high-speed, high-capacity GPU memory is a must."
Equipped with HBM3e, the H200's memory capacity rises to 141GB and its bandwidth to 4.8TB per second. By comparison, the previous-generation H100 topped out at 80GB of memory and 3.35TB per second of bandwidth.
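Those figures work out to roughly a 76% gain in capacity and a 43% gain in bandwidth. A minimal back-of-the-envelope sketch, using only the published numbers above:

```python
# Back-of-the-envelope comparison of the published H100 vs. H200 memory specs.
h100 = {"capacity_gb": 80, "bandwidth_tb_s": 3.35}
h200 = {"capacity_gb": 141, "bandwidth_tb_s": 4.8}

for spec in h100:
    uplift = (h200[spec] - h100[spec]) / h100[spec] * 100
    print(f"{spec}: {h100[spec]} -> {h200[spec]} (+{uplift:.0f}%)")

# capacity_gb: 80 -> 141 (+76%)
# bandwidth_tb_s: 3.35 -> 4.8 (+43%)
```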
In addition, Shenwan Hongyuan Securities wrote in an industry note on November 14 that the H200 also carries software upgrades over the H100, significantly strengthening inference and HPC performance while markedly cutting energy consumption and total cost.
Chips packed with such cutting-edge technology are especially sought after in the market for developing large AI models. In Nvidia's demonstrations on well-known models, the H200's inference speed was 90% faster than the H100's on the 70-billion-parameter Llama 2, and 60% faster on the 175-billion-parameter GPT-3.
Technically, the H200 can fully replace the H100: it can stand in for the H100 both in servers and in the GH200 superchip.
Nvidia also announced that Amazon's AWS, Google Cloud, Microsoft Azure, and Oracle Cloud will be among the first cloud service providers to deploy the H200, starting in 2024.
Creating the next 'cash cow'
Generative AI is driving AI model development worldwide, and high-performance AI chips have become a hot commodity in the semiconductor market. The H100 was one of the most closely watched AI chips of 2023; with high prices and demand far outstripping supply, it became a cash cow for Nvidia. Will the H200 inherit that mantle?
After US markets close on November 21 local time, Nvidia will report results for the third quarter of fiscal 2024, covering August through October. Nvidia's own guidance calls for revenue of roughly $16 billion, a year-on-year increase of as much as 170%, and an adjusted gross margin of 72.5% on a non-GAAP basis, reflecting further price increases during the period.
The H100's performance is crucial to this report. Unveiled on March 22, 2022, the H100 is Nvidia's ninth-generation data center GPU. Since entering mass production and shipping in September 2022, it has become one of the most popular AI training chips on the market.
Over the past four quarters, the H100 has been viewed as the pillar of Nvidia's revenue and profit. On its strength, Nvidia has ridden the wave of generative AI, withstood the semiconductor industry's down cycle, and grown against the trend.
US financial firm Raymond James earlier estimated in a report that the H100 costs only $3,320 to make, while Nvidia's bulk price to customers runs as high as $25,000 to $30,000, implying a margin of up to 1,000% and making the H100 the most profitable chip in Nvidia's history.
According to information that insiders at Nvidia and TSMC leaked to the media, Nvidia will deliver approximately 550,000 H100 chips worldwide in 2023. At the median selling price of $27,500, that would give Nvidia about $15.125 billion in H100 revenue for 2023.
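That revenue figure follows directly from the leaked unit count and the quoted price range; a minimal sketch of the arithmetic, using only the figures cited above:

```python
# Implied 2023 H100 revenue from the leaked unit count and quoted price range.
units_2023 = 550_000                    # estimated H100 deliveries worldwide in 2023
price_low, price_high = 25_000, 30_000  # reported bulk price range (USD)

median_price = (price_low + price_high) / 2   # $27,500
revenue = units_2023 * median_price
print(f"implied H100 revenue: ${revenue / 1e9:.3f} billion")  # $15.125 billion
```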
In Nvidia's financial reports, data center and gaming have long been the two pillar businesses. Over the past year, the data center segment, which includes AI chips, has overtaken gaming in its contribution to results.
In the first and second quarters of Nvidia's fiscal 2024, data center revenue was $4.284 billion and $10.32 billion, respectively. Data center products include GPUs such as the H100 and A100, DGX system-level solutions built on those GPUs, and other data center chips such as the Grace CPU and BlueField DPU.
Who has a chokehold on Nvidia?
Although the H200's release shows Nvidia's determination to keep dominating the market, several factors still hang over its prospects. Chief among them, capacity constraints in advanced packaging and in HBM3e supply may keep the H200 in persistent shortage.
After the H200's release, several securities firms, including China Merchants Securities' electronics team, urged investors to watch the global HBM and advanced-packaging supply chains: both markets face capacity shortages that will become the bottleneck for future H200 shipments.
On August 21, SK Hynix announced that it had supplied samples of its HBM3e DRAM to Nvidia; in October, Micron revealed that Nvidia was qualifying its HBM3e product. Both moves show the memory giants vying for a place in the most advanced AI chips.
For now, however, only SK Hynix, Micron, and Samsung compete in the HBM market, which keeps prices of these memory products high even as supply stays limited. According to a TrendForce report released in August, SK Hynix's share of the global HBM market in 2023 is expected to be 46% to 49%, with Samsung roughly on par and Micron taking the remaining 4% to 6%. Beyond Nvidia, giants such as AMD, Amazon, and Microsoft are also bidding for HBM3e, which remains scarce in the market.
On the advanced-packaging side, TSMC's CoWoS technology is crucial; it is a 2.5D packaging technology and the one used for the H100. In late July, TSMC president C.C. Wei said that while front-end (wafer fabrication) capacity is sufficient, back-end (advanced packaging) capacity is "relatively tight," with new capacity expected to come online by the end of 2024. In November, market reports said that to meet demand from customers including Nvidia, AMD, Broadcom, and Marvell, TSMC will raise its CoWoS packaging capacity to 35,000 units in 2024, up 120% from 2023.
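Read together, the 2024 target and the growth rate imply a 2023 baseline of roughly 16,000 units; a minimal sketch of that inference, back-solved only from the figures above:

```python
# Implied 2023 CoWoS capacity, back-solved from the reported 2024 target.
capacity_2024 = 35_000   # reported 2024 CoWoS capacity target (units)
growth = 1.20            # reported increase of 120% versus 2023

capacity_2023 = capacity_2024 / (1 + growth)
print(f"implied 2023 capacity: ~{capacity_2023:,.0f} units")  # ~15,909 units
```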
However, lead times for CoWoS equipment still run as long as eight months, which means TSMC's packaging capacity will remain tight through the first half of 2024.
The H200's release will also lift many other links in the chip supply chain. On November 14, China Merchants Securities' electronics team wrote in a market note that investors should watch Nvidia supply-chain players in server ODMs, memory, PCBs and IC substrates, connectors and cables, cooling, power supplies, analog chips, interface chips, RAID cards, power devices, and related areas. It added that although domestic computing power still lags far behind the international level in the short term, domestic GPU and CPU makers and companies along the self-reliant computing supply chain deserve continued attention.