Meta confirms that the open-source large model LLaMA 3 will debut next month, and by the end of the year it will have built a compute reserve equivalent to 600,000 H100 GPUs

Nearly a year after launching the open-source large model LLaMA 2, Meta is about to release its next-generation large model, LLaMA 3.
At an event held in London on April 9th, Meta confirmed for the first time that it plans to release LLaMA 3 next month. The model is said to come in multiple versions with different capabilities.
Meta did not, however, disclose the parameter count of LLaMA 3. "Over time, our goal is to make Meta AI driven by LLaMA the most useful assistant in the world," said Joelle Pineau, Meta's Vice President of AI Research. "There is still a considerable amount of work to be done to achieve this goal."
According to a report published by the technology news outlet The Information on April 8th, the largest version of LLaMA 3, which is positioned against GPT-4, may have more than 140 billion parameters, compared with 70 billion for the largest version of LLaMA 2. LLaMA 3 will support multimodal processing, meaning it can understand and generate both text and images.
It is worth noting that LLaMA 3 will continue Meta's long-standing open-source approach. Competition among open-source models is intensifying, and open-source large models are becoming steadily more capable. So far, many companies, including Google, Musk's xAI, Mistral AI, and Stability AI, have released open-source large models.
Meta is a leader in open-source models, and its investment in AI infrastructure should not be underestimated: currently only Microsoft has a comparable reserve of computing power. According to a technical blog post by Meta, by the end of 2024 the company will purchase an additional 350,000 Nvidia H100 GPUs, which, together with its other GPUs, will give it computing power equivalent to nearly 600,000 H100s.
Just next month! LLaMA 3 is about to debut
The number of parameters may reach 140 billion
At an event held in London on April 9th, Meta confirmed for the first time that it plans to release LLaMA 3 next month. "We hope to launch the new next-generation foundational model suite LLaMA 3 next month, even in a very short period of time," said Nick Clegg, President of Global Affairs at Meta.
According to Clegg, LLaMA 3 will come in multiple versions with different capabilities. "Within this year, we will release a series of models with different functionalities and versatility, and we will start releasing them soon."
Meanwhile, Meta's Chief Product Officer, Chris Cox, added that Meta plans to use LLaMA 3 to power multiple Meta products.
It is worth noting that LLaMA 3 will continue Meta's long-standing open source approach.
Unlike OpenAI, which has stuck to a closed-source approach and very large models, Meta chose an open-source strategy and smaller LLMs from the beginning.
In February 2023, Meta publicly released the LLaMA large model on its official website. Like the GPT series, LLaMA is an autoregressive language model built on the Transformer architecture.
LLaMA comes in four sizes, with 7 billion, 13 billion, 33 billion, and 65 billion parameters, and aims to make LLM research smaller in scale and more widely accessible. By contrast, the largest GPT-3 has 175 billion parameters. Meta noted in its paper at the time that LLaMA with 13 billion parameters outperformed GPT-3 on most benchmarks despite being more than 10 times smaller.
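The size gap quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the parameter counts come from the article, while the memory estimate assumes weights stored as 16-bit floats (2 bytes per parameter), an assumption that is not in the article and that ignores activations and any other runtime overhead. The function names are hypothetical.

```python
# Parameter counts in billions, as quoted in the article.
LLAMA_SIZES_B = [7, 13, 33, 65]
GPT3_SIZE_B = 175

# Assumption (not from the article): weights are stored as 16-bit floats.
BYTES_PER_PARAM_FP16 = 2


def size_ratio(large_b: float, small_b: float) -> float:
    """How many times larger one model is than another, by parameter count."""
    return large_b / small_b


def fp16_weight_gb(params_b: float) -> float:
    """Approximate storage for the weights alone, in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_b * BYTES_PER_PARAM_FP16


if __name__ == "__main__":
    ratio = size_ratio(GPT3_SIZE_B, 13)
    print(f"GPT-3 (175B) is about {ratio:.1f}x the size of LLaMA 13B")
    for size in LLAMA_SIZES_B + [GPT3_SIZE_B]:
        print(f"{size:>4}B parameters -> roughly {fp16_weight_gb(size):.0f} GB of fp16 weights")
```

Under these assumptions the ratio comes out above 13, which is consistent with the "more than 10 times smaller" description.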
Generally speaking, smaller models cost less, run faster, and are easier to fine-tune. As Meta CEO Mark Zuckerberg said on an earlier earnings call, open-source models are often safer, more efficient, and more cost-effective to run, as they are under constant community scrutiny and development.
On open source, Zuckerberg also told The Verge in an interview, "I tend to believe that one of the biggest challenges is that if what you create is really valuable, it will eventually become very focused and narrow. If you make it more open, then you can solve a lot of the problems that inequality of opportunity and value may bring. Therefore, this is an important component of the entire open source vision."
In addition, small models make it easier for developers to build AI software for mobile devices, which is why the LLaMA series has drawn widespread attention from developers since it was open-sourced. Many models on GitHub are now built on the LLaMA series.
In July last year, Meta followed up with LLaMA 2. At that time, Meta again started with small models: before releasing the largest version of LLaMA 2, with 70 billion parameters, it first released smaller versions with 13 billion and 7 billion parameters.
However, according to relevant tests, LLaMA 2 refused to answer some only mildly controversial questions, such as how to prank a friend or how to "kill" a car engine. In recent months, Meta has been working to make LLaMA 3 more open and more accurate when answering controversial questions.
Although Meta did not disclose the parameter count of LLaMA 3, The Information reports that the largest version of LLaMA 3, which is positioned against GPT-4, may have more than 140 billion parameters, twice as many as the largest version of LLaMA 2.
Across the open-source model field, competition is intensifying, and open-source large models are becoming steadily more capable.
In February of this year, Google broke with last year's insistence on a closed-source strategy for large models and launched the open-source large model Gemma; in March, Musk's company xAI also open-sourced its Grok-1 model. According to the published benchmark results for Gemma and Grok-1, they outperform LLaMA 2 models of the same scale on multiple benchmarks covering mathematics, reasoning, and code.
As of now, multiple technology companies, including Google, xAI, Mistral AI, Databricks, and Stability AI, have released open-source large models. An industry insider previously said in an interview with the Daily Economic News, "Open source is the trend, and I believe Meta is leading this trend, followed by smaller companies such as Mistral AI and Hugging Face."
The AGI frenzy: spending $10 billion to stockpile chips
By the end of the year, computing power will be equivalent to approximately 600,000 H100s
Meta is a leader in open-source models, and its investment in AI infrastructure should not be underestimated.
In fact, Meta published a technical blog post last month describing its computing resources, along with the details and roadmap of its AI infrastructure build-out. The company stated that its long-term vision is to build open and responsible artificial general intelligence (AGI) that can be widely used and benefit everyone.
Meta wrote in the blog post, "By the end of 2024, our goal is to continue expanding (AI) infrastructure construction, including 350,000 Nvidia H100 GPUs as part of a portfolio whose total computing power will be equivalent to nearly 600,000 H100s." It is reported that currently only Microsoft has a comparable reserve of computing power. At the price listed on Amazon, about $30,000 for a single H100, 350,000 H100s would cost about $10.5 billion (approximately 76 billion RMB).
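The cost estimate above is simple multiplication, and it can be reproduced as a quick sanity check. The sketch below uses only the figures quoted in the article. The exchange rate is not stated in the article, so the code derives the rate implied by the quoted RMB total rather than assuming one. The function names are hypothetical.

```python
# Figures quoted in the article.
H100_UNIT_PRICE_USD = 30_000          # approximate Amazon listing price per H100
H100_COUNT = 350_000                  # H100 GPUs Meta plans to have by end of 2024
QUOTED_RMB_TOTAL = 76_000_000_000     # the article's approximate RMB figure


def total_cost_usd(count: int, unit_price_usd: int) -> int:
    """Total hardware cost in US dollars at a flat per-unit price."""
    return count * unit_price_usd


def implied_rmb_per_usd(rmb_total: float, usd_total: float) -> float:
    """Exchange rate implied by a pair of quoted totals (not an official rate)."""
    return rmb_total / usd_total


if __name__ == "__main__":
    usd = total_cost_usd(H100_COUNT, H100_UNIT_PRICE_USD)
    rate = implied_rmb_per_usd(QUOTED_RMB_TOTAL, usd)
    print(f"Total: ${usd / 1e9:.1f} billion")
    print(f"Implied exchange rate: {rate:.2f} RMB per USD")
```

This is a list-price estimate only; it does not account for volume discounts, networking, power, or data center costs, none of which the article quantifies.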
In the same post, Meta also disclosed some details of the cluster used to train LLaMA 3, which consists of 24,576 Nvidia H100 GPUs.
According to a report released last year by market research firm Omdia, Meta and Microsoft were the largest buyers of Nvidia H100 GPUs. Omdia estimated that each of the two companies purchased up to 150,000 H100 GPUs in 2023, more than three times the number bought by technology companies such as Google, Amazon, and Oracle.
In the same post, Meta also reiterated its long-standing commitment to open source: "Meta has always been committed to open innovation in artificial intelligence software and hardware. We believe that open source hardware and software will always be valuable tools to help the industry solve problems on a large scale."
It is worth mentioning that, on the back of his investment in AI, Zuckerberg ranks fourth on Forbes' latest 2024 (38th) Global Billionaires List with a net worth of $177 billion, the highest ranking he has ever recorded. In dollar terms, Zuckerberg's net worth grew the most of anyone over the past year, rising by $112.6 billion, a growth rate of 174.8%.
