Meta confirms its open-source large model LLaMA 3 will debut next month, and says it will build a compute reserve equivalent to 600,000 H100 GPUs by the end of the year | Big Model World
楚一帆
Posted on 2024-4-10 12:42:21
Nearly a year after launching the open-source large model LLaMA 2, Meta is about to release its next-generation large model, LLaMA 3.
At an event held in London on April 9, Meta confirmed for the first time that it plans to release LLaMA 3 next month. The model is said to come in multiple versions with different capabilities.
However, Meta did not disclose the parameter count of LLaMA 3. "Over time, our goal is to make Meta AI, powered by LLaMA, the most useful assistant in the world," said Joelle Pineau, Vice President of AI Research at Meta. "There is still a considerable amount of work to be done to achieve this goal."
According to an April 8 report by technology news outlet The Information, the largest version of LLaMA 3, positioned as a rival to GPT-4, may have more than 140 billion parameters, compared with 70 billion for the largest LLaMA 2 version. According to the report, LLaMA 3 will support multimodal processing, meaning it can understand and generate both text and images.
Notably, LLaMA 3 will continue Meta's long-standing open-source approach. Competition among open-source models is becoming increasingly fierce, and the models themselves are growing more powerful. To date, many companies, including Google, Musk's xAI, Mistral AI, and Stability AI, have released open-source large models.
As a leader in open-source models, Meta has also invested heavily in AI infrastructure, and currently only Microsoft has a comparable reserve of computing power. According to a Meta technical blog post, by the end of 2024 the company will have purchased an additional 350,000 Nvidia H100 GPUs; together with its other GPUs, its computing power will be equivalent to nearly 600,000 H100s.
Coming next month: LLaMA 3 is about to debut
Parameter count may reach 140 billion
At an event held in London on April 9, Meta confirmed for the first time that it plans to release LLaMA 3 next month. "We hope to launch our new next-generation foundation model suite, LLaMA 3, within the next month, hopefully even sooner," said Nick Clegg, President of Global Affairs at Meta.
According to Clegg's remarks, LLaMA 3 will come in multiple versions with different capabilities. "Within this year, we will release a series of models with different capabilities and versatility, and we will start releasing them soon."
Meanwhile, Meta's Chief Product Officer, Chris Cox, added that Meta plans to use LLaMA 3 to power multiple Meta products.
It is worth noting that LLaMA 3 will continue Meta's long-standing open source approach.
Unlike OpenAI, which has stuck to a closed-source approach and very large LLMs, Meta chose an open-source strategy and smaller LLMs from the beginning.
In February 2023, Meta publicly released the LLaMA large model on its official website. Like the GPT series, LLaMA is an autoregressive language model built on the Transformer architecture.
LLaMA comes in four sizes, with 7 billion, 13 billion, 33 billion, and 65 billion parameters, and aims to promote smaller models and the democratization of LLM research. In contrast, GPT-3 reached 175 billion parameters at its largest. Meta reported in its paper at the time that LLaMA with 13 billion parameters outperformed GPT-3 despite being more than 10 times smaller.
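The "more than 10 times smaller" comparison can be checked with a quick calculation. The following is a minimal sketch using only the parameter counts quoted above; the variable names are illustrative and not taken from any Meta code.

```python
# Parameter counts in billions, as quoted in the article.
LLAMA_SIZES_B = [7, 13, 33, 65]
GPT3_SIZE_B = 175

# How many times larger GPT-3 is than each LLaMA variant.
ratios = {size: GPT3_SIZE_B / size for size in LLAMA_SIZES_B}

for size, ratio in ratios.items():
    print(f"LLaMA {size}B: GPT-3 is {ratio:.1f}x larger")
```

For the 13-billion-parameter model the ratio is about 13.5, which is consistent with the "more than 10 times" figure.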
Generally speaking, smaller models cost less, run faster, and are easier to fine-tune. As Meta CEO Mark Zuckerberg said on a previous earnings call, open-source models are often safer, more efficient, and more cost-effective to run, because they are constantly scrutinized and developed by the community.
On open source, Zuckerberg also said in an interview with The Verge, "I tend to believe that one of the biggest challenges is that if what you create is really valuable, it will eventually become very focused and narrow. If you make it more open, then you can solve a lot of the problems that inequality of opportunity and value may bring. Therefore, this is an important component of the entire open source vision."
In addition, small models make it easier for developers to build AI software for mobile devices, which is why the LLaMA series has received widespread attention from developers since it was open-sourced. Many models on GitHub are currently built on the LLaMA series.
In July last year, Meta followed up with LLaMA 2. At that time, Meta again started with small models: before releasing the largest version of LLaMA 2, with 70 billion parameters, it first released smaller versions with 13 billion and 7 billion parameters.
However, tests have shown that LLaMA 2 refuses to answer even some relatively uncontroversial questions, such as how to prank a friend or how to "kill" a car engine. In recent months, Meta has been working to make LLaMA 3 more open and more accurate when answering controversial questions.
Although Meta did not disclose the parameter count of LLaMA 3, The Information reports that the largest version of LLaMA 3, positioned as a rival to GPT-4, may have more than 140 billion parameters, twice as many as the largest version of LLaMA 2.
Across the open-source model industry, competition is becoming increasingly fierce, and open-source large models are growing stronger.
In February of this year, Google made a rare departure from last year's insistence on a closed-source strategy for large models and launched the open-source large model Gemma; in March, Musk's xAI open-sourced its Grok-1 model. According to the published benchmark results for Gemma and Grok-1, they outperform LLaMA 2 models of the same scale on multiple benchmarks covering mathematics, reasoning, and code.
To date, multiple technology companies, including Google, xAI, Mistral AI, Databricks, and Stability AI, have released open-source large models. An industry insider previously told the Daily Economic News, "Open source is the trend, and I believe Meta is leading this trend, followed by smaller companies such as Mistral AI and Hugging Face."
Going all in on AGI: spending $10 billion to stockpile chips
By the end of the year, its computing power will be equivalent to approximately 600,000 H100s
As a leader in open-source models, Meta has also invested heavily in AI infrastructure.
In fact, Meta published a technical blog post last month showcasing its computing resources, along with the details and roadmap of its AI infrastructure build-out. The company stated that its long-term vision is to build open and responsible artificial general intelligence (AGI) that can be widely used and benefit everyone.
Meta wrote in its blog, "By the end of 2024, our goal is to continue expanding our (AI) infrastructure build-out, which will include 350,000 Nvidia H100 GPUs as part of a portfolio whose computing power will be equivalent to nearly 600,000 H100s." It is reported that currently only Microsoft has a comparable reserve of computing power. Based on the price listed on Amazon, a single H100 chip costs approximately $30,000, so 350,000 H100s would cost $10.5 billion (approximately 76 billion RMB).
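The cost estimate above is simple multiplication, and it can be reproduced as follows. This is a minimal sketch that uses only the article's own figures; the per-unit price is the article's rough Amazon-based estimate rather than a confirmed purchase price, and the constant and function names are illustrative.

```python
# Figures quoted in the article (the unit price is a rough estimate).
H100_COUNT = 350_000
PRICE_PER_H100_USD = 30_000
TOTAL_H100_EQUIVALENT = 600_000   # "nearly 600,000", so treat as an upper bound
ARTICLE_COST_RMB = 76e9           # the article's RMB conversion


def total_cost_usd(count: int, unit_price: int) -> int:
    """Total hardware cost in USD for `count` GPUs at `unit_price` each."""
    return count * unit_price


cost = total_cost_usd(H100_COUNT, PRICE_PER_H100_USD)
print(f"Estimated H100 spend: ${cost / 1e9:.1f} billion")

# Exchange rate implied by the article's RMB figure (RMB per USD).
implied_rate = ARTICLE_COST_RMB / cost
print(f"Implied exchange rate: {implied_rate:.2f} RMB per USD")

# H100-equivalent compute that must come from other GPUs (approximate).
other_equivalent = TOTAL_H100_EQUIVALENT - H100_COUNT
print(f"Other GPUs contribute up to about {other_equivalent:,} H100 equivalents")
```

The implied exchange rate of roughly 7.24 RMB per USD indicates that the article's dollar and RMB figures are consistent with each other.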
In the same post, Meta also revealed some details of the cluster used to train LLaMA 3, which consists of 24,576 Nvidia H100 GPUs.
According to a report released last year by market research firm Omdia, Meta and Microsoft are the largest buyers of Nvidia H100 GPUs. Omdia estimates that each of the two companies purchased up to 150,000 H100 GPUs in 2023, more than three times the number purchased by technology companies such as Google, Amazon, and Oracle.
In the same post, Meta also reiterated its commitment to open source: "Meta has always been committed to open innovation in artificial intelligence software and hardware. We believe that open source hardware and software will always be valuable tools to help the industry solve problems on a large scale."
It is worth mentioning that, on the back of his investment in AI, Zuckerberg ranks fourth on Forbes' latest (38th) Global Billionaires List for 2024, with a net worth of $177 billion, the highest ranking he has ever achieved. In dollar terms, Zuckerberg's net worth grew the most over the past year, rising by $112.6 billion, an increase of 174.8%.
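CandyLake.com is an information publishing platform and provides information storage services only.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.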