Meta's most powerful model surpasses GPT-4o; Zuckerberg reignites the open-source versus closed-source debate
楚一帆
Posted on 2024-7-24 14:28:57
After OpenAI abruptly launched its "small model" GPT-4o mini, Meta answered with a bombshell of its own: a large model with a very high parameter count.
On July 24th, Meta released the open-source large model Llama 3.1 405B, along with upgraded models in two sizes, 70B and 8B.
Llama 3.1 405B is considered the strongest open-source model currently available. According to Meta, the model supports a context length of 128K and adds support for eight languages. It is comparable to flagship models such as GPT-4o and Claude 3.5 Sonnet in general knowledge, steerability, mathematics, tool use, and multilingual translation. In Meta's human evaluation comparisons, its overall performance was even better than that of these two models.
Meanwhile, the upgraded versions of the 8B and 70B models are also multilingual and have been expanded to 128K context length.
Llama 3.1 405B is Meta's largest model to date. Meta stated that the model was trained on over 15 trillion tokens, and that to achieve the desired results within a reasonable time, the team optimized the entire training stack and used over 16,000 H100 GPUs, making it the first Llama model trained at this scale of computing power.
The team broke this difficult training objective down into several key steps. To maximize training stability, Meta did not choose a mixture-of-experts (MoE) architecture, but instead adopted a standard decoder-only Transformer architecture with minor adjustments.
According to Meta, the team also used an iterative post-training process in which each round applied supervised fine-tuning and direct preference optimization, creating the highest-quality synthetic data in every round to improve performance on each capability. Compared with previous versions of Llama, the team improved both the quantity and the quality of the data used for pre-training and post-training.
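For readers unfamiliar with direct preference optimization, the idea can be illustrated with a minimal sketch of the standard DPO loss for a single preference pair. This is not Meta's training code: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are taken to be summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch only).

    All names and the default beta are hypothetical; they are not taken
    from Meta's implementation.
    """
    # How much more likely each response is under the policy than under
    # the frozen reference model (in log space).
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # The loss is -log(sigmoid(x)) with x = beta * (difference of margins).
    x = beta * (chosen_margin - rejected_margin)

    # Numerically stable form of log(1 + exp(-x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy and the reference model agree, the loss equals log 2; it falls below that value as the policy shifts probability toward the chosen response relative to the reference, which is the behavior each post-training round is meant to encourage.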
Alongside the release of Llama 3.1 405B, Mark Zuckerberg published a statement titled "Open Source AI Is the Path Forward", once again emphasizing the significance and value of open-source large models and taking direct aim at large-model companies such as OpenAI that have chosen the closed-source path.
Zuckerberg recounted the story of open-source Linux and closed-source Unix, noting that Linux came to support more features and a broader ecosystem, and is now the industry-standard foundation for cloud computing and for the operating systems that run most mobile devices. "I believe that artificial intelligence will develop in a similar way," he said.
He pointed out that several technology companies are developing leading closed-source models, but open-source models are rapidly closing the gap. The most direct evidence is that Llama 2 was only comparable to an older generation of models, whereas Llama 3 is competitive with the latest models and leads in certain areas.
He expects that starting next year, future Llama models will become the most advanced in the industry, and even before then, Llama already leads in openness, modifiability, and cost efficiency.
Zuckerberg cited many reasons why the world needs open-source models. For developers, beyond a more transparent development environment in which to train, fine-tune, and distill their own models, another important factor is the need for an efficient and affordable model.
He explained that for user-facing and offline inference tasks, developers can run Llama 3.1 405B on their own infrastructure at roughly 50% of the cost of closed-source models such as GPT-4o.
The industry has debated the open-source and closed-source paths extensively before, but the prevailing view at the time was that each has its own value. Open source benefits developers cost-effectively and supports the technical iteration and development of large language models themselves, while closed source can concentrate resources to break through performance bottlenecks faster and more thoroughly, and is considered more likely than open source to be the first to achieve artificial general intelligence (AGI).
In other words, the industry has generally believed that open source struggles to catch up with closed source in model performance. The emergence of Llama 3.1 405B may prompt the industry to reconsider this conclusion, which is likely to affect the many enterprises and developers already inclined to use closed-source model services.
Meta's ecosystem is already very large. With the launch of the Llama 3.1 models, over 25 partners will provide related services, including Amazon AWS, Nvidia, Databricks, Groq, Dell, Microsoft Azure, and Google Cloud, among others.
However, Zuckerberg does not expect the Llama series to take the lead until next year, and closed-source models could overtake it again in the meantime. During this period, attention may turn to the closed-source large models that cannot match the performance of Llama 3.1 405B, whose position is now somewhat awkward.
He also spoke specifically about the competition between China and the United States in large models, saying it is unrealistic to expect the United States to stay years ahead of China in this area. But even a small lead of a few months can compound over time, giving the United States a clear advantage.
"The advantage of the United States is decentralization and open innovation. Some people believe that we must close our models to prevent China from acquiring them, but I think this will not work and will only put the United States and its allies at a disadvantage." In Zuckerberg's view, a world with only closed models would leave a few large companies and geopolitical rivals with access to leading models, while startups, universities, and small businesses miss out. In addition, confining American innovation to closed development increases the chance that the United States does not lead at all.
"Instead, I believe our best strategy is to build a strong open ecosystem and have our leading companies work closely with the government and allies to ensure they can make the best use of the latest developments and achieve a sustainable first-mover advantage over the long term," said Zuckerberg.
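CandyLake.com is an information publishing platform and provides information storage space services only.
Disclaimer: The views in this article are those of the author alone. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.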