Meta releases its strongest open-source model to catch up with GPT-4; Zuckerberg: we will overtake next year
m337283
Posted on 2024-7-24 14:01:02
On July 23 Pacific Time, Meta (formerly Facebook) officially released its Llama 3.1 models in three sizes, 8B, 70B, and 405B, with the context length increased to 128K. According to benchmark data provided by Meta, the most anticipated of the three, the 405B model (405 billion parameters), is already comparable in performance to OpenAI's GPT-4 and to Claude 3 from the AI startup Anthropic. This means that top open-source models have caught up with top closed-source models in performance, and the battle between open and closed source may be drawing to a close.
Alongside the release, Zuckerberg published an open-source manifesto titled "Open Source Artificial Intelligence is the Way Forward". He wrote in the article: "Today, several tech companies are developing leading closed models. But open source is rapidly narrowing the gap."
Open-source Llama 3.1-405B outperforms closed-source GPT-4
According to Meta, Llama 3.1 was trained on more than 15 trillion tokens using 16,000 H100 GPUs, with pre-training data running through December 2023. To keep training stable, Meta made only minor adjustments to a standard Transformer architecture rather than adopting the popular mixture-of-experts (MoE) architecture.
Llama 3.1 currently supports multilingual dialogue, and all three sizes released this time (8B, 70B, and 405B) have a context length of 128K. Sima Huapeng, founder of Silicon-based Intelligence, commented that the information-processing capability of Llama 3.1 has greatly improved: "It's like going from being able to remember only 4,000 Chinese characters to being able to remember 64,000."
The industry has long debated open versus closed source. At the World AI Conference this month, Robin Li, founder, chairman, and CEO of Baidu, repeated on stage that "the commercial closed-source model is the best". Li said that open-source models are valuable in academic research and teaching, where they can be used to study how large models work and to develop theory. But in a fiercely competitive business environment, he argued, commercial closed-source models are the most effective way to achieve higher efficiency and lower costs than one's peers.
However, according to Meta's benchmark data, the open-source model is very capable this time. The 405B model (405 billion parameters), the most closely watched of the Llama 3.1 family, is already comparable in performance to GPT-4 and Claude 3, which means that top open-source models have caught up with flagship closed-source models.
It is also worth noting that this release is more thoroughly open. When Meta launched Llama 3 8B and Llama 3 70B in April this year, it still prohibited developers from using the models to train other generative models. Under the new license released this time, Meta no longer prohibits using the new models to improve other models.
Alongside Meta's launch, NVIDIA announced its new NVIDIA AI Foundry service and NVIDIA NIM inference microservices, which work with the newly released Llama 3.1 open-source models to support generative AI for enterprises worldwide. Reportedly, with NVIDIA AI Foundry, enterprises and countries can now use Llama 3.1 together with NVIDIA software, computing, and expertise to build custom "supermodels" for industry use cases in their specific fields.
Alongside the release, Zuckerberg also published an open letter titled "Open Source Artificial Intelligence is the Way Forward". He takes the early development of Linux (the operating system kernel) as an example. In the early days of high-performance computing, he notes, the major technology companies invested heavily in their own closed-source versions of Unix, and it was hard to imagine any other way of developing such advanced software. Ultimately, though, open-source Linux won out, initially because it let developers modify its code freely and was more affordable, and over time because it became more advanced, more secure, and supported by a broader ecosystem with more features than any closed-source Unix. Today, Linux is the industry-standard foundation for cloud computing and for the operating systems that run most mobile devices.
Zuckerberg said he believes artificial intelligence will develop in a similar way. Today, several technology companies are developing leading closed-source models, but open source is rapidly narrowing the gap. Last year, Meta released Llama 2, which was only comparable to an older, previous generation of models. This year, Llama 3 rivals the most advanced models and leads in certain areas. Starting next year, Meta expects future Llama models to become the most advanced in the industry.
Meta is further developing image, video, and voice capabilities for Llama 3
On why open source is better for developers, Zuckerberg listed what he has heard from developers, CEOs, and government officials around the world: they need to train, fine-tune, and distill their own models; they need some control over their models and do not want to be locked into a closed vendor; they need to protect their data and do not want to send it through cloud APIs to closed-source models; and they want to invest in an ecosystem that will become the long-term standard. Many also believe that open-source models are developing faster than closed-source ones.
Zuckerberg also said that, for Meta, open-sourcing its models better serves its vision of continuing to build the best user experiences. On whether open-sourcing will cost the Llama series its technological advantage, he answered in terms of the openness and completeness of the ecosystem and Meta's path to commercializing large models.
First, to ensure that Meta always has access to the best technology and is not locked into a closed ecosystem over the long term, Llama needs to grow into a complete ecosystem of tools, efficiency improvements, chip optimizations, and other integrations; if Meta were the only company using Llama, that ecosystem would not develop. Second, he expects AI development to remain highly competitive, which means that open-sourcing any given model does not give away a large advantage over the next-best models at that time; the path for Llama to become the industry standard is to stay competitive, efficient, and open generation after generation. Third, a key difference between Meta and closed-source model providers is that selling access to AI models is not Meta's business model. Releasing Llama openly therefore does not undercut Meta's revenue, sustainability, or ability to invest in research the way it would for closed-source providers, which is also one reason why some closed-source providers keep lobbying governments against open source.
Llama research scientist @astonzhangAZ also revealed on social media that the research team is currently considering integrating image, video, and voice capabilities into Llama 3, so that the model can recognize images and videos and support voice interaction.
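CandyLake.com is an information publishing platform and provides information storage services only.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.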