On July 23 (Pacific Time), Meta (formerly Facebook) officially released Llama 3.1 in three sizes, 8B, 70B, and 405B, with the context length increased to 128K tokens. According to benchmark data provided by Meta, the highly anticipated 405B model (405 billion parameters) is already comparable in performance to OpenAI's GPT-4 and to Claude 3 from the AI startup Anthropic. This suggests that top open-source models have caught up with top closed-source models in performance, and that the battle between open and closed source may be nearing its end.
Alongside the release, Zuckerberg published an open-source manifesto titled "Open Source AI Is the Path Forward". In it he wrote, "Today, several tech companies are developing leading closed models. But open source is rapidly narrowing the gap."
Open-source Llama 3.1 405B outperforms closed-source GPT-4
According to Meta, Llama 3.1 was trained on more than 15 trillion tokens of data using 16,000 H100 GPUs. The pre-training data runs up to December 2023. To keep training stable, Meta chose a standard Transformer architecture with only minor adjustments rather than the popular mixture-of-experts (MoE) architecture.
Llama 3.1 currently supports multilingual dialogue, and this release comes in three sizes, 8B, 70B, and 405B, with the context length increased to 128K. Sima Huapeng, founder of Silicon-Based Intelligence, commented that the information-processing capability of Llama 3.1 has improved greatly: "It's like going from being able to remember only 4,000 Chinese characters to being able to remember 64,000."
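The scale of the analogy can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the previous Llama 3 generation had an 8K-token context window (a figure not given in this article) and reads "128K" as 128 × 1024 tokens. The function name `growth_factor` is made up for this example.

```python
# Assumption (not stated in the article): the earlier Llama 3 models
# had an 8K-token context window, read here as 8 * 1024 tokens.
PREVIOUS_CONTEXT_TOKENS = 8 * 1024

# "128K" from the article, read as 128 * 1024 tokens.
NEW_CONTEXT_TOKENS = 128 * 1024


def growth_factor(old: int, new: int) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# Growth of the context window between the two generations.
context_growth = growth_factor(PREVIOUS_CONTEXT_TOKENS, NEW_CONTEXT_TOKENS)

# Growth implied by the "4,000 to 64,000 characters" analogy.
analogy_growth = growth_factor(4_000, 64_000)

print(f"context window growth: {context_growth:.0f}x")
print(f"analogy growth:        {analogy_growth:.0f}x")
```

Under these assumptions both ratios come out the same, so the analogy describes the relative jump in context length rather than a literal character count.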
The industry has long debated open versus closed source. At the World Artificial Intelligence Conference this month, Robin Li, founder, chairman, and CEO of Baidu, again argued on stage that "the commercial closed-source model is the best". Li said that open-source models are valuable in academic research and teaching, where they can be used to study how large models work and to build theory. In a fiercely competitive business environment, however, he argued that commercial closed-source models are the most effective way to achieve higher efficiency and lower costs than one's peers.
However, according to the benchmark data provided by Meta, the open-source model is very capable this time. The most closely watched model in the Llama 3.1 family, the 405B (405 billion parameters), is already comparable in performance to GPT-4 and Claude 3, which means that top open-source models have caught up with flagship closed-source models.
It is worth noting that this release is also more open than before. When it launched Llama 3 8B and Llama 3 70B in April this year, Meta still prohibited developers from using the models to train other generative models. Under the new license released this time, Meta no longer prohibits using the new models to improve other models.
Alongside Meta's launch, Nvidia announced its new NVIDIA AI Foundry service and NVIDIA NIM inference microservices, which, together with the newly released Llama 3.1 open-source models, are intended to support generative AI for enterprises worldwide. Reportedly, with NVIDIA AI Foundry, enterprises and countries can now use Llama 3.1 together with NVIDIA software, computing, and expertise to create custom "supermodels" for use cases in their own industries.
Alongside the product release, Zuckerberg published an open letter titled "Open Source AI Is the Path Forward". He uses the early history of Linux, the operating system kernel, as an example. In the early days of high-performance computing, he writes, major technology companies invested heavily in developing their own closed-source versions of Unix, and it was hard to imagine any other way of developing such advanced software. Ultimately, though, open-source Linux gained popularity, initially because it let developers modify the code freely and was more affordable, and over time because it became more advanced, more secure, and backed by a wider ecosystem that supported more features than any closed-source Unix. Today, Linux is the industry-standard foundation for cloud computing and for the operating systems that run most mobile devices.
Zuckerberg said he believes that artificial intelligence will develop in a similar way. Today, several technology companies are developing leading closed-source models, but open source is rapidly narrowing the gap. He noted that Llama 2, released last year, was only comparable to an older generation of models, whereas this year Llama 3 rivals the most advanced models and leads in some areas. Starting next year, Meta expects future Llama models to become the most advanced in the industry.
We are further developing the image, video, and voice functions of Llama 3
On why open source is better for developers, Zuckerberg listed what he has heard in conversations with developers, CEOs, and government officials around the world. They need to train, fine-tune, and distill their own models. They want control over their models and do not want to be locked into a closed vendor. They want to protect their data and do not want to send it to closed-source models through cloud APIs. They also want to invest in the ecosystem that will become the long-term standard, and many of them see open-source models advancing faster than closed-source ones.
Zuckerberg also said that, for Meta, open source better serves the company's goal of continuing to build the best user experiences. On the question of whether open-sourcing will cost the Llama family its technological advantage, he answered in terms of the openness of the ecosystem and Meta's path to commercializing large models.
First, to ensure that Meta always has access to the best technology and is not locked into a closed ecosystem in the long run, Llama needs to grow into a complete ecosystem of tools, efficiency improvements, chip optimizations, and other integrations. If Meta were the only company using Llama, that ecosystem would not develop. Second, Zuckerberg expects AI development to remain highly competitive, which means that open-sourcing any given model does not give away a large advantage over the next best models at that time. The path for Llama to become the industry standard is to stay competitive, efficient, and open generation after generation. Third, a key difference between Meta and closed-source model providers is that selling access to AI models is not Meta's business model. Releasing Llama openly therefore does not undercut Meta's revenue, sustainability, or ability to invest in research the way it would for closed-source providers, which, he adds, is one reason some closed-source providers keep lobbying governments against open source.
Aston Zhang (@astonzhangAZ), a research scientist on the Llama team, also revealed on social media that the team is working on integrating image, video, and voice capabilities into Llama 3, so that the model can recognize images and videos and support voice interaction.