Baidu's Shen Dou: Traditional cloud computing is no longer the protagonist; intelligent computing calls for a new generation of "operating system"

On April 16th, the Create 2024 Baidu AI Developer Conference was held in Shenzhen.
During the conference, Shen Dou, Executive Vice President of Baidu Group and President of Baidu AI Cloud Group, officially released Wanyuan, a new-generation intelligent computing operating system. By abstracting and encapsulating the intelligent computing platform of the AI-native era, Wanyuan shields users from the complexity of cloud-native systems and heterogeneous computing power, and improves the efficiency and experience of AI-native application development.
Shen Dou stated that with the continuous evolution of large model technology, programming in natural language is becoming a reality. Programming will no longer be process-oriented or object-oriented, but requirement-oriented: it will become a process in which developers express what they want, and this will bring revolutionary changes to the operating system. In the kernel of the operating system, the underlying hardware has shifted from CPU computing power to GPU computing power, and the world knowledge compressed by large models has been added. The objects managed by the operating system have also changed fundamentally, evolving from managing processes and microservices to managing intelligence.
"Traditional cloud computing systems are still important, but they are no longer the protagonist. We need a brand new operating system that abstracts and encapsulates new computing platforms, namely intelligent computing, redefines human-computer interaction, and provides developers with a simpler and smoother development experience," said Shen Dou.
At this conference, Baidu AI Cloud launched the "Wanyuan" intelligent computing operating system, which aims to "bridge" computing efficiency and application innovation. Specifically, Wanyuan consists of three layers: Kernel, Shell, and Toolkit. Facing downward, it shields the complexity of cloud-native systems and heterogeneous computing power; facing upward, it supports the agile development of AI-native applications.
First, at the Kernel layer, in terms of computing resource management, the Baidu Baige AI heterogeneous computing platform has specifically optimized the design, scheduling, and fault tolerance of intelligent computing clusters for tasks such as large model training and inference. At present, Baige achieves an effective training time of over 98.8% on 10,000-card clusters, with the linear speedup ratio and effective bandwidth utilization both reaching 95%, leading the industry in computing power efficiency.
In addition, Baige is compatible with mainstream domestic and international AI chips such as Kunlun, Ascend, Hygon DCU, Nvidia, and Intel, allowing users to adapt to different computing power at minimal cost. Compared with model inference, "one cloud, multiple chips" is far harder to achieve in model training scenarios, which fall into two sub-scenarios: (1) the intelligent computing cluster runs multiple training tasks, and chips from a single vendor serve each task; (2) chips from different vendors are used simultaneously within a single model training task. The latter requires solving problems such as dividing the workload in a balanced way across chips from different manufacturers and optimizing communication efficiency between chips, which is extremely difficult.
It is reported that Baige can now mix chips from different manufacturers within a single training task, with a performance loss of no more than 3% at the 100-card scale and no more than 5% at the 1,000-card scale, leading the industry. This shields hardware differences as far as possible, helping users break free from dependence on a single chip, achieve better costs, and build a more flexible supply chain.
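The article does not describe how Baige balances work across vendors. As a rough illustration of the balancing problem only, the sketch below splits a global batch across two chip groups in proportion to their measured throughput, so that each group finishes a step at about the same time. The function name, vendor labels, and throughput figures are all hypothetical and are not taken from Baige.

```python
def split_batch(global_batch, throughputs):
    """Split a global batch across chip groups in proportion to throughput.

    throughputs maps a (hypothetical) group name to samples per second.
    Leftover samples from rounding go to the groups with the largest
    fractional remainder, so the shares always sum to global_batch.
    """
    total = sum(throughputs.values())
    exact = {name: global_batch * t / total for name, t in throughputs.items()}
    shares = {name: int(value) for name, value in exact.items()}
    leftover = global_batch - sum(shares.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical throughput figures, for illustration only.
groups = {"vendor_a": 300.0, "vendor_b": 200.0}
shares = split_batch(1024, groups)
print(shares)  # {'vendor_a': 614, 'vendor_b': 410}
```

With these numbers each group needs roughly the same time per step (614/300 and 410/200 seconds), which is the property a balanced split is after. A real system would also have to account for communication cost between chips, which this sketch ignores.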
Another important component of the Wanyuan kernel is the large model. Large models efficiently compress vast amounts of world knowledge and encapsulate natural language understanding, generation, logic, and memory capabilities. At present, the Wanyuan kernel includes the industry-leading ERNIE 4.0 and ERNIE 3.5 large language models, lightweight models such as ERNIE Speed/Lite/Tiny, textual and visual models, and a variety of distinctive third-party models, meeting the diverse needs of users in different business scenarios.
On top of the Kernel layer is the Shell layer. Baidu AI Cloud Qianfan ModelBuilder handles the management, scheduling, and secondary development of the models in the kernel, hides the complexity of model development, and helps more people quickly fine-tune models suited to their own business with only a small amount of data, resources, and effort. In practical applications, the model routing service provided by ModelBuilder can automatically select models of an appropriate parameter scale for tasks of different difficulty, providing the model combination that best balances effectiveness and cost. According to Baidu's calculations, when model performance is basically the same, model routing can reduce the average inference cost by up to 30%.
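The idea behind model routing can be sketched in a few lines. This is a minimal sketch, not ModelBuilder's actual API or algorithm: the model names, difficulty scores, and costs are hypothetical, and it assumes some upstream component has already scored each task's difficulty between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str              # hypothetical label, not a real model ID
    max_difficulty: float  # hardest task (0..1) the model is assumed to handle
    cost_per_call: float   # illustrative relative cost


MODELS = [
    Model("small", 0.3, 1.0),
    Model("medium", 0.7, 4.0),
    Model("large", 1.0, 10.0),
]


def route(difficulty, models=MODELS):
    """Return the cheapest model assumed capable of the given difficulty."""
    capable = [m for m in models if m.max_difficulty >= difficulty]
    if not capable:
        raise ValueError(f"no model can handle difficulty {difficulty}")
    return min(capable, key=lambda m: m.cost_per_call)


def average_cost(difficulties, models=MODELS):
    """Average per-call cost when every task is routed individually."""
    return sum(route(d, models).cost_per_call for d in difficulties) / len(difficulties)


tasks = [0.2, 0.5, 0.9]
print([route(d).name for d in tasks])  # ['small', 'medium', 'large']
print(average_cost(tasks))             # 5.0
```

Sending all three tasks to the largest model would cost 10.0 per call on average in this toy setup, so routing halves the cost here. The saving in practice depends entirely on the task mix and on how reliably difficulty can be estimated.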
On top of the Shell layer, Qianfan AppBuilder and AgentBuilder together form the Toolkit layer, providing developers with powerful AI-native application development capabilities. In particular, with the workflow orchestration function provided by AppBuilder, developers can customize their business processes using preset templates and components. They can also integrate and extend their own components, select suitable models at different nodes, and implement business logic through flexible orchestration.
It is reported that when developing AI-native applications on AppBuilder, models fine-tuned through ModelBuilder can be called directly, making the entire development process smooth and convenient. Once an application is developed, it can be published with one click to Baidu Search, WeChat Official Accounts, and other platforms, or integrated directly into the user's own system through an API or SDK, achieving rapid development and easy launch.
Shen Dou stated that, as an open operating system, Wanyuan will further open up ecosystem cooperation in the future: providing application developers with more capabilities and interfaces; helping enterprises create dedicated operating systems for vertical industries; deploying Wanyuan in customers' own intelligent computing centers to provide stable, secure, and efficient intelligent computing platform services; and adapting to more heterogeneous chips from different manufacturers while maximizing their performance.
Shen Dou believes that large model technology and AI-native applications are driving cloud services toward a new generation of intelligent computing operating systems with AI at the core. This trend reflects both the inherent logic of technological development and the strong pull of market demand, and it opens a new era of AI-driven intelligent cloud.
