ByteDance and Alibaba start another price war, with Baidu close behind! What does the price list look like now that large models sell at "cabbage prices"?

The large-model price war has gone almost berserk.
On the morning of May 21, Alibaba Cloud announced a 97% price cut for Qwen-Long, the GPT-4-class flagship model of Tongyi Qianwen; 1 yuan now buys 2 million tokens.
The move appears aimed squarely at ByteDance. On May 15, ByteDance had released its Doubao large model and announced that its main model was priced at 0.0008 yuan per thousand tokens, 99.3% below the industry average. With Alibaba Cloud's cut, the Qwen-Long API input price falls from 0.02 yuan per thousand tokens (a token is a unit of text) to 0.0005 yuan per thousand tokens.
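The figures quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the quoted prices are exact and are all expressed per thousand tokens. It uses exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction


def reduction_percent(old_price, new_price):
    """Percentage by which new_price is lower than old_price."""
    return (old_price - new_price) / old_price * 100


def tokens_per_yuan(price_per_thousand_tokens):
    """Number of tokens that 1 yuan buys at a per-thousand-token price."""
    return 1_000 / price_per_thousand_tokens


# Prices in yuan per thousand tokens, as quoted in the article.
qwen_long_old = Fraction("0.02")
qwen_long_new = Fraction("0.0005")
doubao_main = Fraction("0.0008")

print(float(reduction_percent(qwen_long_old, qwen_long_new)))  # 97.5
print(int(tokens_per_yuan(qwen_long_new)))                     # 2000000
print(int(tokens_per_yuan(doubao_main)))                       # 1250000
```

Under these assumptions the cut works out to 97.5%, consistent with the announced 97%, and 1 yuan buys 2 million Qwen-Long input tokens versus 1.25 million tokens at the Doubao price.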
Even with a 97% cut, Alibaba Cloud's price advantage lasted only a few hours. On the afternoon of May 21, Baidu AI Cloud announced that two main models of its Wenxin large model family, ERNIE Speed and ERNIE Lite, were free of charge, effective immediately.
"As long as someone cuts prices, we have to follow suit, otherwise we may fall behind," Zhang Junlin, head of new technology research and development at Sina Weibo, told First Financial reporters when asked about the recent price cuts by large-model vendors. Behind the fierce price war, large-model vendors have many reasons to follow.
Large models trigger a wave of price reductions
In fact, signs of this round of large-model price cuts have been visible since early May.
On May 6, DeepSeek, a subsidiary of High-Flyer Quant, released DeepSeek-V2, its second-generation MoE (mixture-of-experts) model. The API (interface) is priced at 1 yuan per million input tokens and 2 yuan per million output tokens (32K context), roughly one hundredth of the price of GPT-4-Turbo.
On May 13, the Zhipu large-model open platform launched a new pricing system, cutting the call price of its entry-level GLM-3 Turbo model by 80% to 1 yuan per million tokens. OpenAI then launched GPT-4o at half the price of GPT-4 Turbo, charging $5 per million input tokens and $15 per million output tokens. ByteDance's Doubao large model then joined the price cuts as well.
With Alibaba Cloud now cutting prices to varying degrees on four commercial and three open-source Tongyi Qianwen models, and Baidu making two main Wenxin models entirely free, large-model vendors appear to be squeezing out their own profit margins.
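The prices quoted so far use different units (per thousand versus per million tokens), which makes them hard to compare at a glance. A minimal sketch of a normalized price list follows. It rests on assumptions that are not in the original: the Doubao and GLM-3 Turbo figures are treated as input prices, the two free ERNIE models are entered as 0, and all names in the code are hypothetical.

```python
from fractions import Fraction

# Prices quoted in the article as (amount in yuan, number of tokens it buys).
# Assumption: every figure is treated as an input-token price.
QUOTED_PRICES = {
    "ERNIE Speed": (Fraction(0), 1_000_000),
    "ERNIE Lite": (Fraction(0), 1_000_000),
    "Qwen-Long": (Fraction("0.0005"), 1_000),
    "Doubao main model": (Fraction("0.0008"), 1_000),
    "DeepSeek-V2": (Fraction(1), 1_000_000),
    "GLM-3 Turbo": (Fraction(1), 1_000_000),
}


def yuan_per_million(amount, tokens):
    """Convert a quoted price to yuan per million tokens."""
    return amount * 1_000_000 / tokens


def price_list(quotes):
    """Return (model, yuan per million tokens) pairs, cheapest first."""
    rows = [(name, yuan_per_million(*quote)) for name, quote in quotes.items()]
    return sorted(rows, key=lambda row: row[1])


for name, price in price_list(QUOTED_PRICES):
    print(f"{name:<20}{float(price):>6.2f} yuan per million tokens")
```

On these figures, Qwen-Long comes to 0.5 yuan and the Doubao main model to 0.8 yuan per million tokens, against 1 yuan for DeepSeek-V2 and GLM-3 Turbo. GPT-4o is left out because converting its dollar price would require an exchange rate the article does not give.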
Why can large-model prices fall so sharply? Alibaba Cloud said this is mainly due to the cost and performance advantages that come from the technological dividends and economies of scale of the public cloud, achieved through continuous optimization at both the model and AI infrastructure levels.
Canalys cloud analyst Zhang Yi told First Financial reporters that Chinese customers are particularly price-sensitive, and that vendors are cutting prices mainly to attract more customers to large models. In addition, many of the vendors cutting prices are also cloud providers, and their fundamental aim in lowering large-model prices is to drive cloud consumption.
With an aggressive strategy of repeatedly breaking price floors and even giving products away, large-model vendors are clearly far more determined to seize the market than to earn short-term profits. In Zhang Yi's analysis, Chinese vendors are good at entering a market with low prices and then spreading costs over volume. At present, the share of customers in China's B2B market that actually use AI is not high, and vendors hope price cuts will lower the threshold for adopting their large models.
On May 21, responding to Alibaba Cloud's price cut for the GPT-4-class flagship model of Tongyi Qianwen, a person in charge at ByteDance's Volcano Engine told a First Financial reporter that the company welcomes the price cut for the Tongyi Qianwen large model, which helps enterprises explore AI transformation at lower cost and accelerates the rollout of large-model application scenarios.
Reducing prices alone is not enough
Behind the large-model price cuts, it is worth noting that falling computing power costs are also an industry-wide trend.
Alibaba Cloud stated that its flexible AI computing power scheduling system, combined with the Bailian distributed inference acceleration engine, has optimized large-scale inference clusters, significantly reducing model inference costs and accelerating inference speed.
Tencent Cloud also recently spoke of falling computing power costs for large models. Tencent Group Vice President Jiang Jie said that, to work around the low computing power and small GPU memory of low-end cards, Tencent uses its self-developed Angel training and inference platform to schedule heterogeneous card clusters, reducing the inference cost of trillion-parameter models by 70% compared with open-source alternatives.
Volcano Engine President Tan Dai previously said that ByteDance has cut costs by optimizing the model structure, moving from single-machine inference to distributed inference, and using hybrid scheduling of cloud computing power. The person in charge of DeepSeek explained on Zhihu that DeepSeek-V2 relies on innovation in model structure to balance cost and effectiveness.
But beyond price cuts, the large-model field needs more new stories.
Liu Weiguang, Senior Vice President of Alibaba Cloud Intelligence Group and President of its Public Cloud Business Unit, said today, when discussing the industry trend of large-model price cuts, that price wars should follow basic market principles, and that price cuts must genuinely benefit the market, with the goal of promoting its development rather than serving as a gimmick to attract traffic.
What kind of enterprise can use price cuts to benefit the market and promote its development? Liu Weiguang mentioned four criteria: first, the model's foundational capability must be sufficiently advanced; second, the vendor must have real inference resources behind the model; third, the model must already be used by many customers to generate commercial value; and finally, large models must be the company's main business, and the enterprise must have strategic determination.
Asked how he views the large-model price war and whether Tencent will cut prices, Wu Yunsheng, Vice President of Tencent Cloud and head of Tencent Cloud Intelligence, did not respond directly in a recent interview. "We have also been paying attention to the situation in the industry these days. In fact, we have invested a lot of energy into improving the capabilities of the underlying large models and enabling users to truly use them," Wu Yunsheng told media including First Financial.
The "last mile" problem still lies ahead for large models. In Zhang Yi's analysis, the price cuts may attract some customers in the short term, but in the long run large models still face the problem of how to deliver more value in real-world deployment in order to attract more users.
What value can large models bring to customers' actual business? Whether AI-driven productivity can help customers cut costs and improve efficiency and real returns in day-to-day operations is another form of cost saving for customers, and an important question in the commercialization of large models. The story of the value generated by deploying large models, both in China and abroad, is still incomplete.