JD technology leaders: Large models will get smaller, even specialized down to individual scenarios
四夜父脚群
Posted on 2024-7-31 19:01:30
"General-purpose large models are built on computing power, while enterprise large models are proven by running in real business."
On July 30, at the JD Cloud Summit in Shanghai, Cao Peng, Chairman of the JD Group Technical Committee and President of the JD Cloud Business Unit, made the statement above. In his view, data is the nourishment for large models and scenarios are their training ground.
Over the past year, the large-model craze has not let up, and the industry has gone through a "thousand-model war". According to statistics from the China Academy of Information and Communications Technology, there are now more than 1,000 foundation models worldwide, and China accounts for 35% of them.
Although the performance of foundation models keeps improving, large models have yet to produce a true super app on the consumer side. In many enterprise scenarios, by contrast, they are gradually being deployed in real applications.
At the summit, JD Cloud showcased the latest industry deployments of JD's Yanxi large model and released eight products, including JD Cloud Enterprise Large Model Service, Yanxi Intelligent Agent Platform, the intelligent programming assistant JoyCoder, and Yanxi Digital Human 3.0.
According to data provided by JD.com, JD's large model has so far been deployed in more than a hundred scenarios across industries such as healthcare, e-commerce live streaming, logistics, and finance. Many of JD's own couriers, merchants, doctors, procurement and sales staff, and R&D staff already receive support from large-model applications.
Take "Jingyi Qianxun", which serves medical scenarios. According to the head of JD Health's Intelligent Algorithm Department, it currently runs four models of different sizes internally. The first is a small model of about 2B parameters that provides a single service in a narrow domain; the team envisions that it could even run on mobile phones in the future. The second tier consists of medium-sized models, mainly 14B and 22B, which handle some medical consultation and service support work. Finally, there is a large model of around 80B that specializes in complex medical decision-making and reasoning.
These models support private deployment and even integrated deployment, a choice driven by the characteristics of the industry. "It is difficult for the medical industry to accept a fully cloud-based model, and few hospitals can accept that kind of breakthrough," said the person in charge.
According to the person in charge, in actual hospital deployments Jingyi Qianxun focuses on completing patient services independently and in a compliant way, including triage, pre-consultation, registration, appointment booking, in-consultation accompaniment, and post-consultation health management.
"From the first day GPT was released, what everyone noticed was this generation's natural conversational ability and its so-called human-like ability. From that perspective, becoming a better assistant to doctors is more valuable than becoming a diagnostic tool for them," the person in charge emphasized.
In beauty, unlike the pure live streaming of the past, JD.com is now experimenting internally with combining digital-human makeup try-ons and digital-human hosts. In footwear and apparel, a digital human will live stream in front while human hosts change outfits behind the scenes. Live streaming styles tailored to the attributes of each category will be transferred to digital humans.
On the direction of large models, several JD.com technology leaders said that models will keep getting smaller. Vertical large models are a relatively certain direction, and they can be refined further into scenario-specific models. The underlying logic is that large models must adapt to scenarios and industries, so they cannot be too large.
He Xiaodong, Dean of the JD Exploration Research Institute and head of JD Technology's artificial intelligence business, believes that because data and computing power are limited, simply scaling up models may soon hit a ceiling. At that point the economic returns of a large model would no longer cover its own costs, making the approach hard to sustain.
Large models are growing roughly tenfold a year, with parameter counts rising from billions to trillions. Commercialization, however, is lagging behind, and that will become a problem in the medium to long term. He also pointed out that the hallucination rate of many models remains high, which makes it hard to give solid guarantees for future industrial applications.
According to He Xiaodong, JD.com's approach to model self-evolution starts from an initial policy model. It first constructs an initial preference dataset, then uses a pre-trained reward model to score each answer. New preference data is built from the high-scoring and low-scoring answers, which greatly speeds up model iteration and updates.
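The scoring step described above can be sketched as follows. This is a minimal illustration, not JD.com's implementation: `build_preference_pairs`, `toy_generate`, and `toy_reward` are hypothetical names, the "best versus worst" pairing rule is an assumption, and the toy reward simply prefers longer answers so the example runs without any model.

```python
from typing import Callable, Dict, List


def build_preference_pairs(
    prompts: List[str],
    generate: Callable[[str, int], List[str]],
    reward: Callable[[str, str], float],
    num_candidates: int = 4,
) -> List[Dict[str, str]]:
    """Turn reward-model scores into (chosen, rejected) preference pairs."""
    pairs = []
    for prompt in prompts:
        # Sample several candidate answers from the current policy model.
        answers = generate(prompt, num_candidates)
        # Score every answer once with the reward model, lowest score first.
        scored = sorted((reward(prompt, a), a) for a in answers)
        (low_score, worst), (high_score, best) = scored[0], scored[-1]
        # Skip prompts where the reward model cannot separate the answers.
        if high_score > low_score:
            pairs.append({"prompt": prompt, "chosen": best, "rejected": worst})
    return pairs


# Toy stand-ins (assumptions, purely illustrative) for the policy and reward models.
def toy_generate(prompt: str, n: int) -> List[str]:
    return [prompt + " answer" + " detail" * i for i in range(n)]


def toy_reward(prompt: str, answer: str) -> float:
    return float(len(answer))  # pretend longer answers are better


pairs = build_preference_pairs(["q"], toy_generate, toy_reward, num_candidates=3)
print(pairs)
```

The resulting pairs would then feed the next round of training. The article does not say which training method consumes them.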
On inference, the cost of running large language models is currently soaring. JD.com has therefore improved efficiency with end-to-end, low-bit, high-precision quantization, which shrinks the model and speeds up inference without reducing output accuracy or parameter count. He Xiaodong said the current technical solution saves 70% of the model's GPU memory.
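To make the idea of low-bit quantization concrete, here is a minimal sketch of symmetric integer quantization plus a back-of-the-envelope memory estimate. It is not JD.com's method: the article gives no bit widths, so the 16-bit baseline, 4-bit weights, group size of 128, and 16-bit scales are all assumptions, and every function name is hypothetical.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], bits: int) -> Tuple[List[int], float]:
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]


def memory_saving(baseline_bits: int, quant_bits: int, group_size: int,
                  scale_bits: int = 16) -> float:
    """Fraction of weight memory saved, counting one scale per weight group."""
    effective_bits = quant_bits + scale_bits / group_size
    return 1.0 - effective_bits / baseline_bits


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric(weights, bits=4)
restored = dequantize(q, scale)
# Rounding to the nearest integer bounds the per-weight error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
print(memory_saving(baseline_bits=16, quant_bits=4, group_size=128))
```

Under these assumed settings the weight memory shrinks by about 74%, the same order as the 70% figure quoted above, while the parameter count is unchanged because every weight is still stored, only at lower precision.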
On deploying large models in enterprises, Cao Peng sees three key points. The first is simplicity: scenarios are diverse and fragmented and cannot bear high development costs, so the threshold for using large models must be lowered as far as possible in order to cover more applications. The second is openness: building on an open agent ecosystem, large-model ecosystem, and cloud-native ecosystem gives customers the right to choose. The third is security: data security and privacy protection, AIGC content compliance, and secure management of corpus data make enterprise large-model services trustworthy and reliable.
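CandyLake.com is an information publishing platform and provides only information storage services.
Disclaimer: The views in this article are the author's own and do not represent the position of CandyLake.com. They do not constitute advice; please treat them with caution.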