Apple Intelligence has finally launched overseas today. Apple plans to roll out a further batch of AI features in December, including ChatGPT integration in Siri and more powerful image generation tools. It remains unclear, however, when Apple Intelligence will reach the Chinese market.
In recent days, domestic Android manufacturers have also released a series of upgrades to on-device AI and their operating systems, and concepts such as AI agents and AI OS have flooded the market.
The window before Apple Intelligence arrives in China is arguably pushing domestic phone makers to sharpen the competition among their flagship models with AI selling points. Every manufacturer has made clear that it intends to build system-level AI, an AI OS, and AI agents.
Guo Tianxiang, a research manager at IDC China, told the Science and Technology Innovation Board Daily that the Android camp and Apple are taking a similar approach to AI, both focusing on on-device models and intelligent agents, and that China is not far behind in AI.
Can intelligent agents replace apps?
Siri, the iPhone's voice assistant, can carry out simple operations by voice command. But because its responses were mostly based on search results and showed very limited intelligence, it could not replace the app-based interaction model. With the development of large models, mobile assistants such as Siri are expected to become smarter and evolve from voice assistants into AI agents. To book a hotel for a trip, for example, you would no longer need to open an app; you could simply tell the AI agent what you want and let it complete the booking.
Asked whether intelligent agents will replace apps, Honor CEO Zhao Ming said: "Things are likely to develop in that direction, but apps and intelligent agents will certainly coexist for a considerable time. Usage habits and all sorts of unexpected experience barriers are involved, so the two must coexist for a long while, and perhaps permanently."
As a first step toward agent-style interaction, AI screen recognition has begun shipping on Android phones in China. The newly launched OPPO Find X8 offers a one-tap screen query feature that analyzes on-screen information and interacts with the user based on that content, providing answers and carrying out operations.
"For example, with a single tap the AI can recognize a photo of a scenic spot taken in daily life and tell you where it is and the story behind it. It may look simple, but it covers more than 16,000 3A-level scenic spots across the country, with data on the order of a million items used for specialized training," said Zhang Jun, product director of the OPPO AI Center.
Honor has released MagicOS 9.0, an AI operating system with built-in intelligent agents. Zhao Ming explained that the agents can now imitate a human by tapping the screen, reading and understanding what is on it, reasoning slowly, finding the key information, and then carrying out the corresponding operations. At present they fall into two categories: "autonomous driving" agents and agents that cooperate with applications.
"An 'autonomous driving' agent works without any third-party involvement. It first analyzes and understands the user's intent. If I ask to order a drink, for example, the agent works out the information and logic behind that intent, breaks the scenario down, turns it into executable instructions, and finally completes the coffee order. The other kind requires cooperation from the application side, as with Honor and China Mobile's Lingxi large model: when the user checks a phone balance and tops up 50 yuan, Lingxi is called in to take over. Both kinds of agent are bound to coexist in the future. Some parts will need the ecosystem to step in, and some operations can be carried out automatically."
As for how AI interaction on phones will develop, many industry insiders believe the most intuitive and direct approach will eventually win out.
Guo Tianxiang said that screen recognition is a new form of interaction for AI smartphones, one that is more convenient for users and lowers the learning cost. As things stand, future AI interaction will still rely mainly on the simplest and most direct methods, starting from human instinct.
OPPO Chief Product Officer Liu Zuohu likewise regards intuitiveness as the most fundamental AI principle.
"I hold an AI meeting every week, and I keep drilling in one idea: whatever we build, it must first be intuitive," Liu Zuohu said. "Many products tend to show off their technology. Some things look simple but have demanding technical requirements behind them. Take the one-tap screen query: working out the user's intent and what is on the screen involves a lot of routing techniques. But in the end, technology has to come back to the user when you build the product. In navigation, for instance, if an address is on the screen when you open it, you can tap it directly to head to that destination. In the AI era, intuitive means more efficient, and that is the most basic AI principle."
On-device models: the difficulty lies in balancing experience and performance
Fitting a large model onto a phone brings boundless prospects along with challenges. The limited computing power of a phone means the on-device model cannot be too large, yet models with few parameters are limited in capability.
Guo Tianxiang said that today's on-device models no longer put so much emphasis on parameter count, and instead pursue a balance between user experience, memory usage, and power consumption.
Liu Zuohu acknowledged that on-device models place heavy demands on the hardware, in both compute performance and memory. Continuing to optimize the architecture and unlock the potential of chips at high energy efficiency therefore remains a long road.
"There is still a great deal that can be done right now," Liu Zuohu said. "Platform heat dissipation, for example, looks simple to many people but is actually difficult. There is also the question of how to handle low-level memory calls well, and so on. To be honest, AI is still in its infancy in the phone industry, and we will see a lot of AI-driven change in the future."
Zhang Jun revealed that OPPO is about to launch a new on-device architecture, AI LoRA, to reduce the use of memory and other resources.
"The biggest bottleneck for on-device AI is the use of the phone's computing resources," Zhang Jun said. "Running three features on a phone at the same time would normally take three sets of resources. If we compare a model to a locomotive, three models mean three locomotives plus their carriages. The LoRA architecture uses a base model plus application models. Only one base model is needed, that is, only one locomotive. The application models are the equivalent of three carriages that can be rotated in like the chambers of a revolver. Whichever model is needed can be loaded as a carriage, and this can cut peak memory usage by 75%."
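The base-model-plus-application-model arrangement Zhang Jun describes can be illustrated with a toy memory calculation. This is a minimal sketch, not OPPO's implementation: the class `OnDeviceRuntime`, the helper functions, the feature names, and every megabyte figure are hypothetical, chosen only to show why peak memory falls when one base model is shared and small task-specific modules are swapped in one at a time. The made-up numbers are not meant to reproduce the 75% figure quoted above.

```python
from typing import Dict, Optional


class OnDeviceRuntime:
    """Toy memory model: one shared base model plus a single swappable adapter slot."""

    def __init__(self, base_mb: float, adapters_mb: Dict[str, float]) -> None:
        self.base_mb = base_mb            # hypothetical size of the shared base model
        self.adapters_mb = adapters_mb    # hypothetical adapter sizes, keyed by feature
        self.loaded: Optional[str] = None  # adapter currently occupying the slot

    def load(self, name: str) -> None:
        """Swap the named adapter into the slot, replacing whatever was there."""
        if name not in self.adapters_mb:
            raise KeyError(f"unknown adapter: {name}")
        self.loaded = name

    def resident_mb(self) -> float:
        """Memory in use right now: base model plus the loaded adapter, if any."""
        extra = self.adapters_mb[self.loaded] if self.loaded is not None else 0.0
        return self.base_mb + extra

    def peak_mb(self) -> float:
        """Worst case: base model plus the largest single adapter."""
        return self.base_mb + max(self.adapters_mb.values(), default=0.0)


def separate_models_mb(model_mb: float, count: int) -> float:
    """Memory needed if every feature keeps its own full model resident."""
    return model_mb * count


def peak_saving(shared_peak_mb: float, separate_mb: float) -> float:
    """Fraction of peak memory saved by sharing one base model."""
    return 1.0 - shared_peak_mb / separate_mb


if __name__ == "__main__":
    runtime = OnDeviceRuntime(
        base_mb=1000.0,
        adapters_mb={"summarize": 50.0, "translate": 50.0, "rewrite": 50.0},
    )
    runtime.load("translate")
    separate = separate_models_mb(1000.0, 3)
    print(f"resident now: {runtime.resident_mb():.0f} MB")
    print(f"peak with shared base: {runtime.peak_mb():.0f} MB, separate: {separate:.0f} MB")
    print(f"peak saving: {peak_saving(runtime.peak_mb(), separate):.0%}")
```

With these illustrative sizes the shared setup peaks at 1,050 MB against 3,000 MB for three separate models. The actual saving on a real device depends on the relative sizes of the base model and the application modules, which the article does not give.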
In the post-AI smartphone era, intelligent agents will replace more manual operations
On the development of large models, the industry is generally cautious in the short term and optimistic in the long term. The same applies to deployment on devices.
"Change in the AI era is very fast," Liu Zuohu said. "In the past we planned mobile operating systems half a year or a year ahead. In the AI era that is definitely not how it works. Who knows what AI will look like a year from now? AI products cannot even be planned quarter by quarter; they should be planned month by month. The models change too quickly, and the technology moves faster than anyone imagined. To be honest, I feel a great sense of urgency myself."
Liu Zuohu stressed that building products in the AI era means moving fast: run fast and keep up with changes in the technology, or fall behind.
The China Academy of Information and Communications Technology recently released the world's first "Research Report on Terminal Intelligence Grading", which divides terminal intelligence into five levels, L1 to L5. The higher the level, the more the terminal acts autonomously and the less human involvement is required. L1 and L2 have a certain degree of intelligence and can complete single-type tasks. L3 and L4 progress from perceiving and recognizing complex intentions to recognizing potential intentions. L5 has comprehensive intelligence and can independently plan and complete all types of tasks.
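The five-level scale can be written down as a small ordered type, which makes the "higher level, more autonomy" ordering explicit. This is only a sketch based on the summary above: the names `TerminalIntelligenceLevel`, `SUMMARY`, and `levels_above` are hypothetical, the descriptions paraphrase this article rather than the report's official wording, and assigning "complex intentions" to L3 and "potential intentions" to L4 is one reading of that summary.

```python
from enum import IntEnum
from typing import Dict, List


class TerminalIntelligenceLevel(IntEnum):
    """L1-L5; a larger value means more terminal autonomy, less human involvement."""

    L1 = 1
    L2 = 2
    L3 = 3
    L4 = 4
    L5 = 5


# Paraphrases of the article's summary, not the report's own definitions.
SUMMARY: Dict[TerminalIntelligenceLevel, str] = {
    TerminalIntelligenceLevel.L1: "some intelligence; completes single-type tasks",
    TerminalIntelligenceLevel.L2: "some intelligence; completes single-type tasks",
    TerminalIntelligenceLevel.L3: "perceives and recognizes complex intentions",
    TerminalIntelligenceLevel.L4: "recognizes potential intentions",
    TerminalIntelligenceLevel.L5: "plans and completes all types of tasks independently",
}


def levels_above(current: TerminalIntelligenceLevel) -> List[TerminalIntelligenceLevel]:
    """Return the levels that still lie ahead of the given one, in order."""
    return [level for level in TerminalIntelligenceLevel if level > current]


if __name__ == "__main__":
    for level in levels_above(TerminalIntelligenceLevel.L3):
        print(level.name, "-", SUMMARY[level])
```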
Zhao Ming said that terminal intelligence currently stands at L3, and that reaching L4 and L5 will take more time and more accumulated work.
"Today we can cover 950 categories of user understanding," Zhao Ming said. "In the future we will certainly be able to cover many aspects of operating a phone, gradually eliminating the parts that require a lot of human intervention on traditional phones. Right now, making a phone call with a single sentence is no problem, starting a WeChat video call works, and ordering coffee can be done too. Going forward, we need to handle increasingly vague instructions, as well as more complex understanding of relationships."