
On the sixth day of its technology sharing event, OpenAI delivered something closer to the "heart": ChatGPT's advanced voice mode now supports real-time video calls, screen sharing, and image uploads.
Why is it said to be closer to the "heart"?
OpenAI CEO Sam Altman previously revealed in an interview hosted by Salesforce that his favorite AI movie is "Her" (the story of a man who falls in love with his AI virtual assistant), saying that "the idea of a conversational language interface had incredible foresight." The Information reported that Altman hopes to eventually build a virtual assistant that can respond as quickly as the AI assistant in the film.
The AI girlfriend in "Her" represents an ultimate form of embodied intelligence: one that can interact with humans without barriers.
Previously, ChatGPT's DAN mode (short for "Do Anything Now") let the AI converse with users in a more casual register, and its emphasis on a "human touch" was striking: it not only enabled low-latency conversation but also imitated human tone and offered emotional support. This time, ChatGPT can not only listen and speak but also see, letting it "open its eyes to the world" through the camera.
CEO Sam Altman did not appear in this live session. Instead, four employees introduced the new features: Chief Product Officer Kevin Weil, Product Manager Jackie Shannon, and Michelle Qin and Rowan Zellers of OpenAI's multimodal technology team.
The real-time video call feature in advanced voice mode was the standout. After OpenAI team members greeted ChatGPT over video and introduced themselves, someone asked: what is the name of the colleague wearing reindeer antlers? ChatGPT answered accurately in the limited-edition Santa Claus voice, demonstrating its "memory."
Next, the team demonstrated ChatGPT teaching someone to use a pour-over coffee setup. Simply place a "video call" to ChatGPT, and it walks you through each step based on the equipment in front of you. Throughout the demonstration, ChatGPT's voice was natural and friendly, adjusting its tone and even laughing like a human.
Screen sharing lets ChatGPT "see" your screen, another real-time video understanding capability. Users simply tap the advanced voice mode icon in the bottom-right corner and select "Share Screen" from the drop-down menu to get targeted help.
After a team member shared his screen, ChatGPT read the messages on it and was asked for guidance on how to reply. Showing its "high emotional intelligence," ChatGPT suggested complimenting the other person's Christmas decorations.
Advanced voice mode reportedly supports more than 50 languages and offers nine lifelike output voices, each with its own distinctive tone and character. The underlying GPT-4o can not only convert speech to text but also understand and label other aspects of audio, such as breathing and emotion.
With real-time understanding of real-world scenes on top of its 50-plus languages, ChatGPT becomes not only a much better AI companion but also a more efficient and capable AI education tool.
These features roll out in the ChatGPT mobile app starting today and will reach all Team users, as well as most Plus and Pro users, over the next week.