After going viral, Google's large model is accused of faking its demo! Google admits the demonstration video was edited: "shortened for simplicity"

Gemini, the new model from technology giant Google, went viral overnight and was well received by the market. However, some analysts have pointed out that Google appears to have exaggerated its claims in Gemini's promotional materials.
On December 6th local time, Google announced the launch of Gemini, its "largest, most capable, and most versatile" new large language model. Gemini will be the first large model to run directly on a mobile phone, and will be used in Google's Pixel 8 Pro smartphone and its chatbot Bard. Gemini is seen as a direct response to GPT-4, the latest large model from the emerging AI giant OpenAI, and as a sign that Google, which had been put on the defensive by the chatbot ChatGPT, is finally back in the race.
According to Google, Gemini scored 90.0% on MMLU (Massive Multitask Language Understanding), making it the first model to surpass human experts on that benchmark. Gemini comes in three sizes: Gemini Ultra is positioned as a competitor to GPT-4, Gemini Pro outperforms GPT-3.5, and Gemini Nano is intended for specific tasks and mobile devices.
On the strength of this performance, Gemini went viral overnight and caught Wall Street's attention. On December 7th, the stock price of Google's parent company Alphabet (Nasdaq: GOOG) rose 5.31% to close at $136.93, its best showing since August 29th this year, giving it a total market value of $1.72 trillion.
Analysts at Bank of America pointed out on the 6th that Alphabet has been under some pressure this year because of concerns about Google's AI capabilities, and that a "model with a good brand image and strong competitiveness" may draw more consumers to Google search and boost sales of cloud services: "Data shows that Google has top-notch, non-replicable AI capabilities, which may have a positive impact on the company's stock trend in the first half of 2024."
Morgan Stanley analysts wrote in a report on the 6th that although the market showed no clear reaction to Gemini that day, it was still very "encouraging" to see Google's progress in this major technological shift. However, JPMorgan also pointed out that the path to monetizing large models in search remains uncertain, which may create some headwinds in the future.
In a report on the 7th, analysts at JPMorgan wrote, "Although it is still in its early stages of development, the launch of Gemini symbolizes a significant innovation by Google in the second year of widespread commercialization and dissemination of generative AI."
For now, Wall Street's attention is focused on how Google will commercialize Gemini across its business, especially in search, its most important business. Google plans to license Gemini to customers through Google Cloud later this month and to integrate it into other Google products and services in the coming months, but it has not yet announced a further commercialization strategy.
Analysts at Wells Fargo say that the launch of Gemini should be enough to quell the debate about "where Google should go in the AI field", but the key issue is how Google can use Gemini for profit: "In short, I think Google has proven that they still have some competitiveness."
KeyBanc analysts also stated that Gemini is the "peak" of Google's numerous AI announcements this year, but it will take time for AI to have a positive impact on Google's performance growth and profitability: "Gemini is still working hard to enter core products such as search, so we suggest patiently observing its impact."
In contrast to the broadly bullish view on Wall Street, voices in the technology sector have suggested that Gemini's promotion may amount to "exaggerated advertising".
Shortly after Gemini was launched on the 6th, some netizens pointed out problems in the promotional materials. For example, where Google claims that Gemini's MMLU score is higher than GPT-4's, it lists GPT-4's score as 86.4%. However, Google's 60-page technical report carries a small-print "CoT@32" annotation on the Gemini figure, indicating that it was obtained with chain-of-thought prompting, using 32 attempts from which the final result was selected. The GPT-4 figure, by contrast, was obtained with 5-shot prompting, in which the model is simply given five examples. Under that 5-shot standard, Gemini Ultra's result is actually 83.7%, lower than GPT-4's 86.4%.
Under the same CoT@32 method, GPT-4's score rises to 87.29%, although that is still lower than Gemini Ultra's.
Comparison of MMLU test scores between Gemini and GPT under various conditions. Source: Google
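To make the difference between a single-attempt score and a 32-attempt score more concrete, here is a minimal Python sketch of one common way to pick a final answer from several sampled answers: majority voting, with a fallback to a single default answer when the samples do not agree strongly enough. This is an illustration only, not Google's evaluation code. The function names `majority_vote` and `consensus_or_fallback`, the `threshold` value, and the sample data are all hypothetical, and whether this matches the exact selection rule behind the reported CoT@32 figures is an assumption.

```python
from collections import Counter


def majority_vote(samples):
    """Return the most common answer and the share of samples that gave it."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    answer, count = Counter(samples).most_common(1)[0]
    return answer, count / len(samples)


def consensus_or_fallback(samples, fallback_answer, threshold=0.5):
    """Use the majority answer if enough samples agree; otherwise fall back.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    answer, share = majority_vote(samples)
    return answer if share >= threshold else fallback_answer


# 32 hypothetical sampled answers to one multiple-choice question.
samples = ["B"] * 20 + ["C"] * 12
print(majority_vote(samples))
print(consensus_or_fallback(samples, fallback_answer="C"))
```

Because a multi-sample procedure like this gives the model many chances per question, its score is not directly comparable with a score from a single 5-shot attempt, which is the point the critics raised.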
Even if one accepts the response of Jeff Dean, chief scientist of Google DeepMind, that this presentation was only meant to compare two different methods, the doubts raised about Gemini's demonstration video are harder to rebut.
After launching Gemini, Google released a six-minute demonstration video showing a tester interacting with Gemini: having it recognize images and describe them in multiple languages, design a quiz game using a map, and play a cup-shuffling game and reasoning games, among other things. Throughout the video, Gemini responds very quickly, generates audio and pictures to support its answers, and uses colloquial and even humorous expressions. The result is striking.
However, some netizens soon noticed a disclaimer in the text at the start of the video and took it to imply that the footage consisted of carefully selected good results and was edited rather than recorded in real time. Google subsequently explained the multimodal interaction process in a blog post and indirectly acknowledged that the effects shown in the video could only be achieved by piecing together static images and multiple prompts.
For example, Google acknowledges in the post that, unlike the quick response to the hand-gesture guessing game shown in the video, Gemini only reached its conclusion when all three gestures were shown to it at once together with a hint that they form a game.
Screenshot from the official website.
One analysis argues that this is completely different from what the video suggests, because the video gives the impression that Gemini can observe the world around it and respond in real time, and that users can hold a smooth voice conversation with it. Ethan Mollick, a professor at the Wharton School, also demonstrated on the X platform that with static images and multiple prompts, Gemini's performance can be replicated using ChatGPT Plus.
Ethan Mollick showed ChatGPT Plus several screenshots from Google's demo video, and ChatGPT Plus gave similar answers.
As the controversy grew, Eli Collins, Vice President of Product at Google DeepMind, told foreign media that the duck-drawing demonstration in the video (in which a duck is sketched stroke by stroke and Gemini correctly interprets each step) is indeed a research-level capability that has not yet appeared in Google's actual products.
Oriol Vinyals, Vice President of Research and Deep Learning at Google DeepMind, also published a lengthy post on the X (formerly Twitter) platform explaining how the team made the video: "All user prompts and outputs in the video are real, only shortened for simplicity." Vinyals also stated, "The video shows what a multimodal user experience built with Gemini looks like. We did this to motivate developers."
However, Vinyals' response sparked further controversy. One netizen commented, "If you want to motivate developers, why not post real content? Shortened user prompts are not considered 'real'. This is not sincere and misleading."
One Google employee told foreign media that they believe the video paints an "unrealistic picture". Other employees said they were not surprised by the exaggerated demonstration, because they are used to companies overstating their products to some extent in marketing: "I think most employees who have used large language model technology know to take everything (in the demonstration) with a grain of salt."
Some foreign media believe that Google's "massive bureaucratic system and product managers at all levels" have so far kept it from launching products as nimbly as OpenAI. For a society still coming to terms with the impact of AI, that is not a bad thing. But Google's recent rapid progress should be viewed with a degree of caution.
