Nvidia brings a new AI model to 'revolutionize' the audio industry: capable of creating music and modifying vocals
芊芊551
Posted 4 days ago
According to reports, Nvidia has developed a new artificial intelligence (AI) model that can create sound effects, change how voices sound, and generate music from natural language prompts.
The model is named Fugatto, short for Foundational Generative Audio Transformer Opus 1, and is a research project. Nvidia has not announced any plans to release the technology, but it could have a wide-ranging impact on industries from music and entertainment to translation services.
Bryan Catanzaro, Vice President of Applied Deep Learning Research at Nvidia, said in an interview, "The most exciting thing about Fugatto is that it is a model you can ask to make sound in some way, which really opens up your imagination about what it could be used for."
He further explained that some other models on the market can synthesize speech and some can add sound effects to music, but Fugatto can do all of these things. Catanzaro said it can be seen as a complement to video and image generation models such as Stability AI's Stable Video Diffusion or OpenAI's Sora.
"The most fundamental improvement here is... we are able to use language to synthesize audio, which I believe opens up new prospects for tools that people can use to create amazing audio," he added.
According to Nvidia, Fugatto is the first foundation model with emergent capabilities, meaning it can combine elements it was trained on and follow "free-form instructions".
Specifically, the model can generate audio from standard text prompts and can also work with audio files you upload. So, if you have a recording of someone speaking, you can translate that person's words into another language while keeping the sound of their voice. You can also take a simple tune and make it sound like an orchestral performance, or add different beats to the music.
In addition, you can upload a document for the model to read aloud in any voice you like. More importantly, you can instruct the model to produce speech with a particular emotional tone.
However, Catanzaro also acknowledged that the model is not always perfect. Moreover, just like models that generate images and videos, Fugatto raises concerns among artists, sound engineers, and professionals in related fields. But Catanzaro said his hope is that the technology will help musicians.
"I hope this is a new tool for artists to explore," he said. "I think audio has always been a productive field of exploration. You know, when we acquire new audio tools, sometimes we acquire new forms of music."
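CandyLake.com is an information publishing platform and only provides information storage services.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.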