Udio

Developer(s): Udio
Initial release: 2024
Type: Generative artificial intelligence
Website: udio.com

Udio is a generative artificial intelligence model that produces music based on simple text prompts. It can generate vocals and instrumentation. Its free beta version was released publicly on April 10, 2024. Users can pay to subscribe monthly or annually to unlock more capabilities.

Udio was founded in December 2023 by a team of former Google DeepMind researchers headed by its CEO, David Ding. It received financial backing from the venture capital firm Andreessen Horowitz and the musicians will.i.am and Common, among others. Some critics praised its ability to create realistic-sounding vocals, while others raised concerns that its training data may have contained copyrighted music.

History

Udio was created in December 2023 by a team of four former Google DeepMind researchers, David Ding (Udio's CEO), Conor Durkan, Charlie Nash, and Yaroslav Ganin, as well as Andrew Sanchez. [1] [2] The venture capital firm Andreessen Horowitz; the music distributor UnitedMasters; musicians will.i.am, Tay Keith, and Common; investor Kevin Wall; Instagram cofounder Mike Krieger; and DeepMind researcher Oriol Vinyals all provided financial backing for Udio, which raised $10 million in seed funding. [3] After several months in a closed beta phase, it was released publicly as a beta on April 10, 2024, on the Udio website. [4] As of April 2024, it allows users to generate 600 songs per month for free. [5] Sanchez described it as "enabl[ing musicians] to create great music and ... to make money off of that music in the future". [1] Udio's release followed the releases of other text-to-music generators such as Suno AI and Stable Audio. [6]

Udio was used to create "BBL Drizzy", a song that went viral during the Drake–Kendrick Lamar feud and received over 23 million views on Twitter. [7]

Capabilities

Udio bases the songs it creates on text prompts, which can specify a genre (such as barbershop quartet, country, classical, hip hop, German pop, or hard rock, among others), lyrics, story direction, and other artists whose sound the song should be based on. Its lyrics are created with a large language model (LLM), while the process used to generate the music itself, as of April 2024, has not been disclosed. [8] The program generates two songs based on the prompt, and users can "remix" their songs with further text prompts. [9] Songs are first generated as roughly 30-second pieces and can be extended in additional 30-second increments. [5]

Reception

Mark Hachman, the senior editor of PC World, compared Udio to AI art generators and praised its ability to turn "a few rather poor lyrics" into a "rather catchy" song, also calling the vocals it generated "incredibly realistic and even emotional". [5] Sabrina Ortiz of ZDNET described the songs it generated as "impressive" and as sounding "as though they were produced professionally". She also called them "fuller and richer" than those of other text-to-music generators and said that Udio had "more personalization options" than they did. [4] Ryan Morrison of Tom's Guide wrote that Udio had "an uncanny ability to capture emotion in synthetic vocals" and was the only AI music generator "to have captured the passion, pain and spirit of a vocal performance". [10] He added that the program was geared toward "people with no or minimal musical ability". [2] Brian Hiatt of Rolling Stone wrote that Udio was "more customizable but also perhaps less intuitive to use" than Suno AI and added that "some early users have suggested that on average, Udio's output may sound crisper than Suno's". [1]

For Ars Technica, Benj Edwards wrote that Udio's generation capability was imperfect and "less impressive" than Suno AI's, noting that its songs were substantially shorter than Suno AI's. He also called the songs it produced "half-baked and almost nightmarish". [8] In response to the company's announcement of Udio's beta release on Twitter, Telefon Tel Aviv member Joshua Eustis tweeted that Udio was "an app to replace musicians" and called into question the data that it used. Udio has also been criticized online as "soulless" and for having the potential to create audio deepfakes. [9] [6] Lucas Ropek of Gizmodo stated that Udio was "full of acoustical nonsense" and that its songs were "extraordinarily bad". [11]

Critics of Udio have questioned what data was used to train it and whether that data included copyrighted music. Rolling Stone wrote that there was "substantial reason to believe" that both Udio and Suno AI were trained with copyrighted music, while Benj Edwards of Ars Technica wrote that its training data was "likely filled with copyrighted material". [8] [9] Udio does not directly recreate copyrighted songs when prompted to do so. [5] Ding has stated that Udio has "extensive automated copyright filters" and that the company is "continually refining [its] safeguards". [6]

References

  1. Hiatt, Brian (April 10, 2024). "AI-Music Arms Race: Meet Udio, the Other ChatGPT for Music". Rolling Stone. Retrieved April 15, 2024.
  2. Morrison, Ryan (April 10, 2024). "Meet Udio — the most realistic AI music creation tool I've ever tried". Tom's Guide. Retrieved April 15, 2024.
  3. Tencer, Daniel (April 10, 2024). "New AI-powered 'instant' music-making app Udio raises $10m; launches with backing from will.i.am, Common, UnitedMasters, a16z". Music Business Worldwide. Retrieved April 15, 2024.
  4. Ortiz, Sabrina (April 10, 2024). "Is Udio really the best AI music generator yet? I put it to the test and so can you". ZDNET. Retrieved April 15, 2024.
  5. Hachman, Mark (April 11, 2024). "Udio's AI music is my new obsession". PC World. Retrieved April 15, 2024.
  6. Nuñez, Michael (April 11, 2024). "Former Google DeepMind researchers launch AI-powered music creation app Udio". VentureBeat. Retrieved April 15, 2024.
  7. Lawrence, Andrew (May 9, 2024). "'I bet Drake heard it and laughed': BBL Drizzy is the real winner of the Drake-Kendrick feud". The Guardian. Retrieved May 12, 2024.
  8. Edwards, Benj (April 10, 2024). "New AI music generator Udio synthesizes realistic music on demand". Ars Technica. Retrieved April 15, 2024.
  9. Eede, Christian (April 12, 2024). "'Game-changing' new app generates music from text prompts". DJ Mag. Retrieved April 15, 2024.
  10. Morrison, Ryan (April 11, 2024). "Udio is a game changer for AI music — 9 best prompts to try now". Tom's Guide. Retrieved April 15, 2024.
  11. Ropek, Lucas (April 11, 2024). "Dune, the Broadway Musical and 8 Other Brain-Dead Songs From Udio's AI Music Generator". Gizmodo. Retrieved April 15, 2024.