Spotify has bigger plans for the technology behind its new AI DJ feature after seeing consumers' positive reaction to it. Launched ahead of the company's Stream On event in LA last week, AI DJ curates a personalized selection of music accompanied by spoken commentary delivered in a realistic-sounding, AI-generated voice. Under the hood, the feature uses the latest in AI technologies and large language models, as well as generative voice, all layered on top of Spotify's existing investments in personalization and machine learning.
Spotify believes these new tools shouldn't be limited to a single feature, so it is currently experimenting with other applications of the technology.
Although the highlight of Spotify's Stream On event was the revamp of the mobile app, which now centers on TikTok-like discovery feeds for music, podcasts, and audiobooks, AI DJ is also becoming a prominent part of the streaming experience. Introduced in late February to Spotify's Premium subscribers in the US and Canada, DJ is designed to get to know users so well that it can play whatever they want to hear at the press of a button.
With the app's redesign, the DJ will appear at the top of the screen under the Music sub-feed for subscribers, serving as a lean-back way to stream favorite music and as a way to push free users to upgrade.
To create the commentary that accompanies the music the DJ streams, Spotify says it draws on its own knowledge base and the insights of its in-house music experts. Using OpenAI's generative AI technology, the DJ is able to scale that commentary to the app's end users. And unlike ChatGPT, which tries to craft answers by distilling information found across the wider web, Spotify's more limited database of music knowledge helps ensure the DJ's commentary stays consistent and accurate.
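For a rough sense of what that grounding approach can look like in practice, here's a minimal sketch that feeds a small, curated set of facts to OpenAI's chat API and asks it to generate DJ-style commentary from those facts alone. The facts, prompt, and model name are illustrative assumptions, not Spotify's actual knowledge base or pipeline.

```python
# Minimal sketch: generate DJ-style commentary grounded only in curated facts.
# The facts, prompt, and model name below are illustrative assumptions,
# not Spotify's actual knowledge base or pipeline.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical editorial facts, standing in for an in-house knowledge base.
editorial_facts = [
    "The next track is a 2023 single the listener has saved twice this month.",
    "The artist is kicking off a North American tour this spring.",
]

prompt = (
    "You are a friendly radio DJ introducing the next song in two sentences. "
    "Use only the facts below; do not add any outside information.\n"
    "Facts:\n- " + "\n- ".join(editorial_facts)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model; the article doesn't name one
    messages=[{"role": "user", "content": prompt}],
    max_tokens=80,
)

print(response.choices[0].message.content)
```

Constraining the model to a curated set of facts, rather than whatever it absorbed from the open web, is what keeps the commentary on-script; the trade-off is that it can only talk about what the knowledge base contains.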
The actual music selections the DJ chooses come from Spotify's existing understanding of a user's preferences and interests, mirroring what would previously have been programmed into personalized playlists such as Discover Weekly and others.
AI DJ's voice, meanwhile, was created using technology Spotify acquired from Sonantic last year and is modeled on Spotify's Head of Cultural Partnerships Xavier "X" Jernigan, host of the company's now-defunct morning show podcast, "The Get Up." Surprisingly, the voice sounds very realistic and not at all robotic. (During Spotify's live event, Jernigan spoke alongside his AI double, and it was hard to tell the two apart. "I can listen to my voice all day," he joked.)
"The reason it sounds good – that's actually the goal of the Sonantic technology, the team we acquired. It's about the emotion of the voice," explained Spotify's Head of Personalization, Ziad Sultan, in an interview with TechCrunch after Stream On wrapped. "When you hear the AI DJ, you hear where the breathing stops. You hear the different intonations. You can hear the excitement for certain genres," he said.
A natural-sounding AI voice is nothing new, of course: Google impressed the world with its own human-sounding AI voice years ago. But its implementation in Duplex drew criticism, because the AI called businesses on behalf of the end user, initially without disclosing that it wasn't a real person. Spotify shouldn't face the same concern, given that the feature is explicitly labeled an "AI DJ."
To make Spotify's AI voice sound natural, Jernigan went into the studio with voice technology experts to create high-quality recordings. There, he was instructed to read various lines with different emotions, which were then fed into the AI model. Spotify didn't say how long this process took, or go into further detail, noting that the technology is still evolving and referring to it as the "secret sauce."
"From high-quality input with many different permutations, [Jernigan] then doesn't need to say anything – it's now all AI-generated," Sultan said of the generated voice. However, Jernigan occasionally pops into Spotify's writers' room to give feedback on how he would read a line, to ensure he continues to have input.
But while AI DJ is built using a combination of Sonantic and OpenAI technology, Spotify is also investing in in-house research to better understand the latest in AI and large language models.
"We have a research team working on the latest language models," Sultan told TechCrunch. In fact, the company has several hundred people working on personalization and machine learning. In the case of AI DJ, the team is using an OpenAI model, Sultan said. "But, in general, we have a large research team that understands all the possibilities of large language models, of generative voice, of personalization. It is moving fast," he said. "We want to be known for our AI expertise."
Spotify may or may not use its own in-house AI tech to power future developments, however. It may decide it makes more sense to work with a partner, as it's doing now with OpenAI. But it's too early to say.
"We always publish papers," Sultan said. "We will invest in the latest technologies – as you can imagine, in this industry, LLMs are the technology. So we will develop the skills."
With this foundational technology, Spotify can push forward in other areas involving AI, LLMs, and generative AI tech. Which areas will make their way into consumer products, the company has yet to say. (We hear a ChatGPT-like chatbot, however, is one of the options being experimented with.)
“We have not yet announced exact plans for when we will expand into new markets, new languages, etc. But it is a technology that is a platform. We can do it and we hope to share more as it develops,” Sultan said.
Early consumer feedback on the AI DJ is promising, Spotify says
The company didn't want to build a full suite of AI products before it knew how consumers would react to the DJ. Did people even want an AI DJ? Would they play along? None of that was clear. After all, Spotify's voice assistant ("Hey Spotify") has suffered from a lack of adoption.
But there are early signs that the DJ bet may pay off. Spotify tested the product internally with employees before launch, and usage and re-engagement metrics were "very good."
Public adoption, so far, has been consistent with what Spotify has seen internally, Sultan told us. That means there is potential to spin up future products using the same basic foundation.
“People spend hours every day with this product … it helps them with choices, with discovery, it tells them the next music they should listen to, and explains to them why … it’s seen as very positive, it’s emotional,” Sultan said.
In addition, Spotify shared that on the days users tuned in, they spent 25% of their listening time with the DJ, and more than half of first-time listeners returned to use the feature the next day. These metrics are still early, however, as the feature hasn't yet fully rolled out in the US and Canada. But they show promise, the company believes.
"I think it's a wonderful step toward building a relationship between really valuable products and users," said Sultan. But he warned that the challenge ahead is "finding the right application and then doing it right."
"In this case, we say it's an AI DJ for music. We created the writers' room for it. We put it in the hands of the users to do the exact job it needs to do. It works very well. But it's definitely fun to dream about what else we can do and how fast we can do it," he added.