Gustav Söderström has worked at Spotify for a long time; his first big project was leading the launch of its mobile app back in 2009. That makes him the perfect company leader to talk to about Spotify’s recent redesign, which introduces a visual, TikTok-like feed for discovering new content on the app’s homepage. As his boss, CEO Daniel Ek, put it last week, it’s “the biggest change Spotify has undergone since we introduced mobile.”
With the title of co-president and chief product and technology officer, Söderström is responsible for not only how Spotify looks and feels but also all of the AI work happening behind the scenes to power its increasingly important recommendations. According to Söderström, it turns out that improving those recommendations is actually at the heart of the big redesign. “I think companies that don’t have an efficient user interface for a machine learning world are not going to be able to leverage machine learning,” he told me on the newest episode of Decoder.
We spoke last week at Spotify’s Los Angeles headquarters during its Stream On event, a “State of the Union”-type presentation where new tools for podcasters, distribution deals with the likes of NPR and Patreon, and stats like Spotify now having more than 500 million monthly listeners were announced. As someone who shares Decoder’s love of org charts, Söderström talked in-depth about the reason for Spotify’s recent company restructuring that saw him elevated to the role of co-president alongside Alex Norström, who manages the business side of things. (Paid subscribers to my Command Line newsletter got an exclusive preview of our conversation last week. The next edition drops this Thursday.)
Söderström and I also spoke about the pressure that comes with doing big redesigns of widely used apps like Spotify, the cost cutting and shift to focus on operational efficiency that’s happening across the tech industry, and why it’s a good thing there are no lion pits in Sweden. Hope you enjoy.
The following transcript has been lightly edited for clarity.
Gustav Söderström is the co-president and the chief product and technology officer of Spotify. Welcome to Decoder.
Thank you so much. It’s great to be here.
I’m here with you at Spotify’s headquarters in LA where you all just hosted your big annual Stream On event. You announced a bunch of things, namely the biggest redesign of Spotify in a long time. We’re going to get into all of that, but first, this is an episode of Decoder. While I’m certainly not Nilay, I do have some Decoder questions for you. Nilay and I actually share a love of org charts, believe it or not.
I do too, actually. Let’s do it.
I figured you did. I have a feeling that this will naturally lead into the news you all announced today. First, I’d love to start with your story. You’ve been at Spotify for a long time — 14 years. If I’m correct, I think the first big product you worked on was bringing Spotify to mobile.
That’s right. The paid mobile application.
Right. The company had about 30 employees at the time. Could you give me a sense of your time at Spotify and how your role has evolved over all these years?
Yeah, sure. I didn’t come to Spotify as part of the founding team. I came when there were about 30 people and the desktop app had already existed for free in Sweden. Spotify was competing with piracy, and since piracy was free, we had to be free. At that time, with the government-provided internet in Sweden, which had very low latency and high bandwidth, that illusion worked on desktop. It was actually faster to play music using Spotify than it was to use Napster.
I had done a few startups and sold a few companies to American companies like Yahoo, so my thing at the time was mobile. That’s not a thing anymore — if you don’t do mobile then you basically don’t do internet. But back then, it was like a special skill that was in demand, so I was lucky to have that.
Daniel [Ek, Spotify CEO] brought me on to the team so we could figure out what Spotify mobile should be. It was interesting. The first assumption, specifically from end users, was that the whole thing with Spotify was free streaming. That illusion of streaming over data didn’t work with the mobile networks we had at the time, like Edge or GPRS, once we started thinking about mobile. It didn’t work technically, and it certainly didn’t work financially. You’d get an SMS from your provider after half a song that said, “You’re out of data.”
We had to figure out what the experience was. Pretty quickly, we realized that it would have to work offline. We had to get new licenses from labels for this offline mode that didn’t exist, so that you could encrypt and keep music locally available with a key that the user wasn’t supposed to be able to decipher and that would then expire.
What was interesting for me was that there was a lot of technology innovation in making that performance seamless. When you added a file on your desktop computer, it would be silently and seamlessly synchronized so you could have it offline. It was also very much a business challenge with the labels to figure out what kind of license we needed and what that would cost. That’s what I came in to do: to head up mobile.
Over the years, I kind of label myself as a product guy, but I’m very interested in the combination of product and business models. I think there are product innovations that are interesting, and almost all the really big product innovations happen to coincide with new business models. They go hand in hand. I think that has been true for Spotify as well. So, I headed up mobile and then started heading up all of product for Spotify. A few years later, I also started heading up technology in this CPO/CTO role for Spotify. My title now is co-president, but really, my responsibilities are those of a chief product officer and a chief technology officer.
Based on my understanding, it’s unusual to have the CPO also be the CTO. Have you seen that model at other companies?
It’s a good question. I don’t know. Maybe it is unusual. My background is in computer science and electrical engineering. I am genuinely interested in technology for the sake of technology, but I’ve also been a CEO and founder of my own company. I have that mindset, I guess. I enjoy technology and spend a lot of time on it. I’m not a great developer, unfortunately. No one lets me touch code at Spotify. Not anymore, at least.
Let’s talk about the reorg that was part of the layoffs you announced on January 23rd. Can you walk me through the reasoning for that reorg and what it is hoping to achieve? When you were named co-president along with Alex [Norström], what was going on there?
Yeah, absolutely. I know you and Nilay are interested in org charts, and so am I. You follow Steven Sinofsky and Microsoft back and forth between your functional business units to see what works. The general advice is that there is no right or wrong org, but they do say you should probably stick to one. I have a slightly different version of that, which is that you can’t win with an org. There is no org that is good at everything. You could be in the worst place, where the org is bad at what’s really important to the company, yet good at the unimportant. We strive for the org to at least be good at the important things and kind of suck at the slightly less important things. That’s how we think about it.
I try to be clear that you can’t win on all dimensions. It’s always a trade-off. We wanted to achieve a few things with this organization that we have now. Part of the reason for the layoffs and the org change is the macroclimate in technology. Basically, funding and money were incredibly cheap for a very long time. People are now complaining that technology companies were exuberant. The other way to say it is that when money was cheap, it made all the sense in the world to be very aggressive and grab as many opportunities as possible. So we did. We got to half a billion users and we were number one in music and podcasts, and so forth. The world has changed and that’s not true anymore. Money and capital are very expensive. Spotify has to change like everyone else.
We need to become a much more efficient, focused company, in addition to being a great product. I mean, we’ve always known that the only way to build a great product for the very long term is also to be a great business. You can borrow money for a while and build a great product, but you want to be self-sufficient in the long term. There is no job guarantee except being profitable. That’s the way to think about it, so that’s where we are, and by we, I mean all of technology. It’s not unexpected. This is what all new businesses look like. You have this innovation phase with startups and then it starts to mature. That’s where I think we are.
That’s part of the reason, but there is a more intrinsic reason for this. We presented this thing called the Spotify machine on investor day last summer, and we told the investor community, even though it was open to everyone, that they should think of Spotify as a music application. To build that music application, we had to build a lot of technology. We had to build a consumer application where you can find content and subscribe to it, we had to build at least two monetization modalities with an advertising tier and a subscription tier, and we had to build recommendation engines for content, and so forth.
Then as we went into podcasts, we wanted to leverage all that infrastructure. Instead of building a separate app and starting over with zero users, we started with what was already hundreds of millions of music listeners with the bet of being able to upsell them on podcasting, which wasn’t that big at the time. Most of the growth in podcasting we’ve had is by actually creating new podcast users, rather than stealing existing ones from other podcast services. We made a bet on doubling down on our own audience and using our own distribution to our music listeners for podcasting.
Now we’re doing the same thing with audiobooks and bringing another format to our existing audience. That gives us an intrinsic distribution advantage. We already have the audience and know their tastes and habits around podcasts, which are very similar to audiobooks. It even turns out that music listening was predictive for podcast listening, which surprised me.
But it comes with a challenge. If you’re going to build this thing into the same application, you’re back to trade-offs. The trade-off is that you can’t just make the application more complicated. There are benefits to that, but there are also drawbacks. From a pure designer product point of view, it’s much easier to build a separate app because you can optimize for that 100 percent. On the flip side, you’re going to start with zero users. When we looked at podcasts, I think Apple Podcasts had 98.5 percent market share. It wasn’t that there weren’t a huge number of great podcast apps out there or that there weren’t better podcast apps than Apple’s. The problem was that no one used them. Distribution seemed to be the biggest problem, which is why we chose that trade-off.
So why am I saying this? Well, I’m saying this because if we’re going to build a Spotify machine, our biggest challenge is actually to make these different content types work inside a single application, while also keeping it easy to use and not complex.
I talk about two org extremes, Amazon on one side and Apple on the other. This isn’t necessarily true, but if you stereotype a little bit, Amazon is known for parallel teams. You run in parallel. You divide and conquer. You have the two-pizza teams and you’re unblocked from reaching the consumer. That also results in a consumer experience where you might see three search boxes from three different teams on the screen at the same time. But it works. It’s a trillion-dollar company. It’s not like there’s anything wrong with it, but it doesn’t optimize for simple user experience, it optimizes for speed.
I think Apple is the opposite. They ship much, much slower than Amazon, but no one gets to put their own search box there. It is centrally synchronized. They managed to build something very complex that still feels like it was built by very few people for a single user. We chose to adopt more of that. We needed to synchronize the company and eat that complexity of music, podcasts, and audiobooks — and potentially other things — for the user, instead of just shipping our org chart to the user, saying, “You figure it out.”
We built this org where we have three horizontal layers. We have a platform layer, which is the Spotify technology platform. We have the Spotify experience layer, which is all the applications, surfaces, mobile apps, cars, and desktops owned by a single person. Then we have a personalization layer. How do you choose between recommending a song, a podcast, or an audiobook for the same user at a certain moment? What is best for the user and for the company? We have these three horizontals that everything has to go through. It’s a synchronization function that actually slows things down, but then these people are forced to eat all that complexity that would otherwise end up with the end user.
So these vertical businesses — the podcast business, the music business, and the audiobook business — can’t actually just go and ship stuff to the user. They have to go through these synchronization functions. That was the big org change we haven’t spoken much about externally. It is all in service of being able to do this without drastically increasing the complexity for users, to keep it simple. So far, we think it’s working. We have quantitative metrics that say consumption on Spotify is way higher than any competitor, even though they only do one thing, like music.
To summarize what you just said, you’ve basically made the trade-off, “We may ship slower as an organization, but that’s okay because we’re optimizing for simplicity.”
Yes. That is what we’re saying. I think it’s true in those companies that it’s a trade-off you have to make. You can’t let everyone run fast in parallel and be simple at the same time. I often say that people come and ask me for more autonomy and more swim lanes. They want to own responsibility. This is a very natural thing in companies, especially American companies, I find. What I say is, “No, we’re not doing competitive swimming, we’re doing synchronized swimming.” It’s a much harder sport, much harder to execute, but much more beautiful to look at when it works. Synchronized swimming that doesn’t work is just a mess. So it’s a hard thing, but it is what we’re trying to do.
I’m glad you brought up the American angle actually, because I did notice the word “efficiency” being used in the note about the reorg from Daniel, Spotify’s CEO. Mark Zuckerberg has used it. It’s kind of become a buzzword in the tech community this year. I think it’s kind of the theme of the year, actually. Spotify is based in Sweden, but obviously has huge roots in the American tech community. It’s a global company. Is the conversation that we’re having in the American tech world also being had in Europe and other parts where you see Spotify operating? Because interest rates globally are going up, is this a global conversation? Is this more than just America?
It is a global conversation. Basically, the cost of capital has gone up globally, so that drives everything. There are local differences in Europe. As you probably know, electricity is very expensive here. You have slightly different consumer problems, so interest rates may differ slightly, but it’s largely the same conversation, I would say.
Got it. This is the Decoder question. You’ve kind of already walked through this a little bit, but I’m curious how you specifically, as the head of this large organization, make decisions. Do you have a framework? How do you make decisions?
So two things. One is back to the reorg and this change of Alex Norström and me becoming co-presidents. You can choose to either be more synchronized or more asynchronous. Both work, and both produce trillion-dollar companies. We chose to be more synchronized. That’s part of the reason why Daniel chose to concentrate all the functions into Alex and me.
Simplified, you can say I run product and technology and Alex runs the business part and content. But it is only the two of us. Instead of dividing and conquering the problem, while we do have our own direct teams, we don’t have separate team meetings. We have a single meeting every week, where all of our direct reports meet to problem-solve, strategize, and just keep the company running. We’re literally trying to synchronize.
One of the benefits of Spotify is that it’s a big company, but it’s small compared to the big tech companies. We can actually have all the decision and execution-makers in one room, because it’s only 18 people. That’s a competitive advantage. It’s not possible at Google or Amazon anymore. They would have hundreds of people in that meeting. That is hugely important when you have an escalation or something.
What usually happens in these distributed things is that someone in a meeting says, “I’m blocked on this,” to which someone else says, “Okay, but I need to go to X. Who owns that? We’ll take it offline.” Then days go by. We’re trying to make sure that once someone says, “I’m blocked by Sarah or Yon over here,” that Sarah or Yon are in the room and can say, “Okay, what can we do about that?” That’s part of how we’re trying to be different.
Then on decision-making itself, I have a big passion for a certain kind, which is sort of the Socratic debate. I love debating things. The mantra in Silicon Valley until recently has been that you move fast, break things, ship stuff, and data and code will decide arguments. I don’t subscribe that much to this. I think it has led to a bunch of stuff that wasn’t that good, and I think others are starting to feel the same way. I’ve found that you can actually reason your way through if you have the classical set of a diverse opinion group with really smart people — so the NASA idea. You can reason your way very far, even past huge problems that you would’ve run into.
We tend to reason quite a lot and use different frameworks for that. It’s the Charlie Munger quote that you should run anything through at least three frameworks. If they agree, there’s a good chance that you’re right, because any framework reduces dimensions. If you only use one, there’s a risk that your framework misses a dimension.
We do debate a lot, and that goes back to the whole speed thing. Debating and talking can feel very slow, but I often tell people that talk is cheap. So exactly for that reason, we should do a lot of it. It’s much cheaper than writing code and it’s much cheaper than shipping the wrong thing. This belief that A/B tests are cheap is completely wrong. It’s incredibly expensive to A/B test something.
Which we’ll get into, because you’ve been A/B testing this big redesign that I want to talk about. Last question on org stuff: if you and Alex have a disagreement about something super critical that affects the whole company, is Daniel the tiebreaker?
He is, technically. I mean, the benefit of this setup is that we’re not co-CEOs, we’re co-presidents. Daniel is the CEO. There is a tiebreaker, and the company is still founder-led, and I think that is critical. It is very hard for companies to be able to do big, hard things without a founder who can say, “Nope, we’re going to go left here.” It is hard.
He’s still the CEO, and that’s a big competitive advantage for this company. Most companies aren’t CEO-led or founder-led anymore in Silicon Valley. So yes, there is a tiebreaker in case that happens, but Alex and I have worked together for 10-plus years or something. So far that hasn’t happened. The idea is we debate a lot, and that can be heated.
Let’s get into the big news of today and what it means for Spotify as a product. I would say the biggest thing that people are going to notice and probably have strong reactions to is this new feed you showed off today at the event you all hosted. Take me into the decision-making process for what Daniel called, “The biggest change to Spotify since the introduction of mobile.” You’re basically incorporating this TikTok-like endless visual feed to replace the main tab of the app. Why do that?
I think about it this way. When we ask users what they love about Spotify, they say things like personalization and so forth, but if you drill down, what they really say is, “The more new things you show me and the more new things I find, the more I’m going to like this product.” It seems like the love for Spotify is very correlated with, and hopefully caused by, the amount of discoveries we drive. You ascribe value and love to the thing that helps you discover new things and have a better life. Discovery is kind of the lifeblood of Spotify.
If you take that lens, then you should expect us to try to do everything to improve discovery. The truth is, evolution has happened the last few years with these feeds that auto-play content. It has converged on the most effective way to evaluate a new piece of content. Not surprisingly, if you’re supposed to evaluate a piece of music by looking at cover art that may or may not be descriptive of the music, click through and start reading titles of songs that often have nothing to do with the music, you don’t even know the genre yet, and then click one of those titles and wait for one and a half minutes to get to the hook — that can’t be the best way to discover music. The best way to discover audio must be through audio.
I don’t think it’s an accident. The world didn’t just randomize into these auto-playing cards. It is evolution. It is the most effective way to quickly understand and evaluate lots of content. We have to respect those innovations. Spotify had a different type of feed, a two-dimensional feed, which was sort of state-of-the-art many years ago, but things change. They have to, and so does Spotify.
If you think about it from the point of “the algorithm” — as they call it, even though it’s not an algorithm, it’s a ton of different systems, but to anthropomorphize a little bit — you need a user interface that can very effectively see what a user likes. Imagine that you have our old homepage, where we would show cover art for a song, you would scroll past that, and the algorithm can’t even know if you saw it. Maybe you saw it; maybe you didn’t. Maybe you saw it, thought about it, evaluated it, and didn’t like it. It can’t know that, so it’s going to repeatedly show it again and again.
If you take one of these cards, when you see this, not only does the algorithm know you saw it, it even knows you heard it. It can know that if you listened to it for a while but then continued, you should never see it again. One way to think about it is like giving the algorithm glasses. The secret of why some of these products are so good at recommendations is not actually that they have better algorithms. It’s the same algorithms with a more efficient user interface.
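To make the difference in signal concrete, here is a rough Python sketch. It is only an illustration of the idea as described in the conversation: the `CardEvent` type, its field names, and the five-second threshold are all assumptions, not anything Spotify has published.

```python
from dataclasses import dataclass


@dataclass
class CardEvent:
    """One auto-playing feed card shown to a user (hypothetical shape)."""
    item_id: str
    seconds_played: float  # how long the preview played before the user moved on
    saved: bool            # whether the user saved the item for later


def interpret_card(event: CardEvent, min_listen_seconds: float = 5.0) -> str:
    """Turn a card event into a coarse feedback label.

    The threshold is an assumed value chosen for illustration.
    """
    if event.saved:
        return "positive"
    if event.seconds_played >= min_listen_seconds:
        # The user heard it for a while and still moved on: stop showing it.
        return "negative"
    # Too brief to say the user really evaluated it.
    return "unknown"


def interpret_cover_art_impression() -> str:
    """A static cover-art tile that was scrolled past carries no clear signal."""
    return "unknown"
```

In this sketch, a scroll past static cover art always comes back as "unknown", while an auto-playing card can yield a definite "positive" or "negative".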
Yes, and more signal for the company.
Yeah. So that’s the reason we’re doing it.
We’re also doing something very different, which I saw that you tweeted about today. I mentioned that the mechanisms are the same as many other companies’ because, so far, this is the best way that we know of to quickly evaluate lots of content. Why shouldn’t musicians get that, and podcasts, and audiobooks? Why just videos?
We’re doing something very different, where we’re not optimizing for time spent in the feed, but for how much you listen to or save for later. This may sound a little bit like a cop-out, but it’s actually true. You look at the incentives. These other companies make money from ads by people being in the feed. We actually don’t. First of all, we’re mostly a subscription business, so we make money from retention, and you get retention when people listen to these long things. Even in the free tier, we actually don’t make money in the feed. We make money when someone later listens to that song and there is an ad in between that and the next song.
Our incentives are not to keep you in the feed. Our incentives are to let you evaluate lots of content and put it in your library, so that when you get to that background moment, you have tons of things to listen to. That’s what we want to achieve. We would like you to quite quickly get through this feed because we don’t have a lot of foreground time in Spotify. We’re a background application.
When you open it and feel like you want to find new music, we want that to be incredibly easy. What I want to achieve is that after one of these sessions you feel like, “Oh my God, my library is full of stuff that I want to listen to now, while I’m driving or when I’m running, in the background.” It is a different optimization metric. Back to the tweet, isn’t it antithetical with an endless feed? The feed isn’t actually endless.
So you can get to the bottom of it?
Okay, that’s interesting. I’m glad you brought up incentives and monetization, because, yes, you don’t monetize the feed currently, but I have to guess that it’s coming. I mean, aren’t video ad dollars and the margins you get from ads against podcasts a huge factor in doing something like this? Eventually there will be interstitial or unskippable ads. This is the natural evolution of this format that we’ve seen across every other category. Is that not coming?
So first of all, if you’re a premium user, there aren’t ads in the music.
There won’t be any in the feed for a premium user?
No, not for music at all. You’re paying to not have ads there.
The thing we’re optimized for there is long-term retention. Just to stay on that point for one more second, we actually published some papers on this around machine learning. The traditional way that we recommend a song to you is impression to stream. You show an impression of something and see if the user clicked it. It’s what everyone did. It’s what we did. We published some papers where we’re starting to optimize for the impression and how much you listened down the line, like weeks later. So we have long-term engagement and retention instead of momentary.
It wasn’t clear to us if that would do anything differently. It could have been that the same item that is the most clickable in that moment is the item that creates the most retention. Turns out it’s not. In a huge percentage of cases, we start showing other things that have lower likelihood of click-through in the moment, but do create longer listening.
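A toy Python sketch shows how the two objectives can disagree. The item names and numbers below are invented purely for illustration, and the scoring functions are a simplification under those assumptions, not the method from the papers mentioned above.

```python
# Hypothetical candidates: probability of a click right now, and the minutes
# of listening expected over the following weeks if the user does click.
candidates = [
    {"item": "familiar_song", "p_click": 0.30, "later_minutes_if_clicked": 10.0},
    {"item": "new_podcast", "p_click": 0.10, "later_minutes_if_clicked": 120.0},
]


def impression_to_stream_score(candidate: dict) -> float:
    """Traditional objective: how likely is an immediate click?"""
    return candidate["p_click"]


def long_term_score(candidate: dict) -> float:
    """Long-term objective: expected later listening per impression."""
    return candidate["p_click"] * candidate["later_minutes_if_clicked"]


top_by_click = max(candidates, key=impression_to_stream_score)["item"]
top_by_long_term = max(candidates, key=long_term_score)["item"]

print(top_by_click)      # familiar_song
print(top_by_long_term)  # new_podcast
```

With these made-up numbers, the familiar song wins on immediate click-through, while the new podcast wins once later listening is counted.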
A theoretical example could be that in every moment, the best immediate recommendations to show you are the things that are at the top of your played songs or another episode of the podcast you’re always listening to. If you think about it, it’s probably good long-term to get you to a new show or a new Netflix series or something.
We’re trying to optimize for the long term largely because more than 90 percent of our revenue has historically been from subscription. That said, in podcasts, we do have advertising. You’re absolutely right that video advertising dollars are bigger than audio advertising dollars, which is why we’re very excited about the growth of videos. But those ads are actually in the podcasts, not in the feed.
So they won’t come to the feed?
We have no plans for that right now. What we’re optimizing for right now is to get you to see and listen to more of these podcasts.
There was an interesting stat mentioned today that 25 percent of artists on Spotify self-release; they’re not signed to any big label. There’s also a stat, I think from the 2021 Stream On, where you all said you want to be the home to 50 million creators by 2025. Is this a discovery mechanism for those people, for creators and for the indie artists, as a way to be on the homepage of Spotify in a way that they couldn’t have been previously?
There are a few factors here. New artists can be on big labels, on indies, or self-published. The general answer is that we want to give more creators the opportunity to be discovered. We want that. And that’s the natural evolution of the industry. It happened to text, it happened to pictures, and it happened to video. There are just going to be more creators because the friction of creation goes down. Other platforms have been traditionally better than us at breaking new artists, so we want to get better at that. We want to do more exploration of content that doesn’t have a lot of listeners.
There are two ways to do recommendations. One is to do a similarity recommendation, something that is very close to what you’re already listening to. It’s a safe recommendation, and you can do that by looking at past play data. If something is already popular and you haven’t heard it, it’s a good idea to recommend that to you, because it has a lot of proof of work.
There is another type of recommendation, which is to try to recommend something that doesn’t have a lot of listening yet. That’s where I think we can get a lot better than we have been. Part of that is the whole innovation in machine learning, large language models, and so forth, that can actually understand content before a lot of listeners listen to it. Previously, we had to wait. Content had to break organically before we felt that it was safe to amplify it. Now with these innovations in large language models, you can machine listen to the content and also safely classify it.
The big reason why we haven’t been that aggressive on what is traditionally called “explore-exploit” is because we didn’t feel comfortable exploring things that we didn’t understand. We’ve seen other companies end up in bad places, so we just held off. The solution to doing that safely until a year ago was to have 20,000, 40,000, or 50,000 moderators somewhere in the world. You can do it differently now with machine learning. We’re going to try to be a lot better at helping more creators of music, podcasts, and audiobooks find an audience because of this technology. We’re quite excited about that. We haven’t been traditionally very good at it, but we’re going to get better.
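For readers unfamiliar with the term, “explore-exploit” can be sketched with an epsilon-greedy rule in Python. This is a textbook technique used here as a stand-in, not a description of Spotify’s system. The catalog fields, the 100-play cutoff for "new" content, and the `cleared` flag (standing in for content that an automated classifier has judged safe to surface) are all assumptions.

```python
import random


def pick_recommendation(catalog: list, epsilon: float, rng: random.Random) -> dict:
    """Epsilon-greedy choice between a proven item and a new, cleared one.

    With probability `epsilon`, explore: pick a random low-play item that has
    passed the (hypothetical) automated safety check. Otherwise, exploit:
    pick the item with the most plays.
    """
    most_played = max(catalog, key=lambda item: item["plays"])
    new_and_cleared = [
        item for item in catalog
        if item["plays"] < 100 and item["cleared"]  # assumed cutoff for "new"
    ]
    if new_and_cleared and rng.random() < epsilon:
        return rng.choice(new_and_cleared)
    return most_played


catalog = [
    {"item": "established_hit", "plays": 5_000_000, "cleared": True},
    {"item": "new_indie_track", "plays": 12, "cleared": True},
    {"item": "unreviewed_upload", "plays": 3, "cleared": False},
]

rng = random.Random(0)
always_exploit = pick_recommendation(catalog, epsilon=0.0, rng=rng)
always_explore = pick_recommendation(catalog, epsilon=1.0, rng=rng)
```

In this sketch, the unreviewed upload is never surfaced by exploration, which mirrors the point about only exploring content that is already understood.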
You mentioned competitors have been better at new artist discovery. We’re obviously talking about TikTok here, and I know ByteDance has Resso, their own streaming service that more directly competes with you all. I guess I’m curious if the TikTok discovery phenomenon is a competitive threat to you all. I would’ve thought that it would be symbiotic, because you find an artist through TikTok but go to listen to them on Spotify. You’re trying to have both of those experiences now in one app. Is that fair?
You’re using the exact right word. The way we thought about it internally is that TikTok and YouTube are actually symbiotic. Most of the foreground discovery has happened there, but fortunately for us, so far and still, we get almost all the background listening from that discovery. It trickles down to us.
So it has been and is symbiotic, but the answer is yes. We feel now is the time where you should also be able to discover on Spotify. You just have to bide your time back to being an efficient company and doing the right thing at the right time. What we’re saying now is that we still think it’s going to happen. I hope it’s going to keep happening, but it makes sense that you could actually discover music on Spotify as well, right?
We’re starting to take that leap now, but as I just said, it is a big leap. You need to invest a lot more in machine learning and in safety. There’s a lot of cost to being one of these platforms, and we decided to take that leap about a year and a half ago. What you’re seeing now is really us saying, “Well, here is the platform where new artists can break. You don’t have to already be big to get an audience here.” It was very deliberate and quite a big bet to take on.
The other thing is the reaction this is going to get from users. You have been A/B testing this in the wild for the last several weeks. I would love to hear what you’ve learned from that test and how that gave you the conviction to launch this to everyone. Anecdotally, there’s a lot of heated reaction to this. I don’t know if you pay attention to the posts on social media about the tests of this. Have you seen some of these tweets?
We do qualitative and quantitative user testing in-house. It’s different for different features, but we have done big user interface changes before on both the desktop app and the mobile app. We asked some of our peer companies, some of the really big tech companies, about their experience with it before we did it last time. I got one piece of advice. I’m like, “How do you change the entire user interface for hundreds of millions of users?” The answer was, “Don’t. Don’t even try. People are going to be very upset.”
It doesn’t really matter, even if it is better for you, if I come in and say, “Well, I’m just going to rearrange your desktop for you, where you have a bunch of habits. You’re not going to like it, but I know better than you.” No one likes that. So it is very hard to measure.
If you add a new feature, people are almost always happy, like the AI DJ, for example. No one says, “I’m disappointed with it,” because it’s additional. But when you change a habit, it is very hard. You have to separate your expected user reactions to changing a habit, which no one likes, including myself, even though it may be much better for you down the line. You can fool yourself by saying, “No, people are upset, but it’s just because they don’t like change.” Maybe it’s actually bad. So you need to do quantitative testing to understand if it actually creates more discovery and more listening. That gives you some confidence for what are tough reactions.
I also find that for one reason or another, right now in 2023, people are very skeptical of new things. I saw this with the AI DJ. When we announced it, I saw a lot of tweets like, “Oh, this is going to be crap for this and that reason.” Then as we rolled it out, that started shifting and people were saying it’s amazing. For one reason or another, people are quite skeptical before they try it.
Very. I pulled a few random tweets and they’re like, “I’ll sail my laptop and phone out the window if they TikTok-ify Spotify.” “Can we stop making functional apps worse?” “I hope the dude who came up with this idea is thrown into a lion pit.” Then one was just, “I hate the future. I hate the future. I hate the future,” over and over and over.
The good thing is we don’t have lion pits in Sweden.
That’s right. You mentioned the testing, and I have to imagine that even before the testing, this was a controversial idea to get through internally. It probably required a lot of debate and there were probably people inside Spotify who thought this was potentially a bad direction for the company. Can you walk me through the internal debate? What did you guys see in the early testing that gave you the conviction to make such a huge change when you are getting the reactions I just read?
It’s a few things. So first of all, we actually haven’t had the internal problem. I think internally, people are very excited, because they’re using the product. People are very excited about finding new podcasts, finding new audiobooks, and finding new music. One of the criticisms we do see on Twitter is something that says, “Spotify is amazing and fantastic, but I’m a little bit trapped in my taste bubble. I want help to break out.” As I said, we haven’t been very good at that, and it’s not because we didn’t know that we wanted to. It’s because it requires a different user interface.
For example, I love reggaeton, which is not traditional in Sweden. If we were to introduce more Swedes to reggaeton, it’s hard to throw that into your existing playlist because it has nothing to do with it. You need a different user interface where you can evaluate things more quickly and where you can be wrong. In a feed like this, you can say, “Nope, nope, nope.” We can’t be wrong that much if we insert the song into a playlist while you’re out running. Three duds and you’re going to be like, “This is crap.” So it requires a different user interface to be able to let you explore new music genres. It just needs to be more efficient.
Part of the reason why we’re doing this is to address the concern that it needs to be easier to expand your taste and try new things. We thought about that quite hard, and it’s not an algorithmic problem. The algorithms were there, but it’s a user interface problem. So we haven’t had the internal problems around that.
One thing that we do is we try to reason quite a lot through first principles. For example, what is the real user problem? It’s that people get trapped in their taste bubbles. We’re really good at having your taste profile and saying, “There’s an artist in that taste that you missed.” We’re not that good at saying, “Maybe you’d like a bit of reggaeton, which you’ve never listened to according to our data.”
That’s the problem. Then you backtrack. What do we need to solve that problem? Is it an algorithmic challenge? No. To explore-exploit and try new things has been around forever. It is partially a safety problem and it is partially a UI problem. That’s why we built this.
The other first principles approach we’ve had is, if you look at the world and where it’s going, it seems that everything is going to be about machine learning and personalization. As I said initially, I think if you’re going to be competitive, you need to have a user interface that can better understand if the user actually liked what you showed them.
These efficient things that show tons and tons of small cover arts that don’t actually have anything to do with the audio behind it are just incredibly inefficient. You’re going to annoy the user by showing the same podcast cover art 50 times until they say, “Yeah, I saw it. I don’t even know what it means.” So from first principles, we actually don’t think we have a lot of choice. I think companies that don’t have an efficient user interface for a machine-learning world are not going to be able to leverage machine learning. First principles thinking helps us get the conviction to do these things.
I think it’s inevitable that it’s going to be challenging at this scale. Then you do a lot of testing to make sure that it’s actually good, and you’re humble, listen to feedback, and try to understand what the challenges are.
One of the challenges that we foresee and try to address is that one difference we have from many of these other platforms is that they are only about new content. You actually never expect to see the same thing. This is very different from music. Eighty percent of the use case is to go back to the same playlist, the same songs, or the same podcast. Music, specifically, has an enormous amount of repeat listening. One problem could be to take a user interface made for 100 percent new discoveries and apply it to what is only 10 percent, 20 percent new discoveries.
If you look at the actual user interface, almost the entire first screen is what we call shortcuts, which is only your favorites. One of the challenges we have that most apps don’t have — maybe YouTube has it, but certainly not TikTok or Instagram — is that on Spotify, you could be in the middle of at least five things. You could be in the middle of three podcasts, one audiobook, and a playlist. How do you keep state of all of those things? If you look at the app, the home feed and all the sub feeds start with this podcast section, where you can see if you’re in the middle of this episode. We see this metrics-wise. What are the things that you need to have absolutely front and center? Nothing starts auto-playing when you open Spotify. You have to actively scroll to start discovering.
If you for some reason say, “I hate discovering new music. I don’t want to try new music, and I don’t want to try new podcasts.” You can open Spotify, and it’s going to be exactly where it was. The shortcuts are on top and all your relevant sessions are even closer. Before, you had to go to the library and scroll a few times to continue your session. We try to make sure that it’s easier than ever to keep playing your favorites, because that is our main use case.
Then when you get bored, discovery should be one swipe away. You shouldn’t have to click from cover art to a playlist page to a song and then swipe to the hook to evaluate a song. This is how we’re trying to address these, because they are valid concerns. The fears are valid. Our job as product people is to listen and try to alleviate them.
At Amazon, they have this framework for decisions that can be divided into two categories: it’s a one-way or a two-way door. Is this a one-way? This sounds like a one-way, like there’s no coming back from this. It sounds like you guys have decided this is what the app needs, where the app needs to go for the business.
I don’t think it’s a two-way door to say, “No, what we did in 2008 was the end of history. That was the final innovation. We figured it out.” I don’t think that’s going to happen to any company. This is part of the reason why companies fail. You did something good, it worked, and then the world moves on. Then for one reason or another, it’s often someone else that figures out the new thing. You have this traditional challenge of seeing this new thing happening, and technology evolves. It should happen. How do you make sure that you don’t sit there and just watch someone else invent the future instead?
I don’t think it’s a two-way door for anyone to stay out of innovating for a machine-learning world. Like I said, the shortcut section on top is exactly the same that we have. We can choose to have more of those items if we see that people feel like, “No, now it’s too hard to get to my favorite song.” We just increase the amount of shortcuts until it’s easier than ever. So it’s very adjustable. The notion that we need to get better at discovery and you should discover audio by actually hearing the audio instead of looking at pictures and reading text, that I think we have very high conviction around.
We’ve talked a lot about audio. There were a lot of podcasting announcements today, and podcasts were shown prominently in the feed video as well. This was kind of a renewed push into video for you all. Is there a concern there at all? Were you guys thinking about how the new homepage could optimize for video as well? Could this put audio-only creators and podcasters at a disadvantage if Spotify for some reason says, “We want video because it has higher engagement and could potentially lead to more advertising”? How do you balance the formats there? I think there will be a natural concern that Spotify is showing off all this new video podcasting stuff. Does that mean if I’m an audio podcaster, I’m not going to get the same level of distribution?
If you look at the announcement today, we show a lot of video because it is something that is happening right now. It’s something we’re excited about, and we talked about it a lot today, but it is still the case that the majority of podcasting is audio. I think there are distinct formats of podcasting, which are supposed to be audio. Then there are some forms of podcasting that benefit greatly from having video because it’s a face-to-face discussion.
Like you mentioned, we’ve done previous pushes into video, which were more from us wanting to get there. That actually didn’t work, so we kind of backed out of it. Then what happened was we started getting podcasts with video onto Spotify, Joe Rogan for example. We were kind of like, “Okay, let’s enable it then I guess, but we know video isn’t really our thing.”
We looked at the metrics and saw that people foregrounded this thing when interesting things were said. They start looking at smoking weed and these things. Back to listening to users, we’re like, “Okay, if they want it and they use it, let’s do it.” As Alex Cooper said today, we do see episodes with video get more engagement, but that doesn’t mean it’s for everyone or that everyone has to do it. For us, we kind of recommend what people want to listen to, and we try to understand what’s performing the best. From an advertising point of view, we sell audiences with video and audio.
I don’t think it’s going to change anything for audio creators, because audio is a unique format. Like I said about the foreground feeds, this is actually for those rather short moments for us in percentages. We have quite small amounts of foreground time and huge amounts of background time, the opposite of our competitors. We like it that way. It’s better to complement your competitors than be a direct replacement.
So you’re not trying to optimize for foreground activity?
No. We’re trying to use the foreground time to get you to find much more background content. Then what we see with video is that people actually foreground the podcast videos while they’re listening to see stuff that happens. It seems to improve the listening experience. I think people get stuck on the video, but my favorite example is that there are different types of video. You have the long-form conversation, like Lex Fridman or something, with video. It’s interesting sometimes to see what’s happening with video, but most of the time, watching the talking heads is not important. But that video makes a lot of sense as background audio.
Then you have something else. Like, I’m passionate about Brazilian jiu jitsu. Most of those videos don’t even have commentary. If you watch them as audio, it would just be people panting. It makes no sense. I don’t think it’s video versus audio. I think it’s conversations versus whatever the other is. So conversations seem to benefit from video sometimes, but it is a voluntary thing. We’re not in the business of trying to get foreground video on the platform, if that makes sense, but it’s about the background video.
The podcast auto-play feature you announced, where you’re going to start auto-playing podcasts after one ends, are you pulling that from a listener’s other shows or are you suggesting new shows to them?
We’re also suggesting. It is an opportunity to help you find and expand, but it pulls very heavily statistically. What I said in the presentation is the most relevant. Most of the time, you will probably hear the episode you would’ve clicked. That’s what we’re trying to optimize for. It’s like, “What would you have clicked?” If you have an unlistened episode from a podcast that was listened to, that is the best thing to play. But we could potentially suggest things to you.
Last question on podcasting and then I have one quick one to end on. Are you ever going to make it impossible for me to skip ads in podcasts?
Am I going to make it impossible for you to skip ads?
No. That’s not something that we have planned for at all. In fact, what I think we see in the industry has been the opposite. Ads get more and more skippable because it’s a signal for the relevance of the ads. So no, it’s not anything that I have planned for you.
Okay. Getting out of podcasting, I wanted to ask you about HiFi. I would say it’s actually unusual for a company to announce something and then just ignore it for two years. What happened to this premium HiFi tier that you all talked about in 2021?
We announced it, but then the industry changed for a bunch of reasons. We are going to do it, but we’re going to do it in a way where it makes sense for us and for our listeners. The industry changed and we had to adapt.
How did the industry change?
In various ways. We can go into more details later, but it changed.
I mean, be honest. Was it because it was cost-prohibitive from a margin perspective with the labels? Was that the reason?
We want to do it in a way where it works for us from a cost perspective as well. I’m not allowed to comment on our label agreements, nor on what other players in the industry did, for obvious reasons.
Well, everyone knows that the labels charge more for these premium tiers. I mean, I feel like that is understood at this point in the industry.
Yes. It’s back to what we did today. There is a risk that you unnecessarily commoditize yourself if you just do what everyone else does, and just try to do it the cheapest or the fastest. We want to do something where we thought it through. I don’t want to say more right now.
Okay. You don’t want to give me an expected date or anything?
No. Rather than saying, “Well, we’ll see if we do it,” it’s, “No, we’re going to do it.”
You’re going to do it. Okay.
I don’t want to give it away because we want to try to do something that is our own and unique.
Are you a believer in spatial audio? Is Spotify going to do spatial?
I won’t comment on it right now. I want to save that for later.
Do you personally like spatial?
We’ll do another episode.
Okay, we’ll do another episode.
You know this. Audio quality is like the most sensitive topic.
It is, especially for Verge listeners.
You have believers on all sides of the spectrum who love and hate the same things. It is another one of those tricky things where you can’t please everyone, but we’re going to try.
Okay, so there’s a Spotify HiFi lossless-type experience coming at some point?
Something is coming at some point. Yes.
All right. We can end on a little bit of an optimistic note there, although I wish you would share more, but that’s okay. We’ll do this again.
Yes. We’ll save something for later.
All right, Gustav. Thanks so much for joining Decoder.
Thank you for having me. It was great to be here.
Decoder with Nilay Patel /
A podcast from The Verge about big ideas and other problems.