“Software is eating the world” (Marc Andreessen): a case study in product management

Spotify: how software ate audio

This is the story of what has worked for Spotify in its journey of delivering the perfect listening experience and more…

Nima Torabi

--

In 2006, Daniel Ek saw a gap in the market, took advantage of changed music consumption habits, and started Spotify. Since then, Spotify has taken advantage of Apple’s innovations, ridden the megatrends of mobile, and used AI to deliver the “perfect listening experience”.

To address its weak unit economics, Spotify is building an interactive audio platform and plans to take audio content curation to never-before-seen realms. This is the story of what has worked for Spotify in its journey of delivering the perfect listening experience and more…

Photo by Haithem Ferdi on Unsplash

1. Napster and changing consumption habits

While Spotify officially launched in 2008, its origins go back to 1999, when Napster launched. The MP3 file format was first released in 1991, and as audio turned into digital files, people increasingly wanted to share those files. In 1999, eight years after the format’s release, MP3 file-sharing was still not mainstream. Then Sean Parker and Shawn Fanning came along with Napster, an interface that acted solely as a clearinghouse for sharing music and other audio content as MP3 files.

While Napster did not intend to be a music piracy platform, most of the music that people listen to and want access to is owned by a third party (i.e. the record labels and the artists that produced it). In essence, the problem Napster was solving was the record labels’ business model, which restricted easy access to music and audio content at prices that consumers felt were fair.

The US recorded music revenues broken by format type and the influence of Napster (Source: RIAA)

The music experience in the late 1990s (before Napster)

Before Napster, consumers needed to go through these steps to listen to music:

  • Go to a physical retail store (e.g. Tower Records, Sam Goody, etc.)
  • Preview a track using the headphones provided at the location
  • Purchase the full album to listen to a single track of preference
  • CDs weren’t cheap back then, due to the monopolistic nature of the industry at the time
US recorded music revenues in CD format — figures in billions of USD (Source: RIAA)
US recorded music sales volume in CD format — figures in millions of units (Source: RIAA)
US recorded music unit cost of CD for end-users — figures in USD (Source: RIAA)

Simply put, exploring and discovering music was painful and expensive, and consumers weren’t encouraged to explore or broaden their music consumption under those market structures, where music label companies were capturing a large share of the industry royalties.

With the prevalence of MP3 files and growing internet access, Napster took all of the friction out of the process of searching for music tracks, finding users who had them, and downloading them from their computers. In effect, Napster had created a virtual library of content in a decentralized peer-to-peer cloud by solving content discovery and delivery problems.

Consumers were quite receptive to Napster's virtual library of content in a decentralized peer-to-peer cloud — Image: Napster’s home-based users — US — figures in thousands (000) (Source: link)

The Metallica vs. Napster lawsuit

In early 2000, Metallica discovered that “I Disappear”, a work-in-progress track, had leaked through Napster, and subsequently several radio stations in America were playing the song. Metallica felt violated and in April 2000, officially filed charges accusing Napster of “copyright infringements, unlawful use of digital audio interface devices, and violations of the Racketeering Influenced & Corrupt Organizations Act.” Furthermore, Metallica sought $100,000 per illegally downloaded song in damages and provided a complete list of the 335,435 Napster users who had illegally downloaded Metallica’s songs.

In response, Napster banned the users who had shared and downloaded the song, showing them a pop-up window that read “Banned by Metallica”. This damaged the brand of a band that had prided itself on being fan-friendly for decades and that became a symbol of “corporate greed” overnight. While Metallica had made the case about Napster, Napster had made it about Metallica and its fans. The dispute turned into a national conversation in the US about the future of music, with Lars Ulrich testifying before the US Senate Judiciary Committee in July 2000.

The Future of Digital Music (Napster vs. Metallica) Senate hearing (Source: CSPAN / YouTube)

Eventually, in 2001, 2 years after Napster’s launch, a circuit court in California ruled in favor of Metallica and issued an injunction against Napster to delete every single Metallica track from its users’ libraries, a task that was, by definition, impossible on a peer-to-peer network. Instead, Napster voluntarily ended service and filed for bankruptcy.

However, the genie was already out of the bottle, and the consumers knew what they wanted, with Napster showing them the way. In a world where businesses don’t provide consumers with the value proposition that they want, with the right product, through the right distribution channels, and in an optimized user experience fashion, they will eventually lose.

Meanwhile in Sweden (Spotify’s home)

By the mid-2000s, Napster was gone but Sweden was a music piracy haven. Even one of Sweden’s biggest evening papers ran a guide on how to download music for free. Consequently, recorded music revenues in Sweden dropped by 63% between 2000 and 2008.

Recorded music revenues in Sweden from 1969 to 2012 — figures in Millions of SEK and adjusted for CPI in Jan 2013 (Source: link)

There were two major drivers of music piracy in Sweden:

Broadband subscriptions per 100 people, 1998 to 2019 — Sweden vs. the US vs global average (source: link)
  1. Internet access: government-funded programs helped families buy subsidized PCs, so the kids of the day (today’s millennials) used computers extensively for programming and entertainment, and many became hackers. Furthermore, the government had laid miles and miles of cable, bringing high-bandwidth, low-latency internet to people across the country. These investments made Sweden not only an early hub for Internet companies but also the world’s hotspot for piracy.
  2. Consumer attitudes: the general public believed it had a democratic right to access all sorts of digital information and content, and in response, politicians aimed to make information and content available to anyone. For example, the Pirate Party, an entire political party dedicated to “liberating” content, got started in Sweden.

This made life quite difficult for new streaming startups in Sweden. They had to introduce entirely new user behaviors and habits, and figure out how to sell regular people, music pirates, and CD buyers alike on the radical idea that they didn’t need to own music anymore to enjoy it.

Today, streaming (i.e. access to content vs. ownership) is so ubiquitous that it’s almost impossible to explain the world before it, but in the 2000s, the ability to play music on demand without owning it was hard for people to wrap their heads around, no matter how they got their music, legally or not.

It took music streaming a long time to gain consumer acceptance and truly take off — Source: Statista

Competitive forces in the media industry

Access, content, and platform shape the pillars for strategic innovation and competition in the media industry

Three pillars drive the cyclicality of competition in the media and entertainment services industry:

  1. Distribution and access: generally kicked off by innovation in how content is delivered to the consumer and priced. This means that media companies enable access to content in a way that wasn’t previously possible
  2. Content differentiation: following innovations in distribution as the market players catch up, there’s always a substantial shift in content with competition shifting towards creating unique content
  3. Platform strategy: as media companies innovate their distribution and content strategies, they become more than just a business that delivers a show, a title, or a product, rather something more expansive and engaging, even allowing users to help the business create content

Whenever a change in access occurs, the leadership positions of media companies shift. Business models change with it, and so does monetization, and the market consequently grows or contracts under the new business model that becomes the norm.

In general, the greater the distribution innovation, the greater the change in monetization and content differentiation, and despite some potential short-term downturns, the greater the growth as well — that’s just how the current global financial structure is modeled: rewarding human disruptions with exponential compounding monetary rewards.

Global recorded music industry revenues, 2001–2020, figures in billions of USD (Source: IFPI) — the rise of streaming due to MP3 penetration and decline of physical assets for music distribution

In the case of the music industry, before MP3s and streaming, the marketplace had operated in more or less the same fashion for a very long time, with CDs replacing cassette tapes that had in turn replaced vinyl. Once music became a file, it was infinitely copyable and could be distributed quickly, a huge distribution innovation that led to the rise of streaming services in the following decade.

The state of the industry in the mid-2000s

In summary, with the advent of MP3 and piracy, consumers’ access to music had changed forever. However, there was no business model to support it. This is where streaming, and Spotify, comes into the picture.

2. Bold ambitions: launch of the desktop client

The story of Steve Jobs and iTunes

At the height of the tensions between record labels and artists on one side and file-sharing networks on the other, Steve Jobs understood that the rights holders’ key shortcoming was a poor grasp of the internet and technology.

While music was not one of Apple’s active industries, as music became digital in the late 1990s, Apple launched iTunes to help consumers discover, store, organize, and play digital music. It made it simple for listeners to rip CDs onto their Macs, and easy to consume and share music without paying for it. A few months after introducing iTunes, Steve Jobs introduced the iPod, completing the software and hardware needed to enter the music industry in the new digital age, somewhat ‘legally’.

Rip. Mix. Burn. iTunes Commercial, Ad by Apple, 2001

As iTunes and the iPod gained market share, Steve Jobs wanted the rights holders to sell their content through Apple’s digital stores. At first, the record labels thought they could do it themselves, without a technology intermediary. But their efforts failed for three major reasons:

  • Lack of technological know-how to develop amazing digital experiences
  • Inability to work together and share content on a single marketplace to ease discovery
  • The push to sell full albums, which consumers at the time were not willing to pay for, rather than singles

In mid-2003, Steve Jobs introduced the iTunes Music Store. Apple sold 1 million single tracks in the first week and 10 million in the first month, and later in the year, as iTunes was introduced on Microsoft Windows, legally downloaded digital music took off. By the end of 2004, Apple had sold 70 million single tracks.

While Apple did not make much money selling singles (30 cents per sale), it made its profits by selling iPods. Apple held this monopoly over legal digital music distribution for quite some time, which gave the record labels another headache after Napster. Furthermore, iTunes still hadn’t created much value for consumers at the time: imagine building your current music playlist at 99 cents per track on iTunes. Even today, that would add up to a large sum for a lot of consumers.
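To put the per-track pricing in perspective, here is a minimal back-of-the-envelope sketch. The 99-cent price comes from the text; the library sizes and the function name `library_cost_usd` are hypothetical and chosen only for illustration.

```python
# Back-of-the-envelope cost of building a music library one single at a time.
# PRICE_PER_TRACK_CENTS comes from the article; the library sizes below are
# hypothetical values picked for illustration.

PRICE_PER_TRACK_CENTS = 99


def library_cost_usd(track_count: int) -> float:
    """Return the cost in USD of buying `track_count` singles."""
    if track_count < 0:
        raise ValueError("track_count must be non-negative")
    return track_count * PRICE_PER_TRACK_CENTS / 100


if __name__ == "__main__":
    for tracks in (100, 1_000, 5_000):
        print(f"{tracks:>5} tracks -> ${library_cost_usd(tracks):,.2f}")
```

Even a modest library of a thousand tracks lands close to a thousand dollars under this pricing, which is the gap an access-based model could exploit.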

Global download digital music revenues, 2001–2020, figures in billions of USD (Sources: IFPI & CNN)

Enter Daniel Ek and Martin Lorentzon

In 2006, Daniel Ek and Martin Lorentzon set out to disrupt the music industry, with the core belief that they could make a product that had a better experience than music piracy. Even though music piracy was accessible and free, it was also tedious, slow, and difficult to use.

Consumer challenges with music piracy

For those who remember the Napster, Kazaa, and BitTorrent days, the consumer journey with music piracy fell short of listeners’ needs in discovery, latency, and overall reliability:

  • Discovery: first, one needed to go to a pirate site, search for a song, and then download it, which could take many minutes or even hours as it needed peers to be online to start seeding files
  • Listening experience: once it was on your hard drive, it could still take several seconds for the music to start playing because of slow-spinning hard drives at the time
  • Quality assurance: after downloading tracks, one could discover that it was named incorrectly and was a different track, or was poor in audio quality, or it may have been a virus in disguise

The initial solution

Daniel Ek and Martin Lorentzon hypothesized that, with music piracy around, the mass market would not be willing to pay to listen to music. But if the product was good enough, listeners might be willing to listen to ads, just as they did on the radio, and that could generate enough revenue to pay record labels and other rights holders.

The founders’ initial hypothesis was to find a way to give listeners “the perfect listening session”, all for free: a product where consumers search for any song ever recorded, press play, and hear it without any delay. At the time, this combined three ideas: discovering and downloading all the world’s music, as on P2P download platforms; paying nothing or very little in exchange for listening to or watching ads, as on linear broadcast TV and radio; and the user experience of iTunes and the iPod, so that it would feel like the user had all the world’s music on their hard drive.

Embracing peer-to-peer technology to scale

In 2006, most users were still on dial-up, broadband was quite expensive, and simple websites took 10–20 seconds to load on desktops. Spotify first experimented with a client-server infrastructure model, but it fell short of the desired listening experience and was scrapped.

Spotify realized that to succeed, it needed to use peer-to-peer technology and parallel computing to deliver the “perfect listening experience” to its users — and that’s when it recruited Ludde Strigeus, founder of uTorrent, the world’s second most popular BitTorrent client ever, with 150 million users around the world.

By 2006, peer-to-peer networking had evolved from Napster’s model of sharing MP3s between two people, into torrents. Torrents have one key advantage over earlier forms of piracy:

  • Instead of connecting a peer directly to another user, torrents break up files into tiny little pieces, and one simultaneously downloads many fractions of the same file from other users who are close to them on the network, delivering a shorter download period
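The piece-based idea above can be sketched in a few lines. This is a toy illustration, not the BitTorrent protocol: the `Peer` class, the tiny `PIECE_SIZE`, and the rotation over peers are assumptions made for readability, and the pieces are fetched in a plain loop rather than concurrently as a real client would.

```python
import hashlib
from typing import Dict, List, Optional

PIECE_SIZE = 4  # bytes; tiny on purpose so the example stays readable


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Cut a file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hash(piece: bytes) -> str:
    """Fingerprint used to check that a received piece is the expected one."""
    return hashlib.sha1(piece).hexdigest()


class Peer:
    """Hypothetical peer that holds only some pieces of the file."""

    def __init__(self, name: str, pieces: Dict[int, bytes]) -> None:
        self.name = name
        self.pieces = pieces

    def get_piece(self, index: int) -> Optional[bytes]:
        return self.pieces.get(index)


def download(expected_hashes: List[str], peers: List[Peer]) -> bytes:
    """Assemble a file by collecting each piece from any peer that has a valid copy."""
    if not peers:
        raise ValueError("at least one peer is required")
    assembled = []
    for index, expected in enumerate(expected_hashes):
        # Rotate the starting peer so requests are spread across the swarm.
        start = index % len(peers)
        for peer in peers[start:] + peers[:start]:
            piece = peer.get_piece(index)
            if piece is not None and piece_hash(piece) == expected:
                assembled.append(piece)
                break
        else:
            raise LookupError(f"no peer has a valid copy of piece {index}")
    return b"".join(assembled)
```

Because every piece is verified against a known hash, a corrupt or mislabeled piece from one peer is simply skipped in favor of another peer's copy.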

Ludde realized early on that the browser wasn’t a viable way to deliver the desired experience. Back then, most of the web consisted of browser-based “thin clients”: web pages or Flash-based clients that ran in internet browsers. While there were a few downloaded “fat clients”, such as Skype and messengers like AOL Instant Messenger or Yahoo! Messenger, in the Web 1.0 era there was very little interactivity in the browser. If Spotify were to live in the browser, it could only run as slowly as the rest of the browser-based internet, which is where most of the music piracy websites (the competition) lived.

To play music instantly over the internet, i.e. to stream it, Spotify ignored the standards of the time and developed its own protocols and end-to-end streaming infrastructure, on both the server and client sides. Spotify’s peer-to-peer desktop client network was inspired by BitTorrent: it used peers for the parts that were not latency-critical and Spotify’s own servers to keep latency down. In practice, when you started playing a song, the first 30 seconds were downloaded directly from Spotify’s servers to the client cache, giving listeners the feeling of a frictionless experience, and the rest was downloaded from the peer network to help Spotify scale. Meanwhile, the track was encoded and stored in the listener’s local cache for future listening. Additionally, peer-to-peer technology saved Spotify bandwidth costs, which made the company economical to run and attractive for investors to back.
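The hybrid delivery described above can be sketched as a simple source-selection rule. This is a minimal illustration, not Spotify's actual protocol: the function `choose_source`, its parameters, and the fallback to the server when no peer has the data are assumptions. Only the 30-second server window comes from the text.

```python
from typing import Set, Tuple

SERVER_WINDOW_SECONDS = 30  # from the article: the opening seconds come from the server

# A cached chunk is identified by (track_id, second_offset); hypothetical layout.
Cache = Set[Tuple[str, int]]


def choose_source(track_id: str, second_offset: int,
                  cache: Cache, peers_with_chunk: int) -> str:
    """Decide where the audio at `second_offset` of a track should come from."""
    if (track_id, second_offset) in cache:
        return "cache"   # already stored locally from an earlier play
    if second_offset < SERVER_WINDOW_SECONDS:
        return "server"  # latency-critical start of the track
    if peers_with_chunk > 0:
        return "peer"    # the rest is fetched from the peer network
    return "server"      # assumption: fall back to the server if no peer has it


if __name__ == "__main__":
    cache: Cache = set()
    for offset in (0, 29, 30, 120):
        print(offset, choose_source("track-1", offset, cache, peers_with_chunk=3))
```

The design choice this captures is that the server pays only for the part of the track where delay is noticeable, while peers absorb the bulk of the bandwidth.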

Custom made protocols

To increase content delivery speed, Spotify built custom protocols that were much more efficient than HTTPS and TCP, which had start-up delays. The desktop protocol code in the client and backend was designed to keep latency down, giving Spotify control over the listening experience.

Building a proprietary full-stack content distribution solution with its own streaming protocols allowed Spotify to combine the scalability of peer-to-peer with the speed of client-server technology, and to keep control over optimizing the solution going forward. For example, with the new protocols, Spotify predicted what the listener would play next and downloaded those tracks beforehand. Spotify also used the open-source Ogg Vorbis codec to reduce latency and improve overall music quality.

To beat the competition, i.e. music piracy, Spotify took the opposite approach to using off-the-shelf code and libraries: it built proprietary protocols, a full end-to-end distribution solution, and a downloadable client instead of a thin browser-based one, all to give listeners a better experience. This was necessary to detach customers from free piracy networks and form new listening habits. Spotify needed to exceed expectations if it were to find product-market fit.

Spotify’s ambition meant attracting better talent

As Spotify set out to deliver on its ambition of delivering a great listening experience to users through peer-to-peer technology and proprietary protocols, it naturally became a talent magnet for ambitious and skilled individuals.

Highly ambitious and challenging goals attracted more ambitious and talented people, and this virtuous cycle has helped Spotify overcome many obstacles over the last 15–16 years.

Pitching the solution to music rights holders

Before it could launch, Spotify had to make deals with the labels and artists, an industry that was afraid peer-to-peer technology would decimate its livelihood and was suspicious of tech companies and their intentions. The record labels had lost 50% of their revenues and were cutting staff every quarter in 2008, but they were also excited about the potential and willing to hear pitches that made sense to them.

Daniel Ek of Spotify talks about getting the company’s first music licensing deals — This Week in Startups

Between Spotify’s amazing desktop client and the understandable ad-supported business model, the Swedish record labels decided they had nothing to lose and signed deals with Spotify one after the other. After two years of development, Spotify launched in Sweden in 2008, and was an instant success because it was intuitive to work with, faster, and overall a much better product than music piracy.

An amazing solution, but also expensive

While this story is an amazing entrepreneurial endeavor in solving a challenging problem, it took the founders 2 years before they had a finely tuned desktop client to launch publicly with a hypothetical business model that made money from ads.

Before the launch, Daniel Ek had founded a company called Advertigo, which Martin (the other co-founder) acquired, and together they took one of Europe’s largest ad businesses, Tradedoubler, public. This put them in the enviable position of having time on their hands and capital to invest.

In the case of Daniel Ek and Martin Lorentzon, the stars had aligned, and with timing and luck on their side they were able to deliver Spotify and disrupt the music industry. However, I think this ambitious project would have been a stretch at the time for new founders who had to raise capital from scratch.

Spotify’s funding timeline — nothing until 2008 and after the initial desktop client launch — Source: DealRoom

3. Going mobile

With the launch of the Spotify desktop app in 2008 in Sweden, stakeholders were seeing positive returns in the digital music space:

  • Record labels and musicians were earning money
  • Listeners were happy because the desktop app allowed them to listen to any song (i.e. easy discovery and selection), for free, with a faster and better experience compared to piracy
  • Spotify had captured some parts of the demand and was making money from an ad-supported business model

However, Spotify had only become a small part of the listener’s journey: the discovery and selection part. The average consumer was using Spotify’s desktop app to find and share playlists but still using, for example, The Pirate Bay to download songs as MP3s and manually move them to iTunes and iPods to listen offline, which was where the vast majority of listening was happening.

Steve Jobs Introducing The iPhone At MacWorld 2007

Spotify could not do much about the hardware part of listening (i.e. the iPod) until Apple launched the iPhone in 2007 and opened the App Store in 2008. With the iPhone, the iPod went from a separate piece of mobile hardware to a piece of software that lived on smartphones. This meant that Spotify could finally offer its music software on the mobile devices where all the listening happened.

The problem with going mobile in 2008

Spotify’s value proposition of playing any song faster than piracy, for free, in exchange for listening to ads was hard to deliver in an offline mobile app. Because you can’t click on ads when you’re offline, the mobile app needed to be a paid product, which ran counter to Spotify’s core value proposition and to its customers’ expectations at the time.

“Although a high proportion of the population uses the Internet for information and communication, online content providers still struggle with the question of how to manage their product in a more profitable way. Even though different payment models have been proposed by researchers, it is still very difficult to overcome the “content for free” mentality and to introduce paid content models to the Internet” [Thies and Albers 2010]

Despite skepticism, Spotify hypothesized that consumers could be willing to pay if the right offering was provided. User research, as the quote above demonstrates, is great for understanding what people think they will do, but not what they will actually do. It should be taken for what it is: people trying to predict their future reactions, confined to the information they currently possess. However, if a business has a strong hypothesis that customers will act differently once they see and use a new product, for instance that they will pay for a convenient music streaming service, then the team should follow that hypothesis and prove it right or wrong.

Furthermore, adding to the challenge of asking consumers to pay, the record labels and rights holders were “charging for value”. Since mobile listening is very valuable to consumers, labels required a different license and charged a premium for the same music Spotify provided on its desktop app. Having seen what iTunes did in migrating listening from CDs to digital MP3s, labels viewed the portability of digital music as a premium that consumers should be willing to pay for.

Spotify’s initial agreement with the labels entailed $10/month for up to 30,000 encrypted tracks to listen to offline before the user had to reconnect and confirm that they were still a paying user, or the encryption key would expire, and the files would become unplayable.
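The offline terms above amount to a small set of rules that can be sketched as follows. This is an illustrative model, not Spotify's implementation: the class `OfflineLibrary`, its method names, and the 30-day `KEY_LIFETIME` are assumptions (the text does not say how long a key stayed valid). Only the 30,000-track limit comes from the text.

```python
from datetime import datetime, timedelta
from typing import Set

MAX_OFFLINE_TRACKS = 30_000        # from the article
KEY_LIFETIME = timedelta(days=30)  # assumption: the article gives no period


class OfflineLibrary:
    """Hypothetical model of encrypted offline tracks tied to a paid account."""

    def __init__(self, last_verified: datetime) -> None:
        self.last_verified = last_verified
        self.tracks: Set[str] = set()

    def add_track(self, track_id: str) -> bool:
        """Store a track for offline use unless the license limit is reached."""
        if track_id in self.tracks:
            return True
        if len(self.tracks) >= MAX_OFFLINE_TRACKS:
            return False
        self.tracks.add(track_id)
        return True

    def can_play(self, track_id: str, now: datetime) -> bool:
        """A track is playable only while the decryption key is still valid."""
        key_valid = now - self.last_verified <= KEY_LIFETIME
        return key_valid and track_id in self.tracks

    def reverify(self, is_paying: bool, now: datetime) -> None:
        """Reconnecting renews the key, but only for paying users."""
        if is_paying:
            self.last_verified = now
```

A library that has not checked in within the key lifetime keeps its files on disk, but `can_play` returns `False` until `reverify` succeeds, which mirrors the "files would become unplayable" behavior.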

Initial concerns

Spotify’s initial surveys indicated that no one was willing to pay $10/month to listen to music on their smartphones and offline as they already had pirated music on their devices. Therefore, Spotify needed to change its value proposition to capture customers’ willingness to pay.

Since consumers had access to free pirated music, the hypothesis was that they wouldn’t be willing to pay for access alone. What was needed was a new kind of convenience: a more holistic experience that provided ease of use and all kinds of other new benefits.

Furthermore, Spotify hypothesized that existing users who had invested hundreds or thousands of hours listening on the desktop version would have formed habits that made them open to the Spotify experience on mobile as well.

The solution

When building a new habit, businesses need to start with a minimum viable product, learn fast from users’ consumption behaviors, and iterate. At this stage, you can get away with scrappy builds to get people hooked.

However, it gets complicated when a product aims to change an existing habit. Then the value it aims to offer needs to be 10 times better (a hypothetical number 😃), especially if that existing habit is currently free and you aim to charge for it going forward.

Spotify’s mobile MVP app aimed to keep the users’ devices in sync with the desktop version by automatically downloading encrypted copies of playlists over Wi-Fi, so that they were available on smartphones when people were mobile. This meant no more manually syncing playlists (as with iTunes) and no more USB cables, all at the cost of 2 cappuccinos per month (less than $10 at the time).

Developer Sandbox Interviews: Spotify — by Google Developers

Development challenges at the time

Spotify’s peer-to-peer network needed a steady power supply and constant high-bandwidth connectivity, both of which the smartphones of 2008 lacked. Furthermore, mobile data was not cheap in 2008–9, and consumers were price-sensitive and continuously monitored their data consumption.

Therefore, the initial Spotify mobile applications monitored all the network connections on the phone, and as soon as they noticed that a Wi-Fi connection was available, they would force both audio streaming and caching off 3G or EDGE and onto Wi-Fi. In essence, the early Spotify mobile apps did without peer-to-peer connectivity, for reasons of power conservation, bandwidth, and cost.

Adding to the complexity, the mobile operating systems of the day would kill applications unless they were in front of the user as the active application. To keep playback and downloads going while the app was in the background, Spotify would play a silent stream so the operating system would not kill it.

While the desktop version used the Ogg Vorbis codec, decoding it was too resource-intensive for smartphones, so on mobile Spotify used the Ogg Tremor decoder, which relies on much cheaper fixed-point operations, to preserve battery life while maintaining a quality listening experience. It wasn’t until a few years later that Spotify switched its music codec to AAC, when the codec was licensed and paid for on smartphones.
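The Wi-Fi-first behavior described above boils down to a preference order over the available connections. Here is a minimal sketch of that rule; the `Network` enum, the `pick_network` function, and the choice to rank 3G above EDGE are assumptions for illustration, not details taken from Spotify's app.

```python
from enum import Enum
from typing import Iterable, Optional


class Network(Enum):
    WIFI = "wifi"
    THREE_G = "3g"
    EDGE = "edge"


# Wi-Fi first, as in the article; ranking 3G above EDGE is an assumption.
PREFERENCE = (Network.WIFI, Network.THREE_G, Network.EDGE)


def pick_network(available: Iterable[Network]) -> Optional[Network]:
    """Return the preferred connection for streaming and caching, if any."""
    available_set = set(available)
    for network in PREFERENCE:
        if network in available_set:
            return network
    return None  # offline: only previously cached tracks can be played


if __name__ == "__main__":
    print(pick_network([Network.EDGE, Network.WIFI]))
    print(pick_network([Network.EDGE]))
    print(pick_network([]))
```

Calling `pick_network` every time the set of connections changes gives the "switch as soon as Wi-Fi appears" behavior without any extra state.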

Spotify mobile demo for Google Android by SpotifyDemo

Business model innovation

While Spotify’s team of engineers developed technologies to distribute content and deliver intuitive, fast, and clean listening experiences, the partnerships team was amassing unique licenses to deliver arrays of music tracks and content to audiences.

While other competitors either already had or could have built Spotify’s mobile distribution technologies and experiences at the time, none had Spotify’s licenses, which were truly unique and were Spotify’s differentiating factor on mobile.

Spotify’s long-term vision of disrupting and capturing the mobile market and its focus on delivering quality experiences to audiences helped it negotiate strategic terms with label rights holders. The new access model and subscription scared the rights holders a little, because it meant they no longer owned and controlled the distribution channel, which was at odds with their traditionally monopolistic market power.

While Spotify was turned down in many negotiations, it kept pushing for licenses with all the major labels and signed customized contracts with detailed terms with each of them. This meant that it took Spotify substantially longer to pilot new ideas, but the overall customer experience was never jeopardized.

Successful launch in 2009

With the licensing deals and mobile tech in place, Spotify’s paid mobile app launched in 2009. Within a few weeks, a small percentage of Spotify’s loyal user base converted to paying users, and within a few months, more than 20% of Spotify’s users were paying for the mobile app. At the time, the benchmark for freemium models was Skype, which converted only about 7% of its users into paying at least $1 for some of its services.

By 2009, three years after its initial idea conception, Spotify had transitioned from a startup that was growing users fast while relying on venture capital to a business that was quickly generating revenues with a validated business model, all thanks to going mobile to gain a share of the listening time spent on hardware devices such as the iPod.

Spotify for iPhone preview by SpotifyDemo

4. The world goes mobile, and so does Spotify

Global PC and Smartphone shipments (i.e. volume) 2006–2021 (Source: Gartner)

In 2011, smartphone shipments exceeded PC shipments, and there had been drivers and indications of this for years:

  • The launch of the App Store in 2008: the app store ecosystem created a marketplace for developers, small teams, side hustlers, and startups to easily and profitably distribute their apps and services. The app store ecosystem drove massive innovation that led to a positive loop of supply and demand-side network effects that delivered more and more products and services that people were willing to pay for
Global smartphone operating systems market share by volume 2009–2022 (Source: gs.statcounter)
  • OS migration: Apple iOS and Google Android smartphone operating systems went from roughly zero to ~35% market share in 2010, a huge jump within only 4 years, mainly due to their app development ecosystems. In practice, because it was easy to build and upload apps to these marketplaces, the sheer scale of differentiated content delivered something for everyone to consume
  • Internet access: growing global access to Wi-Fi, reduced connectivity costs, and the ease of staying connected 24/7 in the palm of one's hand were driving high smartphone adoption rates. Only a few years earlier, desktop connectivity, mainly at the workplace where high-speed internet could be afforded, had still trumped the communication, productivity, and connectivity experience of a smartphone

This was all happening while Spotify’s customer acquisition and adoption strategies were built on the popularity of its free desktop product, which synced fairly seamlessly with the mobile app. As consumers leapfrogged from PCs to smartphones, they no longer built libraries on Spotify’s desktop version to sync to their mobiles, and a mobile app with a paywall in front of any content was an existential problem for Spotify. A business model innovation was needed.

Aligning with the global megatrend

In 2010, Spotify realized that it had to replace its desktop-first, mobile-paid-only business model with a mobile-first freemium one, or it would be replaced by other players. The challenge was a big one: Spotify had to build a listening experience that would get listeners, old and new, to adopt the new mobile-first freemium product, engage, and find a reason to stick around, while keeping rights holders happy with a good deal.

Initial hypothesis: the more you play, the more you pay

Spotify named its desktop-to-mobile conversion strategy “the more you play, the more you pay”, but didn’t know exactly how much audiences had to play, before starting to pay and under what circumstances. But what it did have was a large base of customers with their data insights to experiment with and run A/B tests.

The major challenge was that this business model pivot was happening at a time when Spotify was still VC-backed and pre-IPO. Any development that showed a lack of user growth, or that triggered negative press or rights holders' worries, could lead to Spotify’s demise.

Spotify decided to go big and comprehensive, testing as many different scenarios of its freemium mobile app as possible. The goal was not to go deep on one single idea but to cover as much ground as possible. This meant that Spotify tested even ideas that might have been bad ones, like capping the amount of time people could use the app for free, which contradicted the hypothesis of “the more you play, the more you pay”. The experiments and A/B tests that Spotify ran included:

  • Time limits: variants ranged from listening to all you want for free, to unlimited listening with ads, to listening capped at only several hours
  • Track limits: ranging from full access to the playlists and libraries to only having access to a few favorite tracks
  • Location limits: ranging from full mobility access to only free access at home to replicate the desktop experience
  • Interactivity limits: ranging from the full ability to move to the next track, rewind, fast forward, scroll back and forth to having none of these abilities
  • Country tests: Spotify, then available in several countries, ran A/B tests of the various product versions in different countries due to differing right-holder agreements and cultural consumption patterns. Even today, a user’s experience of Spotify in the UK can be quite different than that in India
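
To make the mechanics of such tests concrete, here is a minimal sketch of how users could be split deterministically across product variants and how a per-variant conversion rate could be computed. The variant names, the hashing scheme, and the function names are illustrative assumptions, not Spotify's actual experimentation setup.

```python
import hashlib

# Hypothetical variant names, loosely mirroring the limits listed above.
VARIANTS = [
    "unlimited_with_ads",
    "time_capped",
    "track_limited",
    "location_limited",
    "interactivity_limited",
]


def assign_variant(user_id: str, experiment: str, variants=VARIANTS) -> str:
    """Map a user to one variant; the same user always gets the same variant."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]


def conversion_rate(converted: int, exposed: int) -> float:
    """Share of exposed users who became paying users (0.0 if nobody was exposed)."""
    return converted / exposed if exposed else 0.0


variant = assign_variant("user-42", "freemium-mobile")
rate = conversion_rate(converted=20, exposed=100)  # 0.2, i.e. 20%
```

Hashing the experiment name together with the user ID keeps assignments stable over the many months these experiments had to run, without storing an assignment table.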

Insights

As Spotify started to experiment with the many versions of its freemium service, it realized that the tests had to run for a long time before the data could show how users behave, because human habits take time to form. This made it a very expensive endeavor. Consumers needed to go through the experience and find reasons to convince themselves that music was more valuable than it used to be and that they should pay for it.

Imagine the various product experience versions, the number of cohorts that needed to be tested, the amount of time it took for new habits to form, the retention curves that each version of the A/B tests generated, and the amount of time that had to be invested to understand which consumption trends and patterns stick. Today, Spotify knows that certain user profiles take more than 9 months of free listening before paying for the premium service, and it had to run these long experiments to reach this understanding.

Some of Spotify’s learnings included:

  • All limitations were detractors for all of its users, consistently and across the board, with varying degrees of repulsion depending on the audience profile. However, removing every limitation was a no-go with the rights holders
  • Track and playlist limitations were a major barrier to adoption and conversion. Consumer listening patterns mainly involve finding tracks they like and then listening to them repeatedly for some time before moving on
  • “The more you listen, the more you will pay” was confirmed. Customers who had full access to the full product experience organically turned into paying premium users within 3–8 months

With the vast insights gained, Spotify had a world of variations and iterations at its disposal to target audiences. Spotify had a product that could promise listeners that they could use it 24/7/365, anywhere, for the rest of their lives without ever having to pay, and personalized based on the customer profile.

Insights from the US: shuffle, ownership, and differentiation

In another fruitful project, Spotify looked at what paying users in the US were doing that could be given away for free to new users without the paying users losing the incentive to keep subscribing. One major insight emerged: the shuffle feature.

A few years earlier, the iPod Shuffle had hit the market at a cheaper price, with the limitation that listeners could play all of their tracks but only in shuffle mode. The surprise factor came with the upside of hassle-free, subconscious listening to music in the background.

Macworld San Francisco 2005 The iPod Shuffle Introduction by Apple

Interestingly enough, 60% of Spotify’s users in the US were using shuffle mode to listen to playlists, allowing Spotify’s randomizing algorithms to do the work. This was perhaps a habit from their iPod Shuffle days, combined with algorithms that optimized listeners’ subconscious listening experiences.

In its research, Spotify discovered that it was delivering a hedonic sense of ownership to its users. Unlike the linear radio-like services, users easily found songs, put them in libraries, and felt like they were amassing a limitless library that was owned by them. This was a massive point of differentiation for Spotify that helped it take away share mainly from traditional broadcast radio stations and grow the overall listening time.

Furthermore, at the time, there was no freemium app in the App Store where listeners could find music, save it to a playlist, and listen to those playlists for free, even in shuffle mode or with some other tiny restrictions. Consumers had radio services, which were somebody else’s curation, or premium experiences that forced them to purchase pieces of content (i.e. Apple). This point of differentiation helped Spotify’s product stand out in the marketplace.

The curious circumstance of licenses

At the time, a piece of US legislation, the Digital Millennium Copyright Act (DMCA), allowed radio-like services such as Pandora to stream music on mobile devices for free in the US under very specific circumstances and rules. One of those rules was that listeners couldn’t predict or control the order in which songs play.

This rule aligned with the shuffle play feature on Spotify, where users could build playlists but had no control over how often they heard a particular artist or song or in what order the tracks were played. Moreover, from the user’s perspective, it wasn’t radio and was different from what the competitors were offering.
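
As an illustration of the idea, the sketch below returns a user-built playlist in an order the listener cannot control, using a standard Fisher-Yates shuffle. It is a simplified assumption of what "shuffle play" means here; Spotify's real shuffle logic and the full set of licensing rules are not described in this article.

```python
import random


def shuffle_play_order(playlist, seed=None):
    """Return the playlist's tracks in a random order (Fisher-Yates shuffle).

    The listener chooses what goes into the playlist, but not the order.
    `seed` exists only to make the example reproducible.
    """
    rng = random.Random(seed)
    order = list(playlist)  # copy, so the user's playlist is left untouched
    for i in range(len(order) - 1, 0, -1):
        j = rng.randint(0, i)  # pick a position from the not-yet-fixed part
        order[i], order[j] = order[j], order[i]
    return order


my_playlist = ["track-a", "track-b", "track-c", "track-d"]
queue = shuffle_play_order(my_playlist, seed=7)
```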

By 2013, Spotify had created an experience in which consumers felt they could freely curate playlists and listen to them, while from the rights holders' perspective, it wasn’t so different from what they had already licensed to a lot of radio services.

Spotify had discovered an experience that could be marketed to consumers as what they wanted, a near-perfect listening experience, and to labels as a radio service.

Launch of the app in 2013

Consumers’ listening habits are seasonal and cyclical: every New Year holiday period and every summer, Spotify’s desktop usage would take a hit as people spent more time outside using their phones to stream music, and it would return to normal in the fall.

But in the fall of 2013, consumers didn’t go back to the desktop client and remained mobile. Looking at the dropping user numbers and realizing the urgency, the labels quickly agreed to Spotify’s freemium mobile app, and since October 2013, when Spotify launched its freemium mobile applications, its active mobile user base has grown rapidly and consistently.

5. Becoming Artificially Intelligent

The shift from ‘curation’ to ‘recommendation’

In 2016, it felt that the world had moved from content curation platforms such as Facebook and Pinterest, where consumers did all the work by following people and building collections, to intelligent and personalized recommendations such as those on YouTube. Spotify also realized that a free product built for a playlisting world wasn’t going to be enough in a machine learning world where the likes of YouTube exist: the competition was further innovating distribution.

Spotify's A/B tests had shown that offering the entire catalog for free and on demand ultimately resulted in the highest conversion rates to premium subscribers, but this wasn’t an offering that rights holders would agree to, and it would have been too costly for Spotify to take on alone. Given the legal constraints, Spotify let users curate their playlists across the entire catalog and restricted the experience to shuffle play.

Spotify realized that instead of providing the entire catalog of millions and millions of songs on demand, it could use machine learning to deliver a very small selection of songs, around 100–200 tracks, algorithmically selected for each specific user, with the listening experience improving as consumers engaged more with the content. This resulted in the creation of algorithmically generated and personalized playlists including Discover Weekly, Daily Mix, and Release Radar.

The introduction of these playlists was a big bet by Spotify, but a necessary one to experiment with AI and algorithms and position itself for the future of content delivery. In the early days, experience improvements were minimal, and it was obvious that tangible results would only be realized in the mid-to-long term. Compared to the value the free tier was delivering, these playlists drew little engagement. Nevertheless, the senior leadership at Spotify invested large resources into AI-driven content delivery.

Before the development of Spotify’s recommendation engines, it was a product that gave only a music superfan the perfect listening experience — someone with a knowledge of bands and genres, who keeps up with the latest releases and enjoys spending hours combing tracks and putting together playlists. But this segment would never get Spotify to the mass market — and Spotify needed to deliver amazing experiences to casual listeners, at scale — through AI and personalization.

Collaborative filtering at scale

In the initial days of experimenting with AI for recommendations, Spotify felt that access to a massive catalog, a well-functioning search bar, and advanced playlist-creating tools provided a near-perfect experience. Hence, delivering AI-driven recommendations was a secondary priority for Spotify.

To add to the complexity, at the time, no one in the music and audio industry had applied collaborative filtering or matrix multiplications at the scale of hundreds of millions of playlists. For this to work, Spotify needed to figure out how to break the processing of its user-generated playlists into many computations across clusters of computers and join the results to build approximations of working recommendation algorithms.

In the world of Spotify and music entertainment, collaborative filtering refers to finding trends and patterns when a large group of users puts the same bunch of tracks next to each other on the same types of playlists over and over again. The assumption here is that these audiences are telling you that such tracks go well together and probably have something in common. Algorithms use these patterns to figure out how two tracks are mathematically similar, based solely on how often they appear on the same and/or similar playlists.
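
A toy version of this idea can be sketched as follows: count how often two tracks share a playlist, then normalize by how often each track appears overall, which is the cosine similarity of the tracks' playlist-membership vectors. This is a small in-memory illustration under assumed names and data, not Spotify's production algorithm, which the article says had to be distributed across clusters.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def track_similarities(playlists):
    """Cosine similarity between tracks, based only on playlist co-occurrence."""
    track_counts = Counter()  # number of playlists each track appears in
    pair_counts = Counter()   # number of playlists each pair of tracks shares
    for playlist in playlists:
        tracks = sorted(set(playlist))  # ignore duplicates within a playlist
        track_counts.update(tracks)
        pair_counts.update(combinations(tracks, 2))
    return {
        (a, b): shared / sqrt(track_counts[a] * track_counts[b])
        for (a, b), shared in pair_counts.items()
    }


# Hypothetical playlists; the track names are placeholders.
playlists = [
    ["a", "b"],
    ["a", "b", "c"],
    ["c", "d"],
]
similarities = track_similarities(playlists)
# "a" and "b" always appear together, so their similarity is 1.0;
# "a" and "c" share one of their two playlists each, so theirs is 0.5.
```

Tracks that never share a playlist simply have no entry, which is one reason a large and varied pool of playlists matters for this approach.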

Large data pools with major preparation challenges

Spotify’s advantage, which almost no one else had, was its library of hundreds of millions of playlists, arguably the largest music curation database in history even back then. It keeps growing every minute, to over 3.6 billion playlists today, and it provided an amazing starting point for ML-related experimentation.

However, the database came with its noise: there are always listeners who bundle tracks such as Rock with Christmas jingles, which at enough scale distorts Spotify’s algorithms. This is always the challenge with throwing machine learning at data that you don’t fully understand (i.e. data that has not been prepared or cleaned). Throwing AI/ML at data as a black box will rarely churn out a great product.

While those initial algorithms were good enough at the time, often producing amazing and unintuitive suggestions that no human would’ve ever found, they also made simple mistakes that no human would’ve made. There was still room for improvement.

Understanding the customer's voice

To truly improve recommendation accuracies, Spotify needed a technology that could understand music in ways that collaborative filtering with playlists wasn’t capturing. For this, Spotify partnered with Echo Nest. Echo Nest added two new initiatives to better understand music from the customers’ perspective:

  • Figuring out how audiences describe music through words: Echo Nest crawled relevant written content on the Internet, including blogs and reviews, to identify how music and artists were described, combining sentiment tracking with natural language processing. This revealed new insights including, for example, the connections between various artists' music at a high level and, at the micro level, between their singles. This was done at scale, targeting hundreds of thousands of artists and millions of singles and understanding very niche audiences beyond what humans could ever evaluate
  • Understanding and matching music with human biology: Echo Nest crawled through millions of Spotify’s songs, broke them down acoustically into small windows, and looked at the characteristics of each song’s biology, such as tempo, beat, and musical pattern. This way, it could match the genetic and biological needs of the particular listener with the biological prints of songs

With these two initiatives, Spotify was able to describe what a listener meant by a song being happy or an artist being cool or hip, and to match those words with the tempo and biology of songs. This meant that Spotify finally had a truly ML-first product by 2016, which led to the launch of Discover Weekly, a fully AI-recommended playlist individualized for every user.
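
One simple way to picture the matching of words with acoustic traits is sketched below: average the acoustic features of all tracks that people describe with a given word, then label a new track with the word whose average profile is closest. The feature names (normalized tempo and beat strength), the sample values, and the nearest-profile rule are all assumptions made for illustration; the article does not describe how Echo Nest actually combined the two signals.

```python
from collections import defaultdict
from math import dist


def descriptor_profiles(tracks):
    """Average the acoustic feature vectors of all tracks sharing a descriptor word."""
    grouped = defaultdict(list)
    for track in tracks:
        for word in track["words"]:
            grouped[word].append(track["features"])
    return {
        word: tuple(sum(column) / len(column) for column in zip(*vectors))
        for word, vectors in grouped.items()
    }


def closest_descriptor(features, profiles):
    """Return the descriptor word whose average profile is nearest (Euclidean)."""
    return min(profiles, key=lambda word: dist(features, profiles[word]))


# Made-up data: features are (normalized tempo, normalized beat strength).
tracks = [
    {"words": ["happy"], "features": (0.8, 0.9)},
    {"words": ["happy"], "features": (0.6, 0.7)},
    {"words": ["mellow"], "features": (0.2, 0.3)},
]
profiles = descriptor_profiles(tracks)
label = closest_descriptor((0.75, 0.8), profiles)  # nearest profile is "happy"
```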

Soundtracks vs. Discover Weekly

Discover Weekly worked well with the music superfans because it was the superfans who created the most playlists. This created a bias: despite the invested effort, Discover Weekly could not scale to mainstream listeners and capture their attention. The improvement with Discover Weekly was only incremental.

To overcome this problem, Spotify began using its human-curated playlists built around use cases such as:

  • Running
  • Deep focus
  • Dinner with friends
  • Driving

The superfans built their playlists in the form of music genres such as EDM or Hip Hop. But the general public consumes music based on the time of day, their mood, and/or the work they are performing. Therefore, Spotify pivoted into using playlists that fitted into the daily routines of its listeners.

This approach curated an ecosystem of playlists that helped listeners find music for as many moods, moments, modes, activities, genres, and lifestyles as possible. Every playlist has a goal taking into account the user intent and context and strategizing its selection and sequencing. Each playlist told a story and captured a dimension of the users’ everyday life.

These playlists were named ‘Soundtracks’ and using the algorithms that had helped scale Discover Weekly, soundtracks were scaled and personalized and became the solution to delivering targeted AI-driven recommendations to mass audiences.

Reinforcement learning — where Spotify is today

Today Spotify is experimenting with reinforcement learning to use historical data to predict the long-term future of music listening trends. This means that Spotify is trying to predict our future listening habits, tastes in music, and willingness to pay.

For example, Spotify models its users based on demographics, motivations, and psychographics and matches them with their current level of satisfaction with the listening experience. Then, looking at hundreds and thousands of similar audience segments, it tries to predict how much happier specific users are likely to be if a certain feature or service is delivered at a specific point in time or location, in order to monetize the service or capture listening time from the competition. Furthermore, based on historical trends, Spotify can pinpoint which future interactions with specific customers could derail satisfaction and prevent them from happening.

Reinforcement learning has helped push Spotify towards becoming a subscription business rather than only a freemium and ad-services-driven business model.

Spotify’s total vs premium (i.e. paying) subscribers — figures in millions — Source: Spotify’s investor reports — the quarterly growth rate of premium subscribers is larger than the total subscriber growth rate
Share of Spotify’s total users as premium subscribers — figures in millions — Source: Spotify’s investor reports — between Q1–2015 and Q1–2018, Spotify experienced a high degree of premium subscriber conversion rate but since then it has plateaued

6. Cloud migration and managing massive change

Spotify’s Journey to the Cloud (Cloud Next ’18) — by Google Cloud

While Spotify’s hybrid client-server and peer-to-peer capabilities allowed it to stream music fast enough to help switch consumers from illegal downloads to Spotify, they gradually became an operational bottleneck as Spotify scaled.

In the early days, Spotify’s central servers only needed to serve a small portion (~10%) of all the listening happening on Spotify and this helped the system become much more fault-tolerant as it could always fall back on peer-to-peer if the central servers failed.
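
The fault-tolerance argument can be sketched as a simple "try each source in turn" routine. The source names, the order in which they are tried, and the use of `ConnectionError` are assumptions for illustration only; the article does not describe Spotify's actual streaming protocol.

```python
def fetch_track(track_id, sources):
    """Try each (name, fetch_function) source in order; return the first success."""
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch(track_id)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")  # remember why this source failed
    raise ConnectionError("all sources failed: " + "; ".join(errors))


# Hypothetical sources: the central server is down, the peer network is not.
def central_server(track_id):
    raise ConnectionError("server overloaded")


def peer_network(track_id):
    return f"audio-bytes-for-{track_id}".encode("utf-8")


source, data = fetch_track(
    "track-123",
    [("central", central_server), ("peers", peer_network)],
)
# Playback continues from the peer network even though the central server failed.
```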

But even so, scaling the central network of Spotify’s servers and data centers was an operational challenge, and the network would easily overload during peak hours. Consequently, Spotify’s infrastructure teams were generally putting out fires rather than thinking about building new features or building a long-term strategy and vision. In addition, other sections of the Spotify technology stack needed to wait for infrastructural capabilities to develop before experimenting with new features.

Beginning in 2014, as consumers migrated to mobile, the strain on Spotify’s server base grew from the lows of the early days into a major bottleneck and liability. In the meantime, as cloud computing got faster and cheaper, the competition caught up with Spotify, and running centrally owned and operated data centers was proving costly and uneconomical.

This situation also developed a culture of complacency and ‘it-is-what-it-is’ or ‘this-is-who-we-are’ work environment that trumped an impetus for growth that would translate into long-term competitive differentiation, which held Spotify back for quite some time. Migration to the cloud had always been discussed and at times tried out, but no massive conviction was invested to move the organization forward.

The operational challenges of migration had also become a point of fear and worry among Spotify’s senior leadership, as they would need more than two years to keep the old system alive while building a parallel team that moved all of its services from the old system to the cloud. It would be somewhat like running two companies at the same time for two years, in parallel, across two different organizational mindsets, with one set of employees coming to the end of their careers while the other was just beginning.

Daniel Ek’s leadership

Eventually, Daniel Ek made the migration a major priority and kick-started the project with his full attention and support. Daniel’s main inspiration was not cost-saving but the growth potential that cloud migration offered Spotify. He had concluded that if enough attention was given to the project and the migration was completed, Spotify could be vastly larger than it was on its centralized server networks and data centers.

With the decision to migrate made in 2015, Spotify kicked off its migration project in early 2016. Spotify knew that:

  • The migration would be a lengthy and painful process that may take up to 3 years
  • It would be expensive with Spotify running two parallel infrastructures and DevOps teams
  • Most if not all development projects would be on hold until the switch was made, which, considering the pool of talent at Spotify and the technology development culture it had set up from day one, would be a challenging transitional period

To add to the organizational impact of the migration, the DevOps teams, which were powerful stakeholders at Spotify at the time, were understandably against the migration, partly out of fear of losing their jobs and partly because they had been hired to build cool new infrastructure technologies, work that would be halted thereafter.

Consequently, those passionate about working exclusively on infrastructure had to choose to move on from Spotify, and those who remained had the chance to work on a whole new set of problems.

Partnering with Google

While at the time, Amazon Web Services, the market monopolist, was the obvious choice for many companies as a cloud computing partner, it wasn’t the right choice for Spotify. Spotify decided to partner with the new entrant, Google Cloud, which wanted to compete with the market monopoly, for three main reasons:

  • Because Google Cloud was just starting, Spotify became one of Google’s main and high-priority clients, which in turn meant Spotify could tailor its migration roadmap to a degree that Amazon was not willing to cater for. This allowed Google and Spotify to co-develop features and products higher up the stack that helped enhance the user experience. Just as Netflix was Amazon’s early lead customer, Spotify served the same purpose for Google Cloud
  • By supporting the new entrants, Spotify prevented itself from getting squeezed economically by a market monopolist
  • The organizational cultures of the two organizations matched. Not just on the senior leadership level, but also on the working level where engineers had to work together daily

It took Spotify more than 5 years to complete its full migration to the cloud.

Enhanced analytics: the positive side effect of the migration

A major area of focus for Spotify during the migration was to develop a base for better data both for Spotify and the subscribers, to help everyone in the ecosystem make better listening decisions.

In Spotify’s centralized data center and server network, there was little visibility into the details of the available datasets, and consequently no one could visualize out-of-the-box data requests.

The migration to Google Cloud helped bring transparency and structure to Spotify’s data which in effect helped employees ask more complicated questions about users, interactions, and Spotify’s offerings. This unconstrained access to data and transparency led to exponential growth and productivity across various teams and departments.

Although this placed pressure on Spotify’s Data Intelligence team to leverage the massive amount of data available by finding, joining, and ensuring data quality, operating at scale, and delivering meaningful insights, it also propelled the organization to new levels of understanding customers and delivering propositions that stick. This is a testament to Spotify’s culture of letting people use and do whatever they want with data instead of constraining them to things they can work with.

This culture is to a great extent the Swedish culture of democratizing access to information, which led to music piracy in the mid-2000s in Sweden, but has also propelled Spotify to new heights when used effectively. Today at Spotify, rather than creating constraints on who can create and access what kind of data and how often, it is common for everyone to have access to dashboards of real-time data across the organization.

7. Delivering a seamless hardware experience: Spotify Connect

Although most music listening has always happened in the car and at home, back in 2011 it was still hard to figure out how to play digital music over a decent set of speakers. Consumers would often settle for listening to MP3s over computer speakers or earbuds; in other words, convenience trumped quality. In 2011, Spotify set out to imagine a world where music could seamlessly flow from one device to the next as consumers go about their daily listening habits, from desktops to smartphones, to home speakers, to TVs, etc.

Source: IFPI

Partnership with Sonos

In 2011, software hadn’t penetrated home speaker systems yet, so there was no readily available platform for technology companies to write code for. The speaker market was fragmented, with hundreds of manufacturers making thousands of speakers, most of which were not connected to the internet and relied on Bluetooth connectivity that was buggy, slow to connect, and routed everything through phones or computers. The exception was Sonos, the only luxury smart speaker that connected directly to the internet to stream content.

While Sonos’ connected speakers were expensive for most consumers at the time, Spotify was betting that eventually, in line with Moore’s Law, prices would drop enough that everyone would leapfrog from MP3 players and home stereos to Wi-Fi-connected speakers. And if Spotify could ship working integrated software, it could capture the speaker market to come.

Initially, Spotify followed Sonos’ standard operating procedures to integrate itself into Sonos’ user interface. For this, Spotify pared down its client desktop app into a small library so that it could run on the speakers’ little chips. Sonos users could access their Spotify libraries through a one-size-fits-all UI that rolled all streaming music services into the Sonos app experience.

Sonos Controller for iPad review

But gradually tensions arose: Sonos was betting on a future where it was the only stand-out speaker in a sea of music services, while Spotify wanted to be the only stand-out music service in a sea of speakers. While both companies shared the vision of providing a flawless listening experience, Sonos had no real incentive to maintain or upgrade any kind of custom Spotify interface in its app or hardware devices, whereas Spotify needed to differentiate its experience because it lacked original content.

Gradually, the two companies came to the understanding that their natural roles complemented each other, and that Sonos’ best move in the long run was to help Spotify reach and engage with users on its devices rather than make it a battle over whose software should control the experience. Sonos granted Spotify the ability to control the discovery part of the listening journey on its speakers.

Spotify’s new software stack: Spotify Connect

To build the perfect listening experience, Spotify initiated a user-centric streaming protocol, as opposed to the dominant device-centric model of the time, that allowed all users’ wifi-connected devices to see what the account was playing and seamlessly control and move music between any device — and if needed, the smartphone could serve as the remote control. This seamless experience became Spotify Connect.

With Spotify Connect, users only press play on Spotify once in their life, the first time they use the service; after that they pause it, rewind and fast-forward, and switch devices, including the smartwatch or TV. The user’s listening session doesn’t live on any single device; it lives in the cloud, as one long listening session across various hardware devices. The device-centric model, by contrast, introduced listening friction whenever the user changed platforms or devices.
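
The difference between the two models can be illustrated with a small sketch in which playback state belongs to the account rather than to a device, so moving to another device changes only which device is active. The class, field, and device names are hypothetical; this is not the actual Spotify Connect protocol.

```python
from dataclasses import dataclass, field


@dataclass
class ListeningSession:
    """Playback state kept per account (in the cloud), not on any one device."""
    track_id: str = ""
    position_ms: int = 0
    playing: bool = False
    active_device: str = ""
    devices: set = field(default_factory=set)

    def register(self, device: str) -> None:
        """Make a device visible to the account's session."""
        self.devices.add(device)

    def play(self, device: str, track_id: str, position_ms: int = 0) -> None:
        """Start playback of a track on the given device."""
        self.register(device)
        self.track_id = track_id
        self.position_ms = position_ms
        self.playing = True
        self.active_device = device

    def transfer(self, device: str) -> None:
        """Move playback to another device; track and position carry over."""
        if device not in self.devices:
            raise ValueError(f"unknown device: {device}")
        self.active_device = device


session = ListeningSession()
session.register("living-room-speaker")
session.play("phone", track_id="track-123", position_ms=42_000)
session.transfer("living-room-speaker")  # same track, same position, new device
```

In a device-centric design, each device would hold its own copy of this state, and switching devices would mean starting a new session.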

How to use Spotify Connect — CNet

But this wasn’t good news for OEMs, as Spotify could crash their devices through its static memory allocation, high CPU usage, high RAM usage, and software complexity, all of which Spotify needed to address. However, due to the fragmentation in the connected speakers market and the dominance of Spotify in streaming, the OEMs bent to Spotify’s requirements and made Spotify their dominant streaming partner, allowing Spotify’s user-centric protocols to dictate user experience on their devices.

The long game: the rise of Voice-Activated Speakers

Between 2017 and 2018, smart speakers were adopted by the mass market, especially in developed markets, and by then Spotify Connect was on almost every large manufacturer's device, ensuring Spotify’s unified listening experience. But it took seven years of engineering and resource dedication for Spotify to get to that point, a testament to the strong conviction of Spotify’s leadership and its stamina to keep investing over a long horizon before the effort bore fruit. It was a long time between inception and actual commercial impact, made possible because Spotify had a clear hypothesis and rationale to see it through.

Smart speaker unit shipments worldwide from 2016 to 2021(in millions) — Source: S&P Global

Today, Spotify Connect is found in more than 2000 different devices from over 200 different brands, including PlayStation consoles, cars, and the Apple Watch.

The complexity of designing Spotify Connect at scale

Designing the full listening experience of a customer during a daily routine can be complex. For example, listening to content in the car, over a voice speaker, and on the TV happen at different moments in the day, with different user needs and three different listening session types. Spotify Connect needed to hide all these varying complexities from users and deliver on their listening expectations, the perfect listening experience, no matter where or when.

Spotify Connect meets the design principles of Spotify to deliver a unified, human, and relevant listening experience to listeners:

  • Unified: the user feels content is streamed from the same place, with full access to Spotify’s catalog on any device
  • Human: on any device, users feel the human aspects of their interactions with Spotify rather than feeling they are interacting with machines
  • Relevant: on whatever device the user is on, the way the interface treats them and helps them navigate the experience is optimized for and unique to the occasion. For example, the desktop or web version is for super fans who want a lot of details, the smartphone is for mobility, smartwatches for running, the TV for dinner time, and smart speakers for ease of background listening

8. Venture into podcasting

In 2019, Spotify expanded its scope from music to audio.

This was Spotify moving from investing in distribution and access technology toward content curation and differentiation.

Scope of business in the early days

While Spotify had initially chosen to tackle the experience in the music industry, its early investor memorandums kept open the option of pivoting into TV and video, as there was a belief that its distribution prowess was appealing to all forms of content. And while the goal since its inception in 2006 had become to deliver the perfect music listening session, it could now become the perfect listening session.

Before podcasting, Spotify had managed to take commodity music content and deliver differentiated user experiences through enhanced distribution, recommendation, and availability on various platforms. But in the late 2010s, the reality was that every music and audio marketplace had Justin Bieber, including YouTube and radio, and the listening experience was somewhat similar. Hence, Spotify had long felt the need to take the competition into the content space, finally becoming a full-fledged media company.

The problem with music exclusivity and the current state of video

Spotify understood that differentiating itself by offering exclusive songs or albums would be bad for listeners and the business landscape as it would provoke competitors toward breaking ‘catalog completeness’. This was at odds with the perfect listening experience Spotify was aiming for.

While exclusivity in music was unviable, in the video landscape, where content is typically consumed in a single viewing, exclusivity had brought about the golden age of TV, giving consumers who used to pay $20 for 2 hours of movie entertainment almost unlimited movie-quality entertainment at just $10/month.

When Netflix came out with video streaming in 2007 (i.e. distribution innovation), it was incredibly hard to reliably deliver video over the Internet. But a decade later, online access to video had gone from a key innovation to table stakes, and the focus shifted from distribution to content, with every player focused on taking back its content rights, building its own owned-and-operated service, and competing on having more content hits.

Case in point: Disney Plus scaled massively on its first day because of the strength of its content, building on the prior decade of innovation in access:

  • It took Disney Plus a few hours to get to 10 million subscribers
  • It took Netflix seven years to add 10 million users in a year, and
  • It took Hulu six years to get to 10 million users in total

Today, in the video streaming landscape, the competition is in content quality, exclusivity, and differentiation and not in distribution.

Initial insights: audiobooks

In Germany, audiobooks were showing up on Spotify’s top 100 charts. This was because record companies that owned a lot of audiobook rights were, to generate more revenue, uploading audiobook chapters for free as song tracks on Spotify. It was a sign that consumers and rights holders viewed Spotify as an ‘audio marketplace’ and not just a music streaming platform.

And this was all while the platform was not optimized for audiobook listening: there was no way to skip 15 seconds within a chapter, playback was limited to normal listening speed, and the free version forced shuffle play.

Spotify’s user insights showed that the more people listened to audiobooks, the more they listened to music and vice versa, and consequently, the more they listened, the more they paid. Spotify’s research into consumers’ in-car listening traits indicated that people mostly listen to the radio while driving to hear the news, traffic updates, and weather reports. In other words, just as listeners in Germany listened to audiobooks along with music, people listen to music and other forms of spoken content while driving.

And the listening experience with spoken audio content hadn’t been innovated for years, other than lowering barriers to access for a new generation of podcast producers.

The rise of podcasting in the US, 2006-2021 — Source: Edison Research

An audio-first company: from music to podcasts

To become a spoken audio content powerhouse with exclusive and differentiated content, Spotify needed to bring about business model innovation for its stakeholders across the ecosystem. It needed to figure out a variety of new challenges including:

  • How to scale podcast audiences?
  • How to deliver on catalog completeness?
  • What new product features and experiences to build?
  • What new licenses to deal with?
  • How to build assets for content production?
  • How to personalize spoken content?
  • How to monetize spoken content and satisfy advertisers and artists?

Going super-app

Initial talks at Spotify were around having a centralized super-app for all listening needs rather than a specialized podcatcher, as this made it easier to scale podcast listeners and served their daily listening habits in a one-stop-shop product.

What is a super app, and why haven’t they gone global? | CNBC Explains

Looking at the competition, while there were several podcatcher applications, Apple Podcasts controlled the majority of active podcast listening time. This meant that Spotify needed a differentiated strategy to reach, engage, and retain consumers. That’s when the standalone product option was ruled out, because the existing standalone podcatchers captured only a small share of the market.

Furthermore, integrating podcasts into the existing Spotify app aligned with delivering a frictionless and perfect listening experience. Rather than toggling between apps, Spotify’s users could meet all their audio needs, across the day, in one single product.

The virtuous cycle

The belief at Spotify was that, given its scale of users, the more time they spent listening to podcasts, the more monetization opportunities there would be for content creators, and therefore the more exclusive content. This would in turn attract more consumers, and these network effects would continue to help Spotify scale.

While there was a debate that podcasts may reduce overall music consumption and engagement, the general belief was that unless Spotify could stop podcasting from taking off, it was better to align with it than to fight it.

The new features

The minimum consumer expectation for migrating from Apple Podcasts to Spotify was to have catalog completeness. This was a straightforward task for Spotify, as all podcasting apps use a single directory from Apple that lists every show ever created.

While most if not all podcasting apps download the content onto smartphones, Spotify decided to stream podcasts, delivering a faster experience and using the core of the proprietary technology it had built over years.

To convince the podcast owners to host their content on Spotify, it created new tools for monetization and provided data insights that they weren’t getting at other places, while simultaneously delivering better advertising tools for collaborating partners.

To acquire and engage new podcast listeners, Spotify decided to invest its marketing budget in offering free access to spoken content rather than placing it, especially the exclusive shows, behind a paywall. The positive impact of this move has been that as audiences listen to more spoken content on Spotify, its personalization and recommendation algorithms deliver a better podcast listening experience than other platforms in the long run, which will become a massive differentiator. Furthermore, the more podcasts consumers listen to on Spotify, the more the accumulated data intelligence will help guide the curation of future material, which has mostly followed a creative, artistic, and shotgun approach up to now.

While Spotify has only recently moved into hosting podcasts, the expectation is that in the next 2–3 years, as it monetizes its podcast streaming business line and has more money to invest in content, we may start seeing audio content of never-before-seen quality on Spotify that could cut into Apple’s market dominance.

Why Spotify Is Betting Big on Podcasts in Battle Against Apple — by Bloomberg

Better possible revenue options

While today Spotify’s income is mostly generated from music streaming, this is likely to tip toward podcasts in the future. When it comes to music, Spotify shares a large portion of its income with rights holders. However, with the move it made on podcast content curation, the expectation is that it will retain a larger share of podcast revenues as the business scales.

9. Moving towards an interactive audio platform

Content distribution is not a major challenge today and an analytics dashboard is merely table stakes. With a focus on creating quality and exclusive spoken content, Spotify is focused on empowering creators with better content creation tools and listeners with interactivity options. The desire to turn podcasting into an interactive dialogue between hosts and listeners is something podcasters have wanted to do for a long time.

Major step: the acquisition of Anchor

RSS is currently the primary way that most podcasts are distributed: each show publishes a feed at a URL, a string of letters and numbers, that lets any podcatcher find and play its audio content. RSS has allowed the industry to grow despite the fragmentation in podcast players, hosting platforms, and hosts, and listeners get a somewhat consistent listening experience across all these platforms.

While RSS technology allows quick and easy distribution, audio distributed over RSS can’t be easily segmented, secured, expired, or personalized for specific audiences. You can’t even tell how much of an episode a listener listened to, if they listened at all, whether the file was downloaded or not, or who is profiting from the content. RSS is purely one-directional and people can’t communicate back. Therefore, the ‘perfect listening experience’ can’t be created using RSS.
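To make the one-directional nature of RSS concrete, here is a minimal sketch in Python of what a podcatcher does with a feed. The feed contents, show name, and URLs are made up for illustration, and `list_episodes` is a hypothetical helper, not any real app's code. The only parts taken from the RSS 2.0 format are the `channel`, `item`, and `enclosure` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed for illustration only: the show, titles, and URLs are invented.
FEED_XML = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" length="12345678" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" length="23456789" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def list_episodes(feed_xml):
    """Return (title, audio_url) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # The enclosure element carries the link to the audio file.
        enclosure = item.find("enclosure")
        url = enclosure.get("url") if enclosure is not None else None
        episodes.append((title, url))
    return episodes


if __name__ == "__main__":
    for title, url in list_episodes(FEED_XML):
        print(title, url)
```

Everything the app learns comes from this static file: titles and links to audio. Nothing in the feed lets the listener send anything back to the creator, and the feed itself records nothing about who played an episode or for how long.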

Furthermore, it’s quite difficult to innovate on top of RSS. The case of RSS and audio is very similar to that of SMS and text messaging. While SMS allowed the telecom industry to scale text-based communications, it limited rich communication compared to what WhatsApp or iMessage offer. What these innovators did was build their own software stacks and wrap the whole communication chain, from the creator of the message to its consumer, inside one company and one software stack. Each company runs its own protocols, which allow it to add new features at a faster pace, based on its users' needs and the business model that makes sense to it.

And since there was a huge demand from the creators' side to move away from RSS, Anchor was launched in 2015 using its own custom protocols. Anchor was a success with podcast creators from day one, as it removed content creation friction through simple-to-use tools for adding sound effects, music, polls, Q&As, and live audiences.

In 2018 Anchor launched its smartphone app that further lowered barriers for all audiences to create quality podcasts. By the end of 2018, Anchor was powering nearly half of all new podcasts being created.

The problem Anchor had, however, was that it lacked the scale of listeners needed to innovate on the format. That is what Spotify could offer, and the acquisition was finalized in 2019.

The move to a social and interactive audio platform

As more and more people launch their media services, networks, and companies, the content will oversaturate and commoditize. Two decades ago there were 100 original scripted series in the United States, 250 a decade ago, and more than 600 last year. While a lot of them are great shows, the problem is no longer making a great show but making sure that consumers view it over others.

And so, as content commoditizes, the audio industry will move toward platform-based competition. Rather than only being in the business of creating and delivering content, media companies, depending on their business model and how horizontally and vertically integrated they are, will shift users from passively consuming content to also creating, uploading, commenting on, and sharing it. This is what we have seen time and time again, including:

  • With written text going online to blogs, and then Twitter as a platform
  • With photos going online to email attachments, shared drives, and other services and then settling on Instagram
  • With videos going to various services and finally aggregating on YouTube
  • With audio going to various services, some of it to Clubhouse; however, it doesn’t yet feel like the final point of settlement

In short…

In the audio industry, what we haven’t yet seen in full throttle is for millions of people to frictionlessly create and distribute audio content, cultivate audiences over time, and create real economic returns. Spotify has its eyes focused on lowering barriers to creating quality audio content while making audio a more interactive, accessible, and instantaneous experience — and this just looks like the beginning.

10. Spotify’s performance

Global audio streaming revenues — 2013/2021 — figures in billions of USD — Sources: IFPI, Spotify

Spotify has comfortably controlled more than 70% of global streaming revenues since 2013, despite some minor setbacks between 2017 and 2020.

Global recorded music industry revenues, 2001–2020 — figures in billions of USD — Sources: IFPI, Spotify

Spotify held 42% of global music industry revenues in 2020, the biggest number held by a single company in the history of the music industry.

Spotify R&D expenses vs. revenues — figures in billions of USD — Source: Spotify

Spotify has consistently invested on average ~10% of its income into R&D over the last 8 years. To put this into context, in 2020 Spotify’s R&D investment was equivalent to 26% of Apple Music’s revenues and 10% of radio stations' ad revenues in the US.

Global users for the top music streaming services — figures in millions — Sources: Spotify, Edison Trends

Spotify is comfortably the fastest-growing streaming business in the world.

US users for the top music streaming services — figures in millions — Sources: eMarketer, MIDIA

In the US, despite the presence of technology giants and music streaming players with a local head start, Spotify is still the largest player by the volume of subscribers.

Number of Spotify subscribers worldwide from 2015–2021 — figures in millions — Sources: Spotify

Spotify’s premium user base is the faster-growing segment, driving Spotify’s overall user growth and suggesting that, if provided the right user experience, consumers are willing to pay for commoditized content.

Spotify ARPU by subscriber segments — figures in USD — Sources: Spotify

Spotify’s unit economic returns (income per user) have significant room for improvement. This is mainly due to the impact of music piracy, which has led digital music to be deemed a commodity and somewhat free. Spotify has been unable to increase its unit economic returns, a remnant of its fear of pressuring users for more income, and that is why it's moving into podcasting and the bigger audio landscape.

Spotify vs. Apple scale of podcasts consumers in the US — figures in millions — Sources: eMarketer

However, forecasts are optimistic that Spotify will overtake Apple in the podcast landscape, which gives hope that it may fix its unit economic returns in the near future.

Spotify’s adjusted stock price since IPO — Source: Yahoo! Finance

Despite the optimism during the COVID period on Spotify’s growth potential, the public markets have adjusted based on Spotify’s performance since.

Spotify needs to address this market pessimism, as the market will judge to what extent software can further eat the audio industry and churn out economic value.

Sonos CEO on tech competition: ‘We’re the story of software eating audio’


Nima Torabi

Product Leader | Strategist | Tech Enthusiast | INSEADer --> Let's connect: https://www.linkedin.com/in/ntorab/