Spotify has been scraped---and the data---wow.

I hadn't heard of Anna's Archive before. Apparently their focus is usually books and papers. They explain why Spotify and how here:


While there's likely to be a lively debate between lawyers over the legality of helping themselves to Spotify's audio library, the data is fascinating as well. Here are the 10,000 most-played songs on Spotify as of the date of the scrape:

 
I'm willing to bet that because of the "legals" you referred to, this website won't be around very long. I mean, the recording industry saw this coming long ago and made treaties and laws throughout much of the world to ensure that owners of sites like this one would be prosecuted, no matter how good the motive. (No, I don't side with the music industry on this--I never have--but courts in both the U.S. and Europe have sided with their efforts 90% of the time [per Nina Totenberg on NPR many years ago] and I don't think they're going to stop now.)
 
By making a copy, they have, by definition, engaged in copyright theft.


They don't own this content, they're not licensed to use it, and therefore they have stolen it.

But copyright law doesn’t care about good intentions. This 300-terabyte archive represents industrial-scale infringement that threatens the licensing agreements keeping artists paid and platforms operational. Spotify’s response, implementing “new safeguards for anti-copyright attacks”, suggests the company is taking heat from record labels who trusted their content to remain protected.

No comment from the RIAA or SoundExchange.
 
I have heard all the tracks were converted to 128 kbps OGG. WTF? Nobody nowadays wants music that sounds that awful. I know the tracks were down-converted to save space, but they should have rolled with 500 kbps OGG; then they would sound pretty decent. I roll with 500 kbps OGG for all my music, and it sounds really good with a decent stereo receiver and speakers hooked up to my computer. I also use Winamp with Stereo Tool and good settings to play my music.
 
OGG? I thought that format died out a few years ago, when the MP3 patents expired.
I still have some songs digitized with 128 kbps OGG, although most have been converted to now-free MP3. My aging ears can't tell the difference.
 
There's an archive of the dead link in the first post here: https://archive.is/4hDoP

And the "top 10,000 songs" page: https://archive.is/Qi0eV

There are various alternative domains for the site in question that are still up, but given the forum rules, I won't link them here. The access to academic papers is a key feature of these sites. Even in university environments where there is legal access to journals, the user experience of pasting in a DOI and getting the paper straight away from Anna's or Z Library is quicker than jumping through the various hoops required to access them through the actual journal.
 
Spotify's default lossy format is Ogg Vorbis. Free accounts get streams in that format up to 160 kbps and premium accounts get up to 320 kbps. In the event Ogg can't be played or isn't supported, Spotify has a backup option of AAC streaming at 128 kbps for free accounts and 256 kbps for premium accounts. There's no support for delivering streams in MP3 format.

This group said they scraped the 160 kbps Ogg versions of songs, and then reencoded files to 75 kbps if the song's popularity equaled 0 according to Spotify's "popularity metric." The data they presented showed there were over 200,000,000 songs where popularity=0 (with only a portion of the total getting scraped). That's more than 70% of Spotify's catalog, but collectively represents a small fraction of listening. Actually, they go on to say that most of the listens on Spotify come from a mere 0.1% of the catalog.
 
OGG Vorbis and AAC are capable of better sound quality at lower bitrates than MP3. But the main advantage with OGG is that it's an open source format, free to use without the patent royalties that have applied to MP3, AAC and others (FLAC also falls into this free-to-use category for lossless audio).

And HE-AAC and AAC+ go even further, with good quality all the way down to 48kbps. Most radio stations, particularly the ones on iHeartRadio, Audacy, etc., stream in one of these formats. Streaming in MP3 seems like such a waste.

And if you're encoding music with OGG at 500kbps, you might as well use FLAC. Or you could go down to 256kbps. Do your ears really perceive that much of a difference?
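
To put rough numbers on the bitrate trade-off, here is a minimal Python sketch. It assumes a constant bitrate and a hypothetical 4-minute track; Ogg Vorbis is normally variable-bitrate, so real files will differ. The function name `file_size_mb` and the track length are illustrative assumptions, not anything published by Spotify or the archive.

```python
def file_size_mb(bitrate_kbps, duration_seconds):
    """Approximate stream size in megabytes (10**6 bytes), assuming constant bitrate."""
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000


# Assumption for illustration only: a 4-minute track.
TRACK_SECONDS = 4 * 60

# Bitrates mentioned in this thread (kbps).
BITRATES_KBPS = (75, 128, 160, 256, 320, 500)

for kbps in BITRATES_KBPS:
    size = file_size_mb(kbps, TRACK_SECONDS)
    print(f"{kbps:>3} kbps -> about {size:.2f} MB per 4-minute track")
```

Under these assumptions, a 4-minute track comes to about 3.84 MB at 128 kbps and 15 MB at 500 kbps, so the higher bitrate costs roughly four times the storage.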
 
To me, the songs being in their hands (especially in such poor quality) is the least-interesting thing about the story. It's the data---which I'm sure Spotify considers proprietary---an up-close look at the actual behaviors of listeners to the largest (in terms of subscribers) music streaming service in America.

Put that top 10,000 songs list up on your browser and then use the browser's "find" feature to see where legendary artists are.

Here's the first one free.

Elvis Presley shows up only once in those 10,000 songs...at #1044.
 
Tried that yesterday briefly with a search for McCartney...

Not sure, but since the dataset is from younger users of that service, does it really correlate beyond it?
 
I tried that yesterday, didn't search for specific artists, but was amazed at the number of Billie Eilish and The Weeknd titles in the first few hundred songs. I'm not surprised with the poor showing for Elvis or McCartney, as I doubt very many of the people who are streaming Eilish and Weeknd would ever look for either of those two. And Bad Bunny is also prominent in the results -- virtually no crossover between his fans and Paul or Elvis's.
 
Tried that yesterday briefly with a search for McCartney...

Not to be found. Nor Lennon, nor Harrison (do I need to mention Ringo?).

The Beatles themselves placed once--at #442--with "Here Comes the Sun".

Not sure, but since the dataset is from younger users of that service, does it really correlate beyond it?

The dataset is from all users of that service---and the demographics aren't exclusively young---45% are over 35, 30% are over 45 and 19% are over 55.

So, 81% of Spotify users are within a desirable radio sales demo. And seeing the performance of those "heritage" artists can be instructive to anyone who thinks (Group/Artist/Genre) is forever.

Because---unlike radio---this is what people choose when they want to listen to something.

Elton John shows up five times in those 10,000 songs---the highest at #473, the lowest at #1491.

Michael Jackson shows up four times---the highest at #654, the lowest at #1535.

The lesson is that yeah, there really only are a handful of songs people want to hear from even the most popular, prolific artists of the past---and yeah, time has flown a lot farther than you think.
 
The Backstreet Boys show up more than any of the "legacy artists" you mention (8 times, starting at #488). Cher appears twice, with her 1998 hit "Believe" being the only one in the top 5000.

My impression from glancing over this list is that radio really needs to figure out a 90s format.

From these 10k songs, by decade:
2020s: 4888 (top song: Die with a Smile - Lady Gaga)
2010s: 2845 (top song: All the Stars - Kendrick Lamar)
2000s: 1035 (top song: Yellow - Coldplay)
1990s: 548 (top song: Iris - Goo Goo Dolls)
1980s: 342 (top song: Every Breath You Take - The Police)
1970s: 248 (top song: Dreams - Fleetwood Mac)
1960s: 94 (top song: Here Comes the Sun - Beatles)
1959 and before: None
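
For anyone who wants those counts as shares of the list, here is a small Python sketch. The counts are copied from the decade breakdown above; the names `decade_counts` and `decade_shares` are just illustrative, and nothing here comes from the scraped dataset beyond those eight lines.

```python
# Decade counts as posted in this thread (top 10,000 songs).
decade_counts = {
    "2020s": 4888,
    "2010s": 2845,
    "2000s": 1035,
    "1990s": 548,
    "1980s": 342,
    "1970s": 248,
    "1960s": 94,
    "1959 and before": 0,
}


def decade_shares(counts):
    """Return each decade's percentage share of the total song count."""
    total = sum(counts.values())
    return {decade: 100 * n / total for decade, n in counts.items()}


shares = decade_shares(decade_counts)
for decade, pct in shares.items():
    print(f"{decade}: {pct:.2f}%")
```

The counts add up to exactly 10,000, and the 2010s and 2020s together account for 7,733 of those titles.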
 
Yeah, absolutely. Because someone who turns 50 this year graduated high school in 1994.

Here's a reality check beyond that one.

At the big peak of early-mid 80s music, I had a serious girlfriend (still one of my dearest friends today) for a couple of years. The week we started dating, Cyndi Lauper, Duran Duran, Huey Lewis, Culture Club, Billy Idol, Prince and the Eurythmics were all over the radio.

She was 23. I was 28.

She'll turn 65 in five months. I'll be 70 in two.

It's okay if anyone reading this thinks any music recorded after (insert date here) sounds like someone tipped over the china cabinet, but there are people with AARP cards who grew up with 90s music, and that can't be ignored.

PS: Not to put too fine a point on it, but here's how long ago 1997 was:

bafkreigchhewziudll26q6wickfntd7yxyrnjdcky4p4ejnq247lb2r6om.jpg
 