Intro
As a music-loving data geek who uses Spotify, I often feel disappointed by the platform’s lack of user statistics. There is currently no way for users to see any statistics about their streaming habits within the app. The only feature that comes close arrives once a year in December, when Spotify releases Spotify Wrapped, which shows your top artists, albums and songs streamed that year and creates a playlist based on your year’s top tracks.
That is why I joined Last.fm in 2018. Once connected to your Spotify account, the platform tracks every song you stream from then on, logging each play as a ‘scrobble’. It also provides recommendations, which I personally find superior to Spotify’s own recommendation algorithm. Once your scrobbles start rolling in, you’ll be able to see lists and grids of your top artists, albums and tracks in any given time frame.
But the real power of this data comes with the Last.fm API. Here are just a few samples of the amazing tools the Last.fm developer community has built on it: {link lastfm data tools examples built on lastfm API}. Here is a somewhat more comprehensive list of currently working Last.fm tools: {link Reddit post}.
A simple and practical tool is benjaminbenben’s Last.fm to csv. It is exactly what it sounds like: you input your Last.fm username and it outputs a downloadable csv file of your entire scrobbling history.
Flourish is a service that produces beautiful data visualizations and animations with your data.
The goal of this script is to convert my Last.fm csv into Flourish-readable data, so I can visualize my top artists over time in a ‘bar chart race’ style animation.
Raw Data
The csv contains four columns: artist, album, track and date, with the date in the format DD MMM YYYY HH:mm.
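To make the format concrete, here is a minimal sketch that parses a single row using only the standard library. The sample row is invented for illustration, and `datetime.strptime` stands in for the arrow call used later in the script; `%d %b %Y %H:%M` is the strptime counterpart of DD MMM YYYY HH:mm, assuming English month abbreviations.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample row in the exported layout: artist, album, track, date.
sample = "Artist A,Album X,Track 1,15 Aug 2020 21:04\n"

# csv.reader handles quoting, so commas inside titles would not break the split.
artist, album, track, raw_date = next(csv.reader(io.StringIO(sample)))

played_at = datetime.strptime(raw_date, "%d %b %Y %H:%M")
month_key = played_at.strftime("%Y %B")  # year and full month name, e.g. for grouping

print(artist, month_key)
```

Only the year and month survive in `month_key`, which is all the monthly bar chart needs.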
Script - Dependencies & Skeleton
import csv
import arrow
from pandas import read_csv

csv_filename = "lastfm-aug15.csv"

def prep_csv(filename):
    # add header
    process_csv(filename)

def process_csv(filename):
    pass # Insert Script Below

# On the first run, call prep_csv(csv_filename) instead so the header gets added.
process_csv(csv_filename)
Script - Add Header with Pandas
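This uses pandas to add the header row 'artist','album','track','date' to our csv file. The file is read with header=None because the raw export has no header of its own; without it, pandas would treat the first scrobble as the column names, and that row would be lost when the columns are renamed.
def prep_csv(filename):
    # header=None keeps the first line as data; names supplies the column labels
    df = read_csv(filename, header=None, names=['artist','album','track','date'])
    df.to_csv(filename, index=False)
    process_csv(filename)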
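If you would rather not pull in pandas for this one step, the header can also be added with the csv module alone. This is a sketch of an alternative, not the approach used in this post: `add_header` and `HEADER` are hypothetical names, and the check on the first row is there so that running it twice does not stack a second header.

```python
import csv
import os
import tempfile

HEADER = ["artist", "album", "track", "date"]

def add_header(path):
    """Prepend HEADER to the CSV at path unless it is already the first row."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    if rows and rows[0] == HEADER:
        return  # already prepared; nothing to do
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows)

# Demo on a throwaway file containing one invented scrobble.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as f:
        f.write("Artist A,Album X,Track 1,05 Jan 2020 12:45\n")
    add_header(demo_path)
    add_header(demo_path)  # second call is a no-op
    with open(demo_path, encoding="utf-8") as f:
        demo_lines = f.read().splitlines()

print(demo_lines)
```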
Script - Read Data
1) Creates artists as a nested dictionary of the form {artist : {date : plays}}, and dates as a list.
2) Reads each csv row, using arrow to parse the timestamp and reformat it (here we only care about month and year).
3) Adds the artist to the artists dictionary if that artist is not already in there.
4) Starts a play counter for the artist in the given month, or adds to it if one exists.
5) Adds the date to the dates list if it is not there yet.
def process_csv(filename):
    #1
    artists = {}  # {artist: {date: plays}}
    dates = []    # 'YYYY MMMM' labels, one per month seen
    #2
    with open(filename, 'rt', newline='', encoding='utf-8') as f:
        csv_reader = csv.DictReader(f)
        for row in csv_reader:
            artist = row['artist']
            # Parse the timestamp once and keep only the year and month
            date = arrow.get(row['date'], 'DD MMM YYYY HH:mm').format('YYYY MMMM')
            #3
            if artist not in artists:
                artists[artist] = {}
            #4
            if date not in artists[artist]:
                artists[artist][date] = 1
            else:
                artists[artist][date] += 1
            #5
            # Prepending keeps dates chronological, assuming the export lists newest scrobbles first
            if date not in dates:
                dates.insert(0, date)
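The counting logic in steps 3 to 5 can be tried without a csv file at all. The sketch below runs it on a few invented scrobbles, using `defaultdict` and `Counter` so the "create if missing" checks disappear; it swaps arrow for the standard library's `datetime`, and every name and row in it is an assumption made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical scrobbles (artist, date), newest first.
scrobbles = [
    ("Artist B", "03 Feb 2020 09:15"),
    ("Artist A", "02 Feb 2020 18:30"),
    ("Artist A", "20 Jan 2020 08:00"),
    ("Artist A", "05 Jan 2020 12:45"),
]

monthly_plays = defaultdict(Counter)  # {artist: {month: plays}}
months = []                           # month labels, oldest first

for artist, raw_date in scrobbles:
    month = datetime.strptime(raw_date, "%d %b %Y %H:%M").strftime("%Y %B")
    monthly_plays[artist][month] += 1  # Counter starts missing keys at 0
    if month not in months:
        months.insert(0, month)        # rows arrive newest first, so prepend

print(dict(monthly_plays["Artist A"]))
print(months)
```

Note that the prepend trick only yields chronological order when the input is sorted newest first; with unsorted input the month labels would need an explicit sort.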
Script - Accumulate Counts
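Walks through the months in chronological order and, for each artist, stores the running total of plays up to and including that month in the dictionary artistsTotal. Months without plays carry the previous total forward.
    artistsTotal = {}
    for artist in artists:
        running_total = 0  # avoids shadowing the built-in sum()
        artistsTotal[artist] = {'artist':artist}
        for date in dates:
            if date in artists[artist]:
                running_total += artists[artist][date]
            # Recorded for every month, so quiet months repeat the last total
            artistsTotal[artist][date] = running_total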
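The same running total can be written with `itertools.accumulate`. Here is a small sketch for a single artist, with invented month labels and play counts, showing how a month with no plays simply repeats the previous total.

```python
from itertools import accumulate

# Hypothetical data for one artist: no plays at all in February.
months = ["2020 January", "2020 February", "2020 March"]
plays = {"2020 January": 2, "2020 March": 5}

# Fill gaps with 0 so every month has a value, then take the running sum.
monthly = [plays.get(month, 0) for month in months]
running = dict(zip(months, accumulate(monthly)))

print(running)
```

This carry-forward behaviour is what keeps a bar in the animation from dropping to zero during a quiet month.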
Script - Write Data
Writes a csv file with artists in the first column and one column per month after that, with each cell containing that artist's cumulative plays up to that month.
    dates.insert(0, 'artist')  # dates now doubles as the list of column names
    # newline='' lets the csv module control line endings (no blank rows on Windows)
    with open(f'{filename}-processed.csv', 'w', newline='', encoding='utf-8') as out:
        csvOut = csv.DictWriter(out, dates)
        csvOut.writeheader()
        for artist in artistsTotal:
            csvOut.writerow(artistsTotal[artist])
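To preview the layout without touching the disk, the same `csv.DictWriter` calls can target an in-memory buffer. The artists and totals below are made up; only the shape of the output matters.

```python
import csv
import io

# Hypothetical cumulative totals in the same shape as artistsTotal.
fieldnames = ["artist", "2020 January", "2020 February"]
totals = {
    "Artist A": {"artist": "Artist A", "2020 January": 2, "2020 February": 3},
    "Artist B": {"artist": "Artist B", "2020 January": 0, "2020 February": 1},
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames)
writer.writeheader()
writer.writerows(totals.values())  # one row per artist

output_lines = buffer.getvalue().splitlines()
print("\n".join(output_lines))
```

Each dictionary key must appear in `fieldnames`, otherwise `DictWriter` raises a `ValueError`; that is why the script reuses the dates list as the header.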
Processed Data
Artists in the first column, and cumulative plays each successive month in the next columns.
Upload Processed CSV to Flourish
Flourish lets you customize most aspects of your visualization, including bar colours, images, speed, etc. Play around with it; here's what I got: