FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.

Free Music Archive web API

All the data in the raw_*.csv tables was collected from the Free Music Archive public API. With this notebook, you can:

  • reconstruct the original data,
  • update some fields, e.g. the track listens (play count),
  • augment the data with newer fields which may have been introduced in their API,
  • update the dataset with new songs added to the archive.

Notes:

  • You need a key to access the API, which you can request online and write into your .env file as a new line reading FMA_KEY=MYPERSONALKEY.
  • Each request takes a few hundred milliseconds to complete.
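
The cell that creates the client reads the key with os.environ.get('FMA_KEY'), so the line in .env has to reach the environment first. How that happens is not shown in this notebook (tools such as python-dotenv exist for this purpose). As a minimal sketch of what reading such a file involves, using only the standard library and a hypothetical helper named read_env_value:

```python
import os
import tempfile


def read_env_value(path, name):
    """Return the value assigned to `name` in a KEY=VALUE file, or None."""
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith('#'):
                continue
            key, sep, value = line.partition('=')
            if sep and key.strip() == name:
                # Drop surrounding whitespace and optional quotes.
                return value.strip().strip('"\'')
    return None


# Demonstration on a throwaway file, so no real key is needed.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, '.env')
    with open(env_path, 'w', encoding='utf-8') as f:
        f.write('# API credentials\nFMA_KEY=MYPERSONALKEY\n')
    key = read_env_value(env_path, 'FMA_KEY')

print(key)  # MYPERSONALKEY
```

Keeping the key in .env rather than in the notebook itself avoids committing it by accident.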
In [ ]:
import os
import IPython.display as ipd
import utils
In [ ]:
fma = utils.FreeMusicArchive(os.environ.get('FMA_KEY'))

1 Get recently added tracks

  • track_id values are assigned in monotonically increasing order.
  • Tracks can be removed, so the highest track_id does not indicate the number of available tracks.
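
To make the second point concrete, here is a small sketch on toy data. The helper summarize_ids and the ID list are illustrative only, and the count of missing IDs assumes that IDs start at 1, which this notebook does not state.

```python
def summarize_ids(available_ids):
    """Compare the highest ID with the number of IDs that still resolve."""
    ids = sorted(set(available_ids))
    highest = ids[-1]
    n_available = len(ids)
    # IDs in 1..highest that no longer resolve (assumes IDs start at 1).
    n_missing = highest - n_available
    return highest, n_available, n_missing


# Toy data: pretend tracks 4, 6 and 7 were removed from the archive.
toy_ids = [1, 2, 3, 5, 8, 9, 10]
highest, n_available, n_missing = summarize_ids(toy_ids)
print(highest, n_available, n_missing)  # 10 7 3
```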
In [ ]:
for track_id, artist_name, date_created in zip(*fma.get_recent_tracks()):
    print(track_id, date_created, artist_name)

2 Get metadata about tracks, albums and artists

Given IDs, we can get information about tracks, albums and artists. See the available fields in the API documentation.

In [ ]:
fma.get_track(track_id=2, fields=['track_title', 'track_date_created',
                                  'track_duration', 'track_bit_rate',
                                  'track_listens', 'track_interest', 'track_comments', 'track_favorites',
                                  'artist_id', 'album_id'])
In [ ]:
fma.get_track_genres(track_id=20)
In [ ]:
fma.get_album(album_id=1, fields=['album_title', 'album_tracks',
                                  'album_listens', 'album_comments', 'album_favorites',
                                  'album_date_created', 'album_date_released'])
In [ ]:
fma.get_artist(artist_id=1, fields=['artist_name', 'artist_location',
                                    'artist_comments', 'artist_favorites'])
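
Reconstructing or updating the tables means repeating such calls over many IDs, some of which may no longer exist. The following is a hedged sketch of that loop, not the procedure used to build the dataset: collect_metadata and fake_fetch are hypothetical names, and it assumes that a removed or unknown ID surfaces as an exception, which this notebook does not confirm.

```python
def collect_metadata(fetch, ids):
    """Call `fetch(id)` for each ID and separate successes from failures."""
    records, missing = {}, []
    for fma_id in ids:
        try:
            records[fma_id] = fetch(fma_id)
        except Exception:
            # Assumption: a removed or unknown ID raises an exception.
            # Narrow this to the client's actual error type when known.
            missing.append(fma_id)
    return records, missing


def fake_fetch(track_id):
    """Stand-in for an API call; only IDs 2 and 3 'exist' here."""
    toy = {2: {'track_title': 'A'}, 3: {'track_title': 'B'}}
    return toy[track_id]


records, missing = collect_metadata(fake_fetch, range(1, 5))
print(sorted(records), missing)  # [2, 3] [1, 4]
```

With the real client, fetch could be something like lambda i: fma.get_track(track_id=i, fields=['track_title']). Since each request takes a few hundred milliseconds, a full pass over many IDs will take a while.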

3 Get data, i.e. raw audio

We can download the original audio as well. The archive provides tracks as MP3 files with various bit rates and sample rates.

In [ ]:
track_file = fma.get_track(2, 'track_file')
fma.download_track(track_file, path='track.mp3')

4 Get genres

Instead of compiling the genres of each track, we can get all the genres present on the archive with some API calls.

In [ ]:
genres = fma.get_all_genres()
print('{} genres'.format(genres.shape[0]))
genres[10:25]

We can then look for genres related to Rock.

In [ ]:
genres[['Rock' in title for title in genres['genre_title']]]
In [ ]:
genres[genres['genre_parent_id'] == '12']
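
The filter on genre_parent_id returns only the direct children of a genre. To gather sub-genres at any depth, the parent links can be followed repeatedly. The sketch below works on plain dictionaries with toy rows. The genre_id column name is an assumption (only genre_title and genre_parent_id appear above), the IDs are made up, and they are strings to match the '12' comparison in the cell above.

```python
from collections import defaultdict


def descendants(rows, root_id):
    """Return the IDs of all genres below `root_id`, at any depth."""
    children = defaultdict(list)
    for row in rows:
        children[row['genre_parent_id']].append(row['genre_id'])
    found, seen, stack = [], {root_id}, [root_id]
    while stack:
        for child in children[stack.pop()]:
            # The `seen` set guards against cycles in malformed data.
            if child not in seen:
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found


# Toy hierarchy (made-up IDs): 1 -> {2, 3}, 2 -> {4}; 5 is unrelated.
toy_genres = [
    {'genre_id': '1', 'genre_parent_id': None},
    {'genre_id': '2', 'genre_parent_id': '1'},
    {'genre_id': '3', 'genre_parent_id': '1'},
    {'genre_id': '4', 'genre_parent_id': '2'},
    {'genre_id': '5', 'genre_parent_id': None},
]
result = descendants(toy_genres, '1')
print(sorted(result))  # ['2', '3', '4']
```

If genres is a pandas DataFrame, as the .shape call above suggests, genres.to_dict('records') would produce rows of this form, provided genre_id is a column rather than the index.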