17.2.10

Music recommendation dataset by Oscar Celma & last.fm

Thanks to a FB comment by Jose Carlos, I have discovered this really interesting dataset. Oscar Celma, Ph.D., Chief Innovation Officer @ Barcelona Music & Audio Technologies, has written a very interesting thesis on Music Recommendation and, as a bonus, has made his dataset available to the community.

This dataset contains tuples (for ~360,000 users) collected from the Last.fm API, using the user.getTopArtists() method, and it is ~543 MB. The data is made available for non-commercial use. As it essentially features <user, artist, plays> tuples, it is very interesting for testing a number of collaborative recommendation techniques and functions (recommending music & artists). However, I wish it were date-tagged, in order to examine how a system would evolve with respect to recommendations and feedback.
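As a rough illustration of the kind of collaborative technique these tuples lend themselves to, here is a minimal sketch of artist-to-artist cosine similarity over play counts. The toy tuples and the helper names (`artist_vectors`, `cosine`, `most_similar`) are made up for this example; they are not taken from the dataset or from Oscar's thesis.

```python
from collections import defaultdict
from math import sqrt

# Toy <user, artist, plays> tuples; the values are invented for illustration.
tuples = [
    ("u1", "artist_a", 10),
    ("u1", "artist_b", 5),
    ("u2", "artist_a", 3),
    ("u2", "artist_c", 8),
    ("u3", "artist_b", 7),
    ("u3", "artist_c", 2),
]

def artist_vectors(rows):
    """Map each artist to a sparse {user: plays} vector."""
    vectors = defaultdict(dict)
    for user, artist, plays in rows:
        vectors[artist][user] = plays
    return vectors

def cosine(v, w):
    """Cosine similarity between two sparse {user: plays} vectors."""
    shared = set(v) & set(w)
    dot = sum(v[u] * w[u] for u in shared)
    norm_v = sqrt(sum(x * x for x in v.values()))
    norm_w = sqrt(sum(x * x for x in w.values()))
    if norm_v == 0 or norm_w == 0:
        return 0.0
    return dot / (norm_v * norm_w)

def most_similar(target, vectors):
    """Rank every other artist by similarity to the target, best first."""
    scores = [
        (cosine(vectors[target], vec), other)
        for other, vec in vectors.items()
        if other != target
    ]
    return sorted(scores, reverse=True)

vectors = artist_vectors(tuples)
ranking = most_similar("artist_a", vectors)
```

Raw play counts are used as weights here only for simplicity; a real experiment would probably normalise or log-scale them first.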

It contains 17,562,018 tuples, so beware if you are going to use non-sparse representations.
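To make the warning concrete, here is a small sketch of a sparse representation (a dict of dicts that stores only the observed play counts), plus the average number of artists per user implied by the two figures quoted above. The toy rows and the helper names `build_sparse` and `density` are hypothetical; the post does not state the number of distinct artists, so the density of the real matrix is not computed here.

```python
from collections import defaultdict

def build_sparse(rows):
    """Store only the observed cells as {user: {artist: plays}}."""
    plays = defaultdict(dict)
    for user, artist, count in rows:
        plays[user][artist] = count
    return plays

def density(plays):
    """Fraction of the full user x artist matrix that is actually filled."""
    n_users = len(plays)
    artists = {artist for row in plays.values() for artist in row}
    stored = sum(len(row) for row in plays.values())
    dense_cells = n_users * len(artists)
    return stored / dense_cells if dense_cells else 0.0

# Toy rows, invented for illustration only.
toy = [
    ("u1", "artist_a", 10),
    ("u1", "artist_b", 5),
    ("u2", "artist_a", 3),
    ("u3", "artist_c", 2),
]
sparse = build_sparse(toy)
toy_density = density(sparse)  # 4 stored cells out of 3 users x 3 artists

# Figures quoted in the post; the user count is approximate (~360,000).
TUPLES = 17_562_018
USERS = 360_000
avg_artists_per_user = TUPLES / USERS  # roughly 49 artists per user
```

With roughly 49 artists per user on average, each row of a dense user-by-artist matrix would be almost entirely zeros, which is why a sparse structure is the safer choice.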

2 comments:

Foafing your music said...

"However, I miss it was date-tagged, in order to examin how a system would evolve regarding recommendations and feedback."

That's the second dataset I'm planning to release, but it's not as easy, since it contains *lots* of scrobbles (the whole listening history) for one thousand users.

Hopefully it'll also be available to download soon (I'm still discussing with last.fm how to do it right).

It'll include user, artist name, song name and a timestamp.

See you later,

Oscar

José María Gómez Hidalgo said...

Thank you for your comment.

It is good to know about your upcoming plans for the dataset (which is in fact *very* useful as it is now).

Thank you for your support of the community!