8 datasets
  • Last.fm Music Tags

    Offsite — This is a set of artist and genre tag data collected from Last.fm using the Audioscrobbler web service during the spring of 2007. The data consists of the raw tag counts for the 100 most frequently occurring tags that Last.fm listeners have applied to over 20,000 artists. Included are artist tags and genre-related tags. An undocumented (and deprecated) option of the ...
  • Jester Jokes and Joker Recommender System Ratings

    Offsite — Jester uses a collaborative filtering algorithm called Eigentaste to recommend jokes to you based on your ratings of previous jokes. Three datasets: Dataset 1 contains over 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes, that’s right…jokes, from 73,421 users, collected between April 1999 and May 2003, Dataset 2 contains over 1.7 million continuous ratings ...
  • Free Book Usage Data from the University of Huddersfield

    Offsite — The University of Huddersfield released a major portion of their book circulation and recommendation data under an Open Data Commons/CC0 licence. In total, there’s data for over 80,000 titles derived from a pool of just under 3 million circulation transactions spanning a 13-year period. The data they’ve released essentially comes in two big chunks: 1) Circulation Data ...
  • Internet Archive: Details: Amazon ASIN listing and similarity graph

  • Book-Crossing Dataset

  • AudioScrobbler Data

    Offsite — Audioscrobbler, which is now merged with Last.fm, once published a database of what music people listened to with the Audioscrobbler plugin. Last.fm no longer publishes it; however, the initial releases were in the public domain, so I can offer it for download. Here’s the file: http://www.iro.umontreal.ca/~lisa/datasets/profiledata_06-May-2005.tar.gz (135MB compressed, ...
  • Collaborative filtering dataset - dating agency

  • Mobile User Short Message Data of One Mobile Operator in China

    No Data — Short message data from mobile users of one mobile operator in China. The data mainly consists of formal (non-spam) short messages and spam messages. There are 170,229 spam records and 33,588 formal message records.