  • Freebase Data Dumps

    Offsite — Full data dumps of the Freebase shared database of the world’s knowledge. Available in tab-separated values format and a low-level link export suitable for converting into RDF or XML. Creative Commons Attribution license. See also: Freebase API and Freebase Wikipedia Extraction (WEX).
  • Songs Sampled in Girl Talk's 'Feed the Animals'

    Offsite — Information about the 264 samples used in Girl Talk’s Feed the Animals album, along with a few visualizations. CSV format. By Andy Baio.
  • U.S. Copyright Renewal Records

    Offsite — U.S. copyright renewal records, downloadable as a single XML file. By Google software engineer Jarkko Hietaniemi. Public domain.
  • Frequent Itemset Mining Dataset Repository

    Offsite — Anonymized clickstream data from a Hungarian news portal, anonymized retail market basket data from an anonymous Belgian retail store, anonymized traffic accident data and more.
  • LibraryThing Web Services API

    Offsite — RESTful XML-based API for querying the LibraryThing Common Knowledge database of interesting facts about books. Developer key required. Creative Commons Attribution-Share Alike 3.0 license. See the announcement on LibraryThing for more information. See also the LibraryThing Books API.
  • Metafilter Infodump

    Offsite — Collection of data culled from the Metafilter community weblog database: stats on Metafilter posts, comments, tags, favorites and users. ASCII text.
  • Netflix Prize

    Offsite — “The Netflix Prize seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences.” Register to download the dataset.
  • Network Datasets Compiled by Mark Newman

    Offsite — A collection of network datasets drawn from studies of human social networks, dolphin social networks, works of literature, power grids, books, blogs and more. Compiled by Mark Newman, professor of physics at the University of Michigan.

  • NPR API

    Offsite — Query an archive of National Public Radio content dating back to 1995. Returns results in RSS, MediaRSS, JSON, Atom and custom NPRML formats. See the announcement on NPR.org for more information.
  • The New York Times Article Search API

    Offsite — The NYT Article Search API provides searchable access to nearly three million New York Times articles from 1981 to the present day. Results returned in JSON.
  • Open Library API

    Offsite — Query the Open Library database, the goal of which is to provide one web page for every book ever published.
  • Oscar-nominated Movie Piracy Data

    Offsite — Piracy data on all 186 Oscar-nominated films from 2003 to 2008. Compiled and published by Andy Baio.
  • outside.in API

    Offsite — The outside.in API provides news articles, blog posts, tweets and more within 1,000 feet of any latitude and longitude in the United States in XML or JSON format. Licensed under a simple Terms of Service.
  • Stanford Copyright Renewal Database

    Offsite — Downloadable dataset of 250,000 U.S. copyright renewal records, received between 1950 and 1992, for books published between 1923 and 1963.
  • Star Wars Kid Data Dump

    Offsite — Data from Waxy.org’s server logs concerning the initial spread of the Star Wars Kid viral video in 2003. Public domain.
  • White Glove Tracking Dataset

    Offsite — Data on the location and position of Michael Jackson’s white glove in all 10,060 frames of his nationally televised 1983 performance of the song “Billie Jean.”
  • SonicLiving API: Concert Listings

    Offsite — The SonicLiving API gives you access to 50,000+ upcoming worldwide concert listings from over 100 data sources. Flexible parameters let you display shows by artist, by region, by venue and more.
  • USA Today API

    Offsite — USA Today’s developer network serves as a home for its content APIs and as a community for developers who want to discover, engage with and communicate about that content. USA Today encourages developers to read the terms of use, sign up for access and start coding.
  • New York Times API Documentation and Tools

    Offsite — With the Article Search API, you can search New York Times articles from 1981 to today, retrieving headlines, abstracts, lead paragraphs, links to associated multimedia and other article metadata. Along with standard keyword searching, the API also offers faceted searching. The available facets include Times-specific fields such as sections, taxonomic classifiers and ...
  • The Daylife API

    Offsite — Daylife’s Application Programming Interface provides developers (and code-savvy editors and publishers) with a web-accessible interface to Daylife’s news aggregation and analysis service. The Daylife API is robust and battle-tested, receiving over one billion calls per month. With Daylife you can focus on your content, brand, and audience while they handle the plumbing, ...