Showing 1 - 50 out of 188 datasets

Datamob aims to show, in a very simple way, how public data sources are being used.

Their listings emphasize the connection between data posted by governments and public institutions and the interfaces people are building to explore that data.

  • U.S. Census Bureau - 1990 Names

    Offsite — Frequently occurring first names and surnames from the 1990 Census.

    Offsite — Query the AMEE (“Avoiding Mass Extinctions Engine”) database of CO2 data and measure the carbon footprint of anything.
  • Amazon ISBN Similarity Graph

    Offsite — Output of a crawl of Amazon’s item similarity API from January 18, 2008 for ISBNs (International Standard Book Numbers). ASCII text and XML. By Aaron Swartz.
  • API

    Offsite — The API provides biographical, contact, and voting information about Republican members of Congress; contact information and member listings for House of Representatives committees; and information on bills and votes in XML. Nearly RESTful; requires an API key which is tied to a GOP Portfolio account.
  • Network Datasets Compiled by Alex Arenas

    Offsite — A collection of network datasets on email interchange, jazz musician collaboration and more.

    Offsite — Developer-friendly real-time estimated time of arrival (ETA) feeds, transit schedules, advisory feeds and trip planning information for Bay Area Rapid Transit (BART), serving the San Francisco Bay Area in California. RESTful API returning results in XML. Licensed under a simple License Agreement with no usage limitations.
  • Sean Lahman's Baseball Archive

    Offsite — Historical major league baseball data, downloadable as a CSV.
  • Baseball Databank

    Offsite — Free historical baseball data, downloadable in comma-delimited text and SQL.
  • Basketball Reference

    Offsite — Basketball statistics and data. Most tables available in CSV.
  • BBC Backstage: Feeds & APIs

    Offsite — BBC TV and radio data, programme catalogue information, search API and more. Available for non-commercial use with attribution.
  • Bureau of Economic Analysis Data

    Offsite — National, international, regional and industry economic data. Downloadable in CSV format.
  • Big Huge Thesaurus API

    Offsite — Query a database of 145,000 English language words for synonyms. Returns data in JSON, XML, serialized PHP array or plain text formats. Based on data from the Princeton University WordNet database and the Carnegie Mellon Pronouncing Dictionary. By John Watson.
  • Bureau of Labor Statistics

    Offsite — Data on labor economics. Most tables available in tab-delimited and comma-delimited ASCII text.
  • U.S. Bureau of Justice Statistics

    Offsite — U.S. crime data, much of it downloadable in CSV and ASCII formats.
  • BookMooch API

    Offsite — Query or download the database for BookMooch, a book exchange community. ASCII text and XML formats. Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.
  • Brooklyn Museum API

    Offsite — RESTful API for programmatically searching the Brooklyn Museum’s digitized collection of more than 10,000 individual works. Free for non-commercial use with a limit of 3,000 API calls a day. Review the Brooklyn Museum API Terms of Use.
  • CalorieKing API

    Offsite — The CalorieKing API provides nutritional information from the CalorieKing food database via SOAP or REST interfaces. Free up to 20,000 queries a month.
  • Capitol Words API

    Offsite — The Capitol Words API provides several methods of accessing detailed information from the Capitol Words database of word frequency from the U.S. Congressional Record. Returns results in JSON and XML.
  • Census 2000 Planning Database (for Census 2010)

    Offsite — The Tract Level Planning Database With Census 2000 Data is a database that assembles a range of housing, demographic and socioeconomic variables that are correlated with mail nonresponse. Using data from U.S. Census 2000, a database containing these variables has been developed for all census tracts in the country. The variables included in the Tract Level Planning ...
  • U.S. Census Data

    Offsite — Variety of tools for accessing U.S. Census data, including direct file access.
  • U.S. Census Bureau TIGER/Line Data

    Offsite — Data files for the U.S. Census Bureau’s experiment in web-based mapping: the Topologically Integrated Geographic Encoding and Referencing (TIGER) system. Shapefile format.
  • U.S. Census Bureau International Database

    Offsite — Demographic data and population pyramids for countries around the world, available in ASCII text.
  • 2000 U.S. Census in RDF

    Offsite — By Joshua Tauberer.
  • Center for Economic and Policy Research Data

    Offsite — Provides consistent, user-friendly versions of the Survey of Income and Program Participation (SIPP), Current Population Survey (CPS), and other datasets used at CEPR.
  • Chronicling America API

    Offsite — Open programmatic access to information about historic American newspapers (1690-present) and select digitized newspaper pages (1880-1922). Returns results in Atom or Linked Data via RDF. Searching newspaper pages is also possible via OpenSearch. No API key or registration needed. Sponsored jointly by the National Endowment for the Humanities and the Library of Congress ...
  • CIA World Factbook

    Offsite — Public-domain information about the countries of the world from the U.S. government, downloadable as a compressed Zip of HTML files.
  • CiteSeer.IST

    Offsite — Scientific literature digital library and search engine with fully downloadable records.
  • CiteULike Dataset

    Offsite — Data on usage of the CiteULike service for organizing academic research papers.
  • New York City Baby Name Data

    Offsite — Two CSV files available for download: all New York City baby names dating back to 1920, and New York City baby names broken down by ethnicity, dating back to 1990. Data supplied by the New York City Department of Health and Mental Hygiene, compiled by Jennifer 8. Lee for the New York Times City Room Blog.
  • Civic Footprint API

    Offsite — Look up the political geography of any address in Cook County, Illinois.
  • Comprehensive Knowledge Archive Network (CKAN) API

    Offsite — RESTful API for querying the Comprehensive Knowledge Archive Network’s database of “open knowledge” packages and projects.
  • Climate Data Archives

    Offsite — Climate data archives.
  • The New York Times Congress API

    Offsite — Get biographical information on Congresspeople dating back to 1947 and voting records dating back to 1989 in JSON and XML. Based on information from THOMAS and other sources. Read the announcement on Open for more information. See also the New York Times Congress API Ruby Wrapper with Congresh Shell.
  • New York Times Congress API Ruby Wrapper with Cong

    Offsite — An easy Ruby wrapper for the New York Times Congress API. Also provides a command shell called Congresh for interacting with the API directly. Available for download under an MIT License. Submitted by: Patrick Ewing
  • CorpWatch API

    Offsite — The CorpWatch API uses automated parsers to extract the subsidiary relationship information from Exhibit 21 of companies’ 10-K filings with the Securities and Exchange Commission, providing a free, well-structured interface for programs to query and process the data. Although the SEC provides a search interface for locating company filings (EDGAR / IDEA), the subsidiary ...
  • Crime Data from Ohio State University's Criminal J

    Offsite — Monthly crime data from 1960-2004 for over 17,000 U.S. police precincts.
  • CrunchBase API

    Offsite — Information on early-stage technology companies, including acquisitions and funding rounds. Available in JSON. Licensing is still being finalized but will be Creative Commons Attribution or similar.
  • Unofficial Chicago Transit Authority API

    Offsite — An unofficial, RESTful API for the Chicago Transit Authority. Get data for bus locations and routes. Results returned in XML. By Harper Reed.
  • Dartmouth Atlas of Health Care U.S. Hospital Perfo

    Offsite — U.S. hospital performance and efficiency data.
  • Airline On-Time Performance

    Offsite — Data derived from the U.S. Bureau of Transportation Statistics containing on-time arrival data for non-stop domestic flights by major air carriers. Also contains departure and arrival delays, origin and destination airports, flight numbers, scheduled and actual departure and arrival times, cancelled or diverted flights, taxi-out and taxi-in times, air time and non-stop ...
  • Doing Business: Full Data

    Offsite — Data on business regulations and their enforcement across 178 countries and selected cities. A project of the World Bank. Tables downloadable in Microsoft Excel format. See also: World Bank API and World Bank Data & Statistics.
  • DBpedia Dataset

    Offsite — A large multi-domain ontology derived from Wikipedia. GNU Free Documentation License. N3 and CSV formats.
  • Washington, D.C. Citywide Data Warehouse

    Offsite — The Washington, D.C. Citywide Data Warehouse, also known as the DC Data Catalog, is a comprehensive collection of government activity data from the District of Columbia.
  • Dolores Labs' Color Name Dataset

    Offsite — 10,000 color/label pairs, based on data collected through Amazon’s Mechanical Turk crowdsourcing marketplace. By Brendan O’Connor.
  • DIMES Project Data

    Offsite — Data from the DIMES Project, a distributed scientific research project aiming to study and map the structure and topology of the internet. CSV format.
  • Discogs API

    Offsite — RESTful API for the Discogs community-built database of music information. Artist, label and release data is made available through a public domain license.
  • New York City Sign Permit Actions

    Offsite — Applications for sign permits in New York City. Published by the New York City Department of Buildings in Microsoft Excel format.
  • USGS Earthquake Data

    Offsite — Latest earthquake data, available in CSV, XML, KML and Cube formats.

    Offsite — Web API that converts legacy/proprietary files, such as XLS, into XML in real time so that they can be consumed by your web, mobile and desktop apps. Created out of the need to consume data from NYC Data Mine and DataSF, which often contain datasets that are updated regularly but in proprietary formats. Check out some of the government datasets that are known to ...
  • Enron Email Mailbox PST Dataset

    Offsite — This refined version of the CALO Enron Email Dataset is available as 148 PST files, complete with original folder structure, to preserve user information associated with the emails. Licensed as Creative Commons Attribution 3.0. Submitted by: John Wang