Collection

Datamob

Showing 51 - 100 out of 188 datasets

Datamob aims to show, in a very simple way, how public data sources are being used.

Their listings emphasize the connection between data posted by governments and public institutions and the interfaces people are building to explore that data.

  • Enron Email Dataset

    Offsite — 200,000 internal emails from Enron, 1999-2002, made public in 2003 as part of the US Federal Energy Regulatory Commission’s investigation into Enron. 400 MB, compressed. See also: Enron Email Mailbox PST Dataset
  • Ensembl Genome Data

    Offsite — The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online. Data can be downloaded in a variety of formats, from flat files to MySQL dumps. Freely available for use under an Apache-style license.
  • National Center for Education Statistics

    Offsite — Tables and figures on education in the U.S., including state-specific tables. Downloadable in Microsoft Excel format.
  • Etsy API

    Offsite — RESTful API for programmatically accessing Etsy, a global handmade marketplace. Provides access to data on users, shops, item listings, feedback, tags and categories, favorites and gift guides. Returns results in JSON. API key required. Review the Etsy API Terms of Use.
  • EuroStat

    Offsite — Large variety of datasets about the European Union.
  • Federal Aviation Administration Data & Statistics

    Offsite — U.S. aviation data, downloadable in ASCII text and Microsoft Excel formats.
  • FARAdb: Foreign Agent Registration Act Database

    Offsite — The Foreign Agent Registration Act requires the U.S. Justice Department to maintain detailed reports of lobbying firms hired by foreign governments and by organizations controlled by foreign governments. FARAdb is a database of these reports, searchable by lobbying firm, client or member of Congress’s office contacted. Data downloadable in CSV format. A project of ...
  • U.S. Federal Election Commission Campaign Finance

    Offsite — U.S. political campaign finance data, downloadable in comma-delimited ASCII text. See also the Maplight API and New York Times Campaign Finance API.
  • The Federal Reserve Board Statistics

    Offsite — Releases and historical data from the U.S. Federal Reserve System. Most tables available in tab-delimited text and/or CSV formats.
  • FedSpending.org

    Offsite — Access to U.S. government spending information. Data can be downloaded in comma-delimited ASCII text, tab-delimited ASCII text and XML, and there’s an API as well.
  • FedStats

    Offsite — Comprehensive source of U.S. statistical data.
  • Follow the Money API

    Offsite — Web framework for accessing the National Institute on Money in State Politics’ database of all contributions to state-level political campaigns. See also: MAPLight.org: Special Interest Categories and Bill Research
  • Federal Reserve Economic Data API

    Offsite — RESTful API for retrieving economic data from the Federal Reserve Economic Data (FRED) and ArchivaL Federal Reserve Economic Data (ALFRED) websites hosted by the Economic Research Division of the Federal Reserve Bank of St. Louis. Requests can be customized according to data source, release, category, series and other preferences. API key required.
  • Freebase API

    Offsite — Query the Freebase shared database of the world’s knowledge via a JSON-based query language. See also: Freebase Data Dumps and Freebase Acre
  • Freebase Data Dumps

    Offsite — Full data dumps of the Freebase shared database of the world’s knowledge. Available in tab-separated values format and a low-level link export suitable for converting into RDF or XML. Creative Commons Attribution license. See also: Freebase API and Freebase Wikipedia Extraction (WEX).
  • Freeway Traffic Analysis: I-880 Database

    Offsite — Data on traffic flow on the I-880 freeway in California.
  • Failure Trace Archive

    Offsite — The Failure Trace Archive (FTA) is a centralized public repository of availability traces of parallel and distributed systems, and tools for their analysis. The purpose of this archive is to facilitate the design, validation and comparison of fault-tolerant models and algorithms. Submitted by: Jeremy Cowles
  • FUTEF Wikipedia API

    Offsite — Search API for accessing Wikipedia content. Available for non-commercial use.
  • Geographically Based Economic Data (G-Econ)

    Offsite — Geophysically scaled economic dataset, containing economic activity for each 1×1 degree latitude-by-longitude cell on the globe. By Professor William Nordhaus of Yale University. Microsoft Excel format. Public domain.
  • GeoNames Geographical Database

    Offsite — Comprehensive geographical name database. Creative Commons Attribution 3.0 license. Tab-delimited text format.
  • GetSatisfaction API

    Offsite — This API provides access to all the questions, answers and ideas exchanged between companies and their customers on GetSatisfaction. Data available in JSON, Atom and XHTML.
  • Songs Sampled in Girl Talk's 'Feed the Animals'

    Offsite — Information about the 264 samples used in Girl Talk’s Feed the Animals album, along with a few visualizations. CSV format. By Andy Baio.
  • Given Name Frequency Project

    Offsite — Four datasets on given name popularity.
  • Google Flu Trends

    Offsite — Download Google’s weekly influenza-like illness (ILI) estimates for the United States in plain text or CSV format. Data is provided for each individual state, the nine influenza surveillance regions, and the entire United States.
  • U.S. Copyright Renewal Records

    Offsite — U.S. copyright renewal records, downloadable as a single XML file. By Google software engineer Jarkko Hietaniemi. Public domain.
  • GovTrack Government Activity Data

    Offsite — A directory of the datasets that power GovTrack.us. RDF, CSV and XML formats. Learn more about the GovTrack backend.
  • U.S. Department of Housing and Urban Development D

    Offsite — Downloadable datasets from the U.S. Department of Housing and Urban Development’s Policy Development and Research office, including data on fair market rents, the American Housing Survey, USPS vacancies and Government Sponsored Enterprises (GSE). Microsoft Excel format.
  • Hunch API

    Offsite — RESTful API for programmatically accessing Hunch, a questions and answers service that harnesses collective knowledge to offer solutions to user-entered problems. Hunch is designed so that every time it’s used, it learns something new. Query for questions, responses, topics, search results and categories as well as statistics pertaining to THAY (Teach Hunch About You) ...
  • IllinoisTrack.us: Raw Data

    Offsite — The data powering IllinoisTrack.us, harvested from the Illinois General Assembly website, made available in XML. By Josh Sulkin.
  • IMF Data and Statistics

    Offsite — Wide range of time series data on International Monetary Fund lending, exchange rates and other economic and financial indicators. Tab-delimited values format.
  • UCI Machine Learning Repository

    Offsite — Repository of data sets for machine learning research.
  • Frequent Itemset Mining Dataset Repository

    Offsite — Anonymized clickstream data from a Hungarian news portal, anonymized retail market basket data from an anonymous Belgian retail store, anonymized traffic accident data and more.
  • UCI KDD Archive

    Offsite — Repository of large data sets for knowledge discovery research. See also: UCI Machine Learning Repository.
  • Kiva API

    Offsite — Set of RESTful web services for fetching public data from the Kiva lending community. JSON and XML formats. No developer registration required. See also: Social Actions API
  • Refugee Flow Patterns in Kosovo, March-May 1999

    Offsite — A collection of datasets on the flow of refugees from Kosovo during the 1999 war between NATO and Yugoslavia.
  • LaborSta

    Offsite — Datasets from the International Labour Organization. CSV format.
  • Law Commons

    Offsite — Source files and information on AltLaw.org, the full-text searchable database of U.S. Supreme Court and Federal Appellate case reports. Case reports are downloadable via FTP.
  • Legislative Counsel of California Bill Index

    Offsite — California legislative data, available in ASCII text.
  • Human Rights Data Analysis Group: Sierra Leone Tru

    Offsite — Data on human rights violations in Sierra Leone.
  • LibraryThing Web Services API

    Offsite — RESTful XML-based API for querying the LibraryThing Common Knowledge database of interesting facts about books. Developer key required. Creative Commons Attribution-Share Alike 3.0 license. See the announcement on LibraryThing for more information. See also the LibraryThing Books API.
  • Libre Map Project

    Offsite — Free U.S. digital maps and geographic information system (GIS) data. Creative Commons Attribution Share-Alike 2.0 license.
  • LittleSis API

    Offsite — The LittleSis API exposes the raw data used on the LittleSis website, and consists of basic information about powerful individuals and organizations (“entities”), and the relationships between them. Returns results in XML or JSON. API key required.
  • U.S. Lobbying Databases

    Offsite — Lobbyist information, made available by the U.S. Senate Office of Public Records in accordance with the Lobbying Disclosure Act (LDA). XML format.
  • Notices from the London Gazette

    Offsite — All notices published in the London Gazette, the UK government’s official journal and newspaper of record, from February 2007 to May 2008. XML format.
  • Transport for London API

    Offsite — Programmatically access information on all forms of public transport in London. The ‘Tube this weekend’ feed contains information on planned line and station closures for the coming weekend. The ‘Station location’ feed is a geo-coded KML feed of most London Underground, Docklands Light Railway and London Overground stations. The ‘Findaride’ KML feed contains ...
  • LOUIS API

    Offsite — Web framework for accessing the 300,000 U.S. federal government documents that comprise the LOUIS database. The documents are scraped daily from the Government Printing Office website, GPOAccess.gov.
  • MAPLight Candidates Financial Summary API

    Offsite — Query U.S. House and Senate candidate funding data from MAPLight.org. See also: MAPLight.org: Special Interest Categories and Bill Research.
  • MAPLight Bill Positions API

    Offsite — Query MAPLight.org’s original research on supporting and opposing interests for U.S. legislative bills. Returns results in JSON or XML. See also: MAPLight Candidates Financial Summary API and MAPLight.org: Special Interest Categories and Bill Research.
  • MAPLight.org: Special Interest Categories and Bill Research

    Offsite — The 400+ interest categories used by MAPLight.org, the National Institute on Money in State Politics (Follow the Money) and the Center for Responsive Politics (Open Secrets) to categorize political campaign contributions. Bill research used by MAPLight. Originally contributed by MAPLight to Watchdog.net. Microsoft Excel and CSV formats. See also: MAPLight API.
  • MassGIS

    Offsite — Geographic and environmental data available for download from the Commonwealth of Massachusetts’s Geographic Information System. Base map data includes roads, topographic features, and political boundaries. Other data includes crime and demographic data sliced geographically. MassGIS distributes vector data in Shapefile format and raster (image) data in TIFF, MrSID, JPEG ...