Collection

Datamob

Datamob aims to show, in a very simple way, how public data sources are being used.

Their listings emphasize the connection between data posted by governments and public institutions and the interfaces people are building to explore that data.

  • Mobile Commons Legislative Lookup API

    Offsite — Database that matches latitude and longitude with the U.S. congressional and state legislators for that location. Available as an API and as a standalone Rails application for download under an MIT License. By Mobile Commons.
  • MediaWiki API

    Offsite — Programmatically access websites running on MediaWiki wiki software, including Wikimedia Foundation sites like Wikipedia, which are licensed under the GNU Free Documentation License. In active development.
  • Metafilter Infodump

    Offsite — Collection of data culled from the Metafilter community weblog database: stats on Metafilter posts, comments, tags, favorites and users. ASCII text.
  • Official Major League & Minor League Baseball Data

    Offsite — Official source of current major league and minor league baseball statistics, in XML form.
  • MONDIAL database

    Offsite — Geographical database compiled from the CIA World Factbook and other sources. RDF, XML and other formats.
  • USGS Mineral Resources Data

    Offsite — Collection of downloadable datasets on mineral resources in the U.S.
  • MusicBrainz

    Offsite — Community music metadatabase.
  • National Bureau of Economic Research

    Offsite — Large collection of datasets on the American economy.
  • National Bridge Inventory Data

    Offsite — Data on 600,000 U.S. bridges. ASCII text.
  • National Climatic Data Center Data Directory

    Offsite — World’s largest active archive of weather data. Some of the data is free only if you’re in school, in the military or working for the government. ASCII text.
  • Integrated Postsecondary Education Data System (IPEDS)

    Offsite — Download zipped CSVs of admission and enrollment information about all U.S. colleges, including cost of attendance and room and board charges; selectivity, average SAT/ACT scores, and other admissions considerations; enrollment, retention, and graduation rates by gender, ethnicity, and age; degrees conferred by program and award level; number of students receiving ...
  • National Center for Health Statistics Data Warehou

    Offsite — Downloadable data on U.S. health topics including vital statistics, aging, immunization, health care and nutrition.
  • UK Neighbourhood Statistics API

    Offsite — SOAP-based API from the UK Office for National Statistics for accessing data about UK neighbourhoods. Includes data on the 2001 Census, Access to Services, Community Well-Being/Social Environment, Crime and Safety, Economic Deprivation, Education, Skills and Training, Health and Care, Housing, Indicators, Indices of Deprivation, People and Society, Physical Environment and ...
  • Netflix Prize

    Offsite — “The Netflix Prize seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences.” Register to download the dataset.
  • Network Datasets Compiled by Mark Newman

    Offsite — A collection of network datasets drawn from studies of human social networks, dolphin social networks, works of literature, power grids, books, blogs and more. Compiled by Mark Newman, professor of physics at the University of Michigan.
  • Neighborhood Knowledge Los Angeles (NKLA) Data

    Offsite — Data on tax-delinquent properties and demographic information in Los Angeles, offered to help prevent housing conditions from deteriorating.
  • NPR API

    Offsite — Query an archive of National Public Radio content dating back to 1995. Returns results in RSS, MediaRSS, JSON, Atom and custom NPRML formats. See the announcement on NPR.org for more information.
  • New York City Property Sales

    Offsite — Property sales reported by the New York City Department of Finance. Data is updated monthly and published in Microsoft Excel format.
  • New York City Subway Ridership, 1905-2006

    Offsite — Download a spreadsheet containing the annual registrations, or recorded entries, at each station in the New York City Metropolitan Transportation Authority subway system from 1905 to 2006. Posted by Michael Frumin.
  • The New York Times Article Search API

    Offsite — The NYT Article Search API provides searchable access to nearly three million New York Times articles from 1981 to the present day. Results returned in JSON.
  • The New York Times Campaign Finance API

    Offsite — Retrieve political campaign contribution and expenditure data based on United States Federal Election Commission filings. Data available in JSON, XML or serialized PHP. Registration required. See the announcement on Open for more information.
  • New Zealand Hansard

    Offsite — Text of New Zealand parliamentary debates. No developer-friendly download options; HTML parsing required.
  • New Zealand Legislation

    Offsite — Text of New Zealand Acts, Bills and Regulations. Their RSS feed can be put to good use. Submitted by Rob McKinnon.
  • OECD.Stat

    Offsite — Search for and extract data from across the Organisation for Economic Cooperation and Development’s databases.
  • OpenCongress API

    Offsite — Programmatic access to all the data on OpenCongress, from official bill information to user-generated votes on bills. API key required.
  • OpenCyc API

    Offsite — Programmatic access to the open source version of the Cyc Knowledge Base, the world’s largest and most complete general knowledge base and commonsense reasoning engine. The Cyc Knowledge Base is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday ...
  • Open Economics Data Store

    Offsite — Collection of datasets on current and historical topics in economics. CSV format.
  • The New York State Senate Open Legislation Service

    Offsite — Browse, search and download legislative information from the New York State Senate. Structured legislative information available in XML, CSV and JSON formats. Licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
  • Open Library API

    Offsite — Query the Open Library database, the goal of which is to provide one web page for every book ever published.
  • Open Secrets

    Offsite — Data on money in U.S. politics, collected and published by the Center for Responsive Politics. See also: MAPLight.org: Special Interest Categories and Bill Research
  • Oscar-nominated Movie Piracy Data

    Offsite — Piracy data on all 186 Oscar-nominated films from 2003 to 2008. Compiled and published by Andy Baio.
  • outside.in API

    Offsite — The outside.in API provides news articles, blog posts, tweets and more within 1,000 feet of any latitude and longitude in the United States in XML or JSON format. Licensed under a simple Terms of Service.
  • Parliament Parser

    Offsite — Structured data about Members of British Parliament.
  • Party Time Data Pack

    Offsite — Download a CSV of the data powering Party Time, the Sunlight Foundation’s political partying database which tracks parties for members of Congress and congressional candidates.
  • The Public Whip Data: British MP Voting Records

    Offsite — Data for British MP votes for each division, attendance and rebelliousness rates. In .txt, .dat, XML and text dumps of MySQL tables.
  • QuotationsBook Database

    Offsite — 40,000+ quotations downloadable in XML format and available under a Creative Commons license.
  • Who Is My Representative API

    Offsite — Call the Who Is My Representative PHP scripts directly with a ZIP code and get back XML or JSON with information on the corresponding representation in the U.S. Congress.
  • U.S. Copyright Registration Records

    Offsite — Raw data from the U.S. Copyright Office of the Library of Congress.
  • U.S. Patents, 1983-2000

    Offsite — tar.gz collections of the full text of U.S. Patents from 1983 to 2000.
  • U.S. Trademark Data

    Offsite — Trademark data from the U.S. Patent and Trademark Office.
  • SEC Filings (EDGAR)

    Offsite — Access to U.S. Securities and Exchange Commission filings from public companies.
  • SEC Corporate Ownership Linked Data, 2003-2006

    Offsite — This is a semantic web, RDF, linked-data, and SPARQL interface to U.S. corporate ownership information derived from filings to the U.S. Securities and Exchange Commission in its EDGAR database. There are three parts to this database: Part I: Individual Ownership via SEC forms 3, 4, 5, Part II: Subsidiary Information via 10-K Filings via CorpWatch, and Part III: Links to ...
  • UMass Amherst Linguistics Sentiment Corpora

    Offsite — N-gram counts extracted from over 700,000 online product reviews in Chinese, English, German and Japanese. Formatted to be read as R data frames. By Noah Constant, Christopher Davis, Christopher Potts and Florian Schwarz. Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.
  • San Francisco Building Permit Activity

    Offsite — Building permits issued or applied for as reported by the San Francisco Department of Building Inspection (DBI), published in Microsoft Excel format.
  • Social Actions API

    Offsite — Social Actions is an open source database of actions people can take on any issue, from volunteer opportunities to micro credit loans. The Social Actions RESTful API returns data in JSON format and contains up-to-date information about actions from 50+ sources, including Care2, DonorsChoose.org, Idealist and Kiva. Creative Commons Attribution-Noncommercial-Share Alike 2.5 ...
  • IRS Income Tax Data

    Offsite — ZIP Code Area tables for tax years 2002, 2004 and 2005 from the U.S. Internal Revenue Service’s Statistics of Income (SOI) program, which uses the Federal tax system as a comprehensive source of economic and financial information. Microsoft Excel format. Public domain.
  • Splog Blog Dataset

    Offsite — Dataset of 3,000 blog homepages, of which 700 have been labeled as spam-blogs or splogs and another 700 as authentic blogs.
  • Stanford Copyright Renewal Database

    Offsite — Downloadable dataset of 250,000 records on U.S. copyright renewal for books published between 1950 and 1995.
  • Star Wars Kid Data Dump

    Offsite — Data from Waxy.org’s server logs concerning the initial spread of the Star Wars Kid viral video in 2003. Public domain.
  • Statistical Abstract of the United States

    Offsite — The U.S. Census Bureau’s comprehensive summary of statistics on the social, political and economic organization of the United States. Many tables downloadable in Microsoft Excel format.