  • The Comprehensive Knowledge Archive Network (CKAN) Collection

    369 Datasets — From their website: CKAN is the Comprehensive Knowledge Archive Network, a registry of open knowledge packages and projects (and a few closed ones)... Those familiar with freshmeat, CPAN or PyPI can think of CKAN as providing an analogous service for open knowledge... CKAN is developed and maintained by the Open Knowledge Foundation. Both the CKAN code and data are open: free for anyone to use and reuse. To find out more check out the CKAN project at "knowledge...
  • Wikipedia Infoboxes

    3378 Datasets — From Wikipedia: An infobox on Wikipedia is a consistently formatted table which is present in articles with a common subject to provide summary information consistently between articles or improve navigation to closely related articles in that subject. (An infobox is a generalization of a taxobox (from taxonomy) which summarizes information for an organism or group of organisms.) Wikipedia Infoboxes are the small tables that appear on the ri...
  • Statistical Abstract of the United States

    1352 Datasets — From the US Census Bureau: The Statistical Abstract of the United States, published since 1878, is the authoritative and comprehensive summary of statistics on the social, political, and economic organization of the United States. Use the Abstract as a convenient volume for statistical reference, and as a guide to sources of more information both in print and on the Web. Sources of data include the Census Bureau, Bureau of Labor Statistics, Bur...
  • Pete Skomoroch's Bookmarks

    375 Datasets — Pete Skomoroch is President and Lead Consultant at Data Wrangling in Arlington, VA, a firm that specializes in mining large datasets to solve problems in search, finance, and recommendation systems. He maintains an ever-expanding list of datasets (nearly 400 at last count), which has now been incorporated into the Infochimps repository.
  • Moby Project Word Lists

    20 Datasets — The Moby Project has assembled some of the world's largest collections of word lists. Sixteen datasets are stored in the Infochimps repository, including common male and female first names, special words for crossword puzzles, commonly misspelled words, and many other collections.
  • AggData

    0 Datasets — AggData sells aggregated lists of data culled from the websites of major companies such as Starbucks and Ace Hardware. Their lists are geolocated and include additional information about each branch of each company.
  • Twitter Census

    20 Datasets — A collection of various datasets about the online phenomenon Twitter.

    734 Datasets — The purpose of is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.
  • IP Address to US Census Data

    78 Datasets — A collection of datasets that link IP address geolocation data from MaxMind to the United States Census 2000 data.
  • Geolocation

    411 Datasets — A collection of datasets concerning the names, locations, and other information about places in the world.