The Comprehensive Knowledge Archive Network (CKAN) Collection

Showing 1 - 50 out of 369 datasets

From their website:

CKAN is the Comprehensive Knowledge Archive Network, a registry of open knowledge packages and projects (and a few closed ones)…Those familiar with freshmeat, CPAN or PyPI can think of CKAN as providing an analogous service for open knowledge…CKAN is developed and maintained by the Open Knowledge Foundation. Both the CKAN code and data are open: free for anyone to use and reuse. To find out more check out the CKAN project at

CKAN is a peer in the global data commons and Infochimps is proud to be able to mirror their collection of over 300 datasets.

  • US Census Bureau TIGER data

    Offsite — The US government’s ‘Topologically Integrated Geographic Encoding and Referencing’ system, usually referred to as TIGER, is based on an extensive database of US geographic information. It is county-level data that documents physical features like roads and rivers, as well as some administrative features such as Congressional districts. Data can be downloaded for each ...
  • National Public Transport Access Node database (NaPTAN)

    Offsite — From the overview: “NaPTAN provides a unique identifier for every point of access to public transport in the UK, together with meaningful text descriptions of the stop point and its location. This enables both computerised transport systems and the general public to find and reference the stop unambiguously. Stops can be related to ...”
  • National Land and Property Gazetteer

    Offsite — Description: From the main site: “The NLPG is the first, definitive, national address list that provides unique identification of properties across England and Wales and conforms to the British Standard, BS 7666. Local government, and potentially the public and private sectors, can link their information systems to this high-quality source of addresses and accurate ...”
  • National Street Gazetteer

    Offsite — Description: From the about page: “The National Street Gazetteer (NSG) is the definitive reference system used in the notification process and the coordination of street works. Under legislation, each local highway authority in England and Wales is required to create and maintain its own Local Street Gazetteer (LSG) ...”
  • Official Journal of the European Community (OJEC)

    Offsite — Discussed at the Workshop on Public Information, 2008-11-02.
  • Chemical Block

    Offsite — About: ChemBlock makes available two databases: (1) Building Blocks (fields: ID number, Structure, Chemical Name, Salt data; 4925 compounds) and (2) Screening Library (fields: ID number, Structure, Salt data; 122051 compounds). Openness: Terms of re-distribution/re-use are not mentioned on the site.
  • Open Shakespeare

    Offsite — The Open Shakespeare package provides a full open set of Shakespeare’s works along with ancillary material, a variety of tools and a python API. Specifically, in addition to the works themselves (often in multiple versions), there is an introduction, a chronology, explanatory notes, a concordance and search facilities. All material is open source/open knowledge so that ...
  • Internet Archive

    Offsite — “The Internet Archive, a 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, and the general public.”
  • Binding DB - The Binding Database

    Offsite — About: “BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of protein considered to be drug-targets with small, drug-like molecules.” Openness: Not open, as it restricts commercial re-use: “The database you are about to use is protected under copyright and/or patent law. While you are free to use the data ...”
  • Planning Alerts Planning Applications Database

    Offsite — UK Planning Application data from a variety of councils across the UK. More information, plus a full up-to-date list of councils covered, can be found at: <>
  • Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network

    Offsite — About: “Distributed Structure-Searchable Toxicity (DSSTox) Database Network is a project of EPA’s National Center for Computational Toxicology, helping to build a public data foundation for improved structure-activity and predictive toxicology capabilities. The DSSTox website provides a public forum for publishing downloadable, structure-searchable, standardized ...”
  • ICONCLASS - Multilingual Thematic Classification

    Offsite — About: From the website: “This is an experimental service that makes the ICONCLASS Iconographic Classification system available as linked-data using the SKOS vocabulary. This service is inspired by the excellent Library of Congress Subject Headings linked data service. It is intentionally copied in spirit and conventions used. The idea is to enable others to make ...”
  • Discogs: Discographies

    Offsite — Discogs is a community-built database of music information. Imagine a site with discographies of all labels, all artists, all cross-referenced: this is what Discogs strives to be. Here you will find monthly data dumps of Discogs Release, Artist, and Label data. The data is in XML format and formatted according to the API spec. License: All material is in the public ...
  • RDFizing and Interlinking the EuroStat Data Set Effort

    Offsite — The statistical data published on riese was originally published by Eurostat.
  • Securities & Exchange Commission's Public Information Server

    Offsite — This server features SEC public documents, information of interest to the investing public, rule-making activities, and access to the Commission’s electronic filing database, EDGAR. The public will be able to query the EDGAR database for any company currently filing electronically with the SEC. These filings are updated 24 hours after they are filed with the ...
  • The 2000 US Census: 1 Billion RDF Triples

    Offsite — 2000 U.S. Census converted into over a billion RDF triples.
  • WordNet

    Offsite — WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts ...
  • EEA - Data service

    Offsite — About: Overview: “The data service provides almost all data sets and applications which have been used in EEA’s periodical environmental reports.” Topics include: Air emissions, Air quality, Corine land cover 1990, Corine land cover 2000, EEA owned data sets, Land cover accounts, Eurosion, Nationally designated areas, Point data, Raster data, Geospatial data, Vector ...
  • World Values Survey

    Offsite — Description: Large global surveys of ‘values’, taking place every five years since 1990, described on its website as “The world’s most comprehensive investigation of Political and Socio-Cultural Change”. Openness: Semi-Open. Access: bulk download is possible, as is analysis on the website. However, users have to go through terms and conditions (not automatable) ...
  • DBTune

    Offsite — “This effort has started in the context of the Linking open data community project of the Semantic Web Education and Outreach Interest Group. Its main purpose is to make available freely available data concerning music on the semantic-web, such as Magnatune, Jamendo, Dogmazic, Mutopia, and to create links between them and other available semantic web repositories, such ...
  • Correlates of War

    Offsite — Description: The Correlates of War project hosts a variety of datasets related to the study of inter-state conflict. Details: As of 2007-09-22 the following datasets were listed: State System Membership (v2004.1): This data set records the fluctuating composition of the state system since 1816. It also identifies countries corresponding to the standard ...

    Offsite — is a service of is a non-profit committed to publishing and sharing public domain materials in the United States. This system contains unsupported, as-is copies of selected U.S. government archives, including: The SEC’s EDGAR Database Commerce Business Daily U.S. Copyright Database Patent Full Text Database ...
  • Statistics Canada

    Offsite — About: From the “what we do” page: “Statistics Canada, a member of the Industry Portfolio, produces statistics that help Canadians better understand their country—its population, resources, economy, society and culture.” Access/re-use: [Copyright ...
  • EUROPA - Register of Commission documents

    Offsite — About: Overview: “The register contains references both of documents which have already been published and of internal (unpublished) Commission documents, from the 1st January 2001.” Information in register includes: the identifier or reference number, the title of the document in the languages in which it is available, the date of the document, the languages in ...
  • Places of interest in the London Borough of Sutton

    Offsite — A CSV file of places of interest in the London Borough of Sutton as compiled by Sutton Active. Currently with 142 items. The CSV file is dynamically generated from the live database with each request. Please cache locally if you require regular access. No geotags yet but I’m working on it.
  • real-time information about the global routing system from the perspectives of several different bac

    No Data — The University’s Route Views project was originally conceived as a tool for Internet operators to obtain real-time information about the global routing system from the perspectives of several different backbones and locations around the Internet. ...
  • GeoCommons

    Offsite — Description: GeoCommons is a website for uploading and visualizing datasets with a geospatial component (so they can be plotted on a map). The focus is on visualization rather than the data, with the tagline: “Explore, Create and Share Intelligent Maps and Geographic Data”. Openness: PASS. License: all datasets licensed under CC BY-SA 3.0. Access: data provided in kml or ...
  • Open History

    Offsite — Collection of articles, mostly about Japanese history. Started in 2001 and last updated on 2006-09-18.
  • History Commons

    Offsite — About: From the about page: “What is the History Commons website? The History Commons website is run by the Center for Grassroots Oversight (‘CGO’), an organization that is fiscally sponsored by The Global Center, a 501(c)3 non-profit organization. CGO was incorporated as a public benefit corporation in late 2006, and is ...”
  • Open Text Book

    Offsite — “Open Text Book is a registry of textbooks and text book material that is open in accordance with the Open Knowledge Definition (OKD).”
  • Wikispecies

    Offsite — “Wikispecies is an open, free directory of species. It covers Animalia, Plantae, Fungi, Bacteria, Archaea, Protista and all other forms of life.”
  • Wikibooks

    Offsite — “Welcome to Wikibooks, a Wikimedia project that was started on July 10, 2003 with the mission to create a free collection of open-content textbooks that anyone can edit.”
  • Wikisource

    Offsite — “Wikisource is an online library of free content publications collected and maintained by the community (see our inclusion policy).”
  • Wikimedia Commons

    Offsite — Over 2 million freely usable media files to which anyone can contribute
  • Given Name Frequency Project

    Offsite — Quite a bit of data is available for download, but only individually (not in a single file). According to the web page, they have: (1) GINAP – code to standardize given names and correct common problems in name samples. Such standardization is an important step in analysis of given names. (2) Popular given names, US 1801 to 1999 – a collection of sets of standardized female ...
  • Ekopedia

    Offsite — Ekopedia is “the practical encyclopedia about alternative life techniques”. It is dedicated to providing information related to environmental sustainability. License: Creative Commons.
  • ISO 639-2 - Codes for the Representation of Names of Languages

    Offsite — About: From the home page: “ISO 639 provides two sets of language codes, one as a two-letter code set (639-1) and another as a three-letter code set (this part of ISO 639) for the representation of names of languages. ISO 639-1 was devised primarily for use in terminology, lexicography and linguistics. This part of ISO 639 represents all languages contained in ISO 639-1 ...”
  • Wikinews

    Offsite — “We are a group of volunteers whose mission is to present reliable, unbiased, relevant and entertaining News. All content is released under a free license. By making our content perpetually available for free redistribution and use, we hope to contribute to a global digital commons.”
  • Wiktionary

    Offsite — “Welcome to the English-language Wiktionary, a collaborative project to produce a free, multilingual dictionary with definitions, etymologies, pronunciations, sample quotations, synonyms, antonyms and translations. Wiktionary is the lexical companion to the open-content encyclopedia Wikipedia.”
  • Wikiquote

    Offsite — “Welcome to Wikiquote, a free online compendium of quotations from notable people and creative works in every language, including sources (where known), translations of non-English quotes, and links to Wikipedia for further information! The English version of Wikiquote has 13,799 pages so far with many thousands of quotations and proverbs.”
  • FreeBMD (Births, Marriages and Deaths)

    Offsite — Description: From the front page: “FreeBMD is an ongoing project, the aim of which is to transcribe the Civil Registration index of births, marriages and deaths for England and Wales, and to provide free Internet access to the transcribed records.” Openness: NOT OPEN. (1) License: access for personal research purposes only. Full T&C below. (2) Access: single searches ...
  • Open Media Database

    Offsite — About: “omdb (open media database) is a free database for film media. There is no set editorial staff, but rather a large number of movie addicts and lovers who volunteer their time to provide material and develop the site. Anybody can add or change existing information on omdb once they have done the quick and simple task of signing up for their user login name. ...
  • Fine Rolls of Henry III

    Offsite — Description: From the project website: “The Henry III Fine Rolls Project is a three year enterprise commencing in April 2005, funded by the Arts and Humanities Research Council. It aims to publish the Fine Rolls of Henry III from 1216 down to 1248. It is hoped that a second three year project will complete ...”
  • Open-Of-Course

    Offsite — Open-Of-Course is a multilingual and interactive portal for open content courses and tutorials. It is based on the free software ELO “Moodle” and people are welcome to add their own open educational content to the system.
  • FreeDict

    Offsite — About: Summary from the SourceForge page: “Free translating dictionaries. The data is kept as XML complying to the TEI DTD. This enables to include features such as phonetics, part of speech and etymology information in a project independent format.” Access/Re-use: Fully open. From the [project ...
  • ChemIDplus

    Offsite — About: “This database allows users to search the NLM ChemIDplus database of over 370,000 chemicals. A user may enter compound identifiers such as Chemical Name, CAS Registry Number, Molecular Formula, Classification Code, Locator Code, and Structure or Substructure. New searchable features include search and display by Toxicity indicators such as Median Lethal Dose ...”
  • Ancient Geographic Information

    Offsite — Description: Datasets produced by the Pleiades project: “Organized by the Ancient World Mapping Center at the University of North Carolina at Chapel Hill, U.S.A., Pleiades brings together a global community of scholars, students and enthusiasts to expand and enhance continually the information originally brought together by ...”
  • HapMap

    Offsite — Description: The International HapMap Project is a partnership of scientists and funding agencies from Canada, China, Japan, Nigeria, the United Kingdom and the United States to develop a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals. Datasets: From ...
  • Languages of the World (Multilingual RDF Descriptions)

    Offsite — Description: Lingvoj means languages in Esperanto. From the frontpage of <>: is the complete RDF file gathering currently the description of 507 languages, including all languages defined by ISO 639-1 and most of ISO 639-2 codes (a few exceptions remain, for which Wikipedia articles are not consistent with ...
  • Open Font Library

    Offsite — Openness: OPEN. License: SIL OFL. Access: yes, from each page (by hand). Bulk: no.