6 datasets
  • Freebase Wikipedia Extraction (WEX)

    Offsite — The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup for each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form. Freebase WEX is provided as a set of database tables in TSV format ...
  • Infochimps Complete Data Set Catalog

    Free Download — This is a complete dump of all data and metadata publicly available through Infochimps.com.
  • downloading - flossmole - Google Code - How to get FLOSSmole data for your own use

  • metachronistic » Mirror the Wikipedia

  • DOL Form 5500 Dump

    Free Download — The Form 5500 Annual Report is the primary source of information about the operations, funding and investments of approximately 800,000 retirement and welfare benefit plans. This is a dump of all Form 5500s available on the Department of Labor website at http://www.dol.gov/ebsa/foia/foia-5500.html. The dump covers 1999 to mid-2008.
  • Complete and Latest English Wikipedia raw dump with edit history

    Offsite — This is a direct link to the raw Wikipedia data dump, roughly 7TB uncompressed. The data is in XML format, compressed as bz2, gz, and 7z. A higher-level view of the data is available at http://dumps.wikimedia.org/. As explained on this page: http://en.wikipedia.org/wiki/Wikipedia:Database_download, downloading data of this size uses a lot of bandwidth, which ...
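
Because the raw Wikipedia dump is bz2-compressed XML and far too large to decompress or load in one piece, it may help to read it as a stream. The following is a minimal sketch, not a reference implementation: it assumes the dump wraps each article in a `page` element with a `title` child (an assumption about the schema, not something stated in the listing above), and the helper name `iter_page_titles` is hypothetical. A tiny in-memory sample stands in for a real dump file.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any "{namespace}" prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(stream):
    """Yield the title text of each page element found in an XML stream.

    Assumes each article is a <page> element with a <title> child
    (an assumption about the dump schema).
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) == "page":
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
            # Drop the page's children and text so memory use stays low.
            elem.clear()


# Small stand-in for a real dump: compress a two-page XML document in memory.
sample = (
    b"<mediawiki>"
    b"<page><title>Alpha</title></page>"
    b"<page><title>Beta</title></page>"
    b"</mediawiki>"
)
compressed = io.BytesIO(bz2.compress(sample))

# bz2.open decompresses incrementally, so the XML is never fully expanded.
with bz2.open(compressed, "rb") as fh:
    titles = list(iter_page_titles(fh))

print(titles)
```

For a real dump, the same function could be given `bz2.open(path, "rb")`, where `path` points at a downloaded `.xml.bz2` file.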