Tag

big

10 datasets
  • Big Brother Endgame

    Free Download — This dataset consists of a collection of Infoboxes from Wikipedia on the topic of Big Brother Endgame.
  • Big Four Pageants Titleholders

    Free Download — This dataset consists of a collection of Infoboxes from Wikipedia on the topic of Big Four Pageants Titleholders.
  • Big Brother Contestant

    Free Download — This dataset consists of a collection of Infoboxes from Wikipedia on the topic of Big Brother Contestant.
  • Human Genome Data Set

    Offsite — This data set contains the raw export files of the first genome sequenced by Illumina Individual Genome Service using Illumina’s Genome Analyzer technology of paired 75-base reads. 92,254,659,274 bases were used to generate a consensus sequence with coverage of 32x average depth. The genome was obtained via peripheral blood of Jay Flatley, CEO of Illumina.
  • Wikipedia Page Traffic Statistics

    Offsite — This dataset contains a 320 GB sample of the data used to power trendingtopics.org. It includes 7 months of hourly page traffic statistics for over 2.5 million Wikipedia articles (~1 TB uncompressed), along with the associated Wikipedia content, link graph, and metadata. Compiled by Peter Skomoroch at Data Wrangling, LLC on May 31, 2009. To mount the snapshot: localmachine ...
  • 3.5 Million+ US Domestic Flights from 1990 to 2009

    Free Download — Over 3.5 million monthly domestic flight records from 1990 to 2009. Data are arranged as an adjacency list with metadata, ready for immediate database import and analysis. Fields (short name, type, description): Origin, String, three-letter airport code of the origin airport; Destination, String, three-letter airport code of the destination airport ...
  • Geocities Archive

    Offsite — YES THAT IS RIGHT, WE ARE RELEASING GEOCITIES ON A TORRENT. This is going to be one hell of a torrent – the compression is happening as we speak, and it’s making a machine or two very unhappy for weeks on end. The hope had been to upload it today, but the reality is this is a lot of stuff – probably 900 gigabytes will be in the torrent itself. It’s not perfect, it’s not ...
  • Google Books Ngrams

    Offsite — Here are the datasets backing the Google Books Ngram Viewer. These datasets were generated in July 2009; we will update these datasets as our book scanning continues, and the updated versions will have distinct and persistent version identifiers (20090715 for the current set). Each of the links will directly download a fragment of the given corpus. For ...
  • US Domestic Flights From 1990 to 2009

    Free Download — A data set of over 3.5 million monthly domestic flight records from 1990 to 2009 (edges only). This is almost two decades' worth of unique flight data. Fields (short name, type, description): Origin, String, three-letter airport code of the origin airport; Destination, String, three-letter airport code of the destination airport; Passengers, Integer, number ...
  • Allen Brain Atlas - complete gene expression pattern of mouse brain

    Offsite — “The Allen Brain Atlas that shows the expression pattern of almost every gene in the mouse brain, detailed in a huge series of microscopic images. This resource, which is available to everyone on the Internet, is a wonderful tool for brain researchers” (David Linden). The Allen Mouse Brain Atlas is an interactive, genome-wide image database of gene expression. Find ISH ...
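
The two US domestic flights entries above describe their records as an adjacency list with Origin, Destination, and Passengers fields. A minimal Python sketch of reading such edge records into an in-memory graph follows. The tab delimiter, the column order, the `load_adjacency` helper name, and the sample rows are all assumptions made for illustration; the listing does not specify the file layout, so check the downloaded file before relying on any of them.

```python
import csv
import io
from collections import defaultdict


def load_adjacency(lines, delimiter="\t"):
    """Build {origin: {destination: total_passengers}} from edge records.

    Assumes each record starts with Origin, Destination, Passengers,
    in that order (an assumption, not confirmed by the listing).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        origin, destination, passengers = row[0], row[1], int(row[2])
        # Monthly records repeat the same route, so accumulate the counts.
        graph[origin][destination] += passengers
    return graph


# Made-up sample rows, not taken from the dataset.
SAMPLE = "SFO\tJFK\t120\nSFO\tJFK\t80\nJFK\tBOS\t45\n"
graph = load_adjacency(io.StringIO(SAMPLE))
```

Summing passengers per route is one possible design choice; keeping each monthly record separate would need a list of values per edge instead of a running total.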