7 datasets
  • Death Rates for Human Immunodeficiency Virus

    Free Download — The Statistical Abstract files are distributed by the US Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, table headers that are not easily matched to their rows and columns. A few files had extraneous characters in the title; these were corrected to be consistent. A few files ...
  • Machine Learning (Theory) » The Peekaboom Dataset

  • Growth Charts - US Children Birth to 36 months

    Offsite — For US boys and girls, percentile charts for: Length-for-age, Weight-for-age, Head circumference-for-age, Weight-for-length, Stature-for-age, Weight-for-age, and Body mass index-for-age.
  • Human Genome Data Set

    Offsite — This data set contains the raw export files of the first genome sequenced by the Illumina Individual Genome Service, using Illumina’s Genome Analyzer technology with paired 75-base reads. 92,254,659,274 bases were used to generate a consensus sequence with coverage of 32x average depth. The genome was obtained from the peripheral blood of Jay Flatley, CEO of Illumina.
  • YRI Trio Dataset

    Offsite — The YRI Trio Dataset provides complete genome sequence data for three Yoruba individuals from Ibadan, Nigeria, which represent the first human genomes sequenced using Illumina’s next-generation Sequence-by-Synthesis technology. For each genome, the dataset contains >30x average depth of paired 35-base reads. This data set can be used for the following applications: The ...
  • Allen Brain Atlas - complete gene expression pattern of mouse brain

    Offsite — “The Allen Brain Atlas that shows the expression pattern of almost every gene in the mouse brain, detailed in a huge series of microscopic images. This resource, which is available to everyone on the Internet, is a wonderful tool for brain researchers” (David Linden). The Allen Mouse Brain Atlas is an interactive, genome-wide image database of gene expression. Find ISH ...
  • 1000 Genomes Data

    Offsite — The 1000 Genomes data is an open dataset from the biological research community containing genetic sequencing data. The complete dataset is huge, at roughly 150TB uncompressed.