Category: Science » Natural Science
15 datasets
This dataset consists of a collection of Infoboxes from Wikipedia on the topic of Taxobox. Snippet: Antilles_pinktoe: name: Antilles Pink Toed Tarantula regnum: "[[Animal]]" classis: "[[Arachnid]]" phylum: "[[Arthropod]]" ordo: "[[Spider]]" imageWidth: 250px imageCaption: Female Avicularia versicolor binomial: Avicularia versicolor familia: ...
Free
As of August 2008, over 52 thousand structures are available for download. From the home page: The RCSB PDB provides a variety of tools and resources for studying the structures of biological macromolecules and their relationships to sequence, function, and disease. The RCSB is a member of the wwPDB whose mission is to ensure that the PDB archive remains ...
Offsite
This data set includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family (pp. 500-525). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no ...
Offsite
Many different drug targets are expressed in brain tissue. The delivery of drugs to the central nervous system (CNS) is complicated by the blood-brain barrier (BBB), which controls the entry of drugs into the CNS. The BBB is one of the most important factors limiting the development of drugs that specifically target brain disorders. Nowadays BBB remains a bottleneck in brain drug ...
Free
AceDB is a genome database system developed since 1989 primarily by Jean Thierry-Mieg (CNRS, Montpellier) and Richard Durbin (Sanger Institute). It provides a custom database kernel, with a non-standard data model designed specifically for handling scientific data flexibly, and a graphical user interface with many specific displays and tools for genomic data. AceDB is ...
Offsite
This data set contains the raw export files of the first genome sequenced by Illumina Individual Genome Service using Illumina’s Genome Analyzer technology of paired 75-base reads. 92,254,659,274 bases were used to generate a consensus sequence with coverage of 32x average depth. The genome was obtained via peripheral blood of Jay Flatley, CEO of Illumina.
Offsite
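As a rough check on the figures quoted for the Illumina genome above, average depth is commonly defined as total sequenced bases divided by the length of the sequence they cover. The sketch below inverts that relation to see what covered length the two quoted numbers imply. It assumes every one of the 92,254,659,274 bases contributes to coverage, which the description does not state, and the function name is hypothetical.

```python
def implied_covered_length(total_bases: int, mean_depth: float) -> float:
    """Return the sequence length implied by depth = total_bases / length."""
    return total_bases / mean_depth


# Figures quoted in the dataset description.
total_bases = 92_254_659_274
mean_depth = 32

implied = implied_covered_length(total_bases, mean_depth)
print(f"Implied covered length: {implied / 1e9:.2f} billion bases")
# prints "Implied covered length: 2.88 billion bases"
```

Under that assumption the result is on the order of the size of a human genome, which is consistent with the description of a whole-genome consensus sequence.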
The YRI Trio Dataset provides complete genome sequence data for three Yoruba individuals from Ibadan, Nigeria, which represent the first human genomes sequenced using Illumina’s next generation Sequence-by-Synthesis technology. For each genome, the dataset contains >30x average depth of paired 35-base reads. This data set can be used for the following applications: The ...
Offsite
FASTA database files are sequence databases of transcript and translation models predicted by the Ensembl analysis and annotation pipeline, as well as by ab initio methods.
Read more about the FASTA format.
Offsite
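For readers unfamiliar with FASTA, each record is a header line starting with `>` followed by one or more lines of sequence. The following is a minimal sketch of reading such records, not the Ensembl tooling; the function name `parse_fasta` and the two example records are made up for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record identifier to its sequence, joined across lines."""
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""  # identifier = first token
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical records, not taken from the Ensembl files.
example = """>seq1 hypothetical transcript
ATGGCC
TTAA
>seq2 hypothetical peptide
MKV
""".splitlines()

records = parse_fasta(example)
print(records)
```

The sketch keeps only the first whitespace-delimited token of each header as the key and holds everything in memory, so it suits small files; the full database files would call for a streaming reader.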
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2008 Jan;36(Database issue):D25-30). There are approximately 85,759,586,764 bases in 82,853,685 sequence records in the traditional GenBank divisions and 108,635,736,141 bases in 27,439,206 sequence records in the WGS division as of ...
Offsite
Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
Offsite
This data set provides scientists with the opportunity to research and understand this important area of biology. These snapshots include all the databases that are available at http://www.ensembl.org, as well as the Ensembl Biomart, which is a denormalized, query-optimized database that facilitates complex queries of one or more datasets. Full installation instructions ...
Offsite
This data set includes the results of two studies that collected anthropometric data on children. The studies, conducted in 1975 and 1977, are available in a number of different formats. These studies were the result of a Consumer Product Safety Commission (CPSC) effort in the mid-seventies. The creation of a publicly accessible database is the result of a joint effort ...
Offsite
This data set includes database and sequence data from the NIAID Influenza Genome Sequencing Project and GenBank. For more information on this data set, refer to the NCBI Influenza Virus Resource.
Update: This data set is being updated regularly to include new sequences of swine influenza A (H1N1) submitted by the Centers for Disease Control and Prevention (CDC).
Offsite