Tag

bigdata

  • 1990 US Census

    Offsite — Data from the 1990 US Census from the US Census Bureau
  • 1980 US Census

    Offsite — Data from the 1980 US Census from the US Census Bureau
  • 1980 US Census

    Offsite — Data from the 1980 US Census from the US Census Bureau
  • Federal Contracts from the Federal Procurement Data Center (USASpending.gov)

    Offsite — This data set is a dump from the Federal Procurement Data Center (FPDC), which manages the Federal Procurement Data System (FPDS-NG). FPDS-NG collects and disseminates procurement data – or information about contracts that the federal government gives to private companies. The FPDS-NG summarizes who bought what, from whom, and where. See http://www.fpds.gov/. For a ...
  • University of Florida Sparse Matrix Collection

    Offsite — These matrices cover a wide spectrum of domains, including those arising from problems with underlying 2D or 3D geometry (such as structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that ...
  • DBpedia

    Offsite — DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, and 20,000 companies. The knowledge base consists of 274 million pieces of ...
  • Wikipedia Extraction (WEX)

    Offsite — The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup for each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form. Freebase WEX is provided as a set of database tables in TSV format ...
  • 3D Version of the PubChem Library

    Offsite — This data set is a 3D Version of the PubChem Library. PubChem provides information on the biological activities of small molecules. It is a component of NIH’s Molecular Libraries Roadmap Initiative.
  • PubChem Library

    Offsite — PubChem provides information on the biological activities of small molecules. It is a component of NIH’s Molecular Libraries Roadmap Initiative.
  • UGI Virtual Conformer Library

    Offsite — Data in SD format on conformers for 500,000 molecules that can be used for virtual screening.
  • GenBank

    Offsite — GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2008 Jan;36(Database issue):D25-30). There are approximately 85,759,586,764 bases in 82,853,685 sequence records in the traditional GenBank divisions and 108,635,736,141 bases in 27,439,206 sequence records in the WGS division as of ...
  • UniGene

    Offsite — Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
  • Ensembl Annotated Human Genome Data - for MySQL

    Offsite — This data set provides scientists with the opportunity to research and understand this important area of biology. These snapshots include all the databases that are available at http://www.ensembl.org, as well as the Ensembl Biomart, which is a denormalized, query-optimized database that facilitates complex queries of one or more datasets. Full installation instructions ...
  • AnthroKids - Anthropometric Data of Children

    Offsite — This data set includes the results of two studies which collected anthropometric data of children. The studies, conducted in 1975 and 1977, are available in a number of different formats. These studies were the result of a Consumer Product Safety Commission (CPSC) effort in the mid-seventies. The creation of a publicly accessible database is the result of a joint effort ...
  • Influenza Virus (including updated Swine Flu sequences)

    Offsite — This data set includes database and sequence data from the NIAID Influenza Genome Sequencing Project and GenBank. For more information on this data set, refer to the NCBI Influenza Virus Resource. Update: This data set is being updated regularly to include new sequences of swine influenza A (H1N1) submitted by the Centers for Disease Control and Prevention (CDC).
  • USPTO Bulk Downloads

    Offsite — United States Patent and Trademark Office Bulk Downloads. Google and the USPTO have entered into an agreement to make the following USPTO products available to the public at no charge: Patents (grants, applications, assignments, classification information, and maintenance fee events); Trademarks (grants, applications, assignments, and TTAB proceedings). All data originated ...
  • Google Books Ngrams

    Offsite — Here are the datasets backing the Google Books Ngram Viewer. These datasets were generated in July 2009; we will update these datasets as our book scanning continues, and the updated versions will have distinct and persistent version identifiers (20090715 for the current set). Each of the links will directly download a fragment of the given corpus. For ...