Category

Science

  • phishingcorpus [JoseWiki]

    Offsite
  • Web FAQ collection | ILPS

    Offsite
  • Identi.ca

    Offsite — About: Identi.ca is a microblogging service. Users post short (140-character) notices, which are broadcast to their friends and fans using the Web, RSS, or instant messages. Download: Bulk downloads not yet available.
  • Reference Database of Immune Cells

    Offsite — Description: From the home page: “RefDIC is an open-access database of quantitative mRNA/Protein profiles specifically for immune cells.” From <http://refdic.rcai.riken.jp/document.cgi>: “RefDIC is an open resource compendium of quantitative mRNA/Protein profile data specifically for immune cells. You can easily retrieve various aspects of mRNA/Protein profiles of ...”
  • Climate Research Unit - CRUTEM3

    Offsite — About: The datasets have been developed in conjunction with the Hadley Centre of the UK Met Office and are hosted by the Climate Research Unit at the University of East Anglia. From the website: “land air temperature anomalies on a 5° by 5° grid-box basis”. Data is from 1961 to 2008. Format: Available for download in zipped ASCII or NetCDF (see [project ...
  • Statistical Machine Translation - Europarl Parallel Corpus

    Offsite — About (overview): “The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 11 European languages: Romanic (French, Italian, Spanish, Portuguese), Germanic (English, Dutch, German, Danish, Swedish), Greek and Finnish. The goal of the extraction and processing was to generate sentence aligned text for ...”
  • Data dumps - Meta

    Offsite
  • Datasets for regression analysis, CVT basis calculations, K-means analysis, and so on.

    Offsite
  • PubChem: Information on Biological Activities of Small Molecules

    Offsite — “PubChem provides information on the biological activities of small molecules.” For license information, see: <http://www.ncbi.nlm.nih.gov/About/disclaimer.html>
  • Crystal Eye: Aggregated Crystallographic Data

    Offsite — Description: The aim of the CrystalEye project is to aggregate crystallography from web resources, and to provide methods to easily browse, search, and keep up to date with the latest published information. Openness: OPEN. License: not specified (but the site has an open data logo and the authors' clear intention is for the data to be open). Access: ok. bulk: no. www: yes. ...
  • MOCHA-TIMIT

    Offsite — About: Authors: Alan Wrench, Queen Margaret University College. Funded by: Engineering and Physical Sciences Research Council. When created: November 1999. Purpose: Phonetically balanced dataset for training an automatic speech recognition system. Openness: Availability: English speakers available here free for non-commercial use and may be distributed on CDROM for a ...
  • Variability Analysis of Surface Climate Observations (VASClimO)

    Offsite — About the data: From the project webpage: “VASClimO was a joint climate research project of the Global Precipitation Climatology Centre (GPCC) at the German Met Service (DWD) and the Institute for Atmosphere and Environment – Working Group for Climatology at the Johann Wolfgang Goethe University Frankfurt. The project was funded by the Bundesministerium für Bildung, ...”
  • Climate Research Unit - HadCRUT3

    Offsite — About: The datasets have been developed in conjunction with the Hadley Centre of the UK Met Office and are hosted by the Climate Research Unit at the University of East Anglia. From the website: “HadCRUT3 combined land and marine [sea surface temperature (SST) anomalies from HadSST2, see Rayner et al., 2006] temperature anomalies on a 5° by 5° grid-box basis”. Data is from 1961 to ...
  • The DGT Multilingual Translation Memory of the Acquis Communautaire

    Offsite — As of November 2007, the European Commission’s Directorate-General for Translation (DGT) made publicly accessible its multilingual Translation Memory for the Acquis Communautaire (the body of EU law) – a collection of parallel texts (texts and their translation, also referred to as bi-texts) in 22 languages. This is a page for technical users, where you will find a ...
  • Matrix Market

    Offsite — A visual repository of test data for use in comparative studies of algorithms for numerical linear algebra, featuring nearly 500 sparse matrices from a variety of applications, as well as matrix generation tools and services. Currently, 482 individual matrices and 25 matrix generators are available. The database now includes the entire Harwell-Boeing Sparse Matrix ...
  • Mushroom Data Set

    Offsite — This data set includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family (pp. 500-525). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no ...
  • NIST: Topic Detection and Tracking (TDT)

    Offsite — Topic Detection and Tracking research was pursued under, and is an integral part of, the DARPA Translingual Information Detection, Extraction, and Summarization (TIDES) program. The goal of the TIDES program is to enable English-speaking users to access, ...
  • Last.fm’s Playground

    Offsite
  • ualberta dependency based thesaurus and word count data

    Offsite
  • European Climate Assessment Daily Weather Data

    Offsite