Category
Science
Biomedical Text corpora and related data collection resources.
Offsite
Computer hacker wordlists from packetstormsecurity.org
Offsite
PAIDA - Pure Python scientific analysis package
Offsite
Wikipedia:Lists of common misspellings/For machines - Wikipedia, the free encyclopedia
Offsite
Temperature data (HadCRUT3 and CRUTEM3)
Offsite
Word List - 250,000+ Hyphenated, Capitalized and Compound English words
Free Download — A common word list with over 250,000 entries of hyphenated, capitalized and compound English words. The download consists of entries containing more than one word, as well as capitalized words and acronyms. Phrases are considered “common” if they or variations of them occur in a standard dictionary or thesaurus. This word list is available in a simple, ...
EMDAT - The International Emergency Disasters Database
Offsite — From the front page: “Since 1988 the WHO Collaborating Centre for Research on the Epidemiology of Disasters (CRED) has been maintaining an Emergency Events Database EM-DAT. EM-DAT was created with the initial support of the WHO and the Belgian Government. The main objective of the database is to serve the purposes of humanitarian action at national and ...”
Ensembl Genome Browser
Offsite — From the website: “Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes. Ensembl is primarily funded by the Wellcome Trust. This site provides free access to all the data and software from the Ensembl project. Click on a species name ...”
The New York Times Annotated Corpus
Offsite — From the [website](http://ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2008T19): The New York Times Annotated Corpus contains over 1.8 million articles written and published by the New York Times between January 1, 1987 and June 19, 2007, with article metadata provided by the New York Times Newsroom, the New York Times Indexing Service and the online production staff ...
True Marble Imagery
Offsite — From the website: “Unearthed Outdoors is proud to present our line of full color global imagery. Our True Marble™ imagery is some of the most realistic medium resolution imagery available on the market. At 15 meter resolution, this true color imagery can be used for: GIS & web mapping applications; High Definition Television (HDTV) effects (weather, movies, ...”
Meta Package for Network Related Datasets
Offsite — This is a meta-package, i.e. a listing of other packages and/or material to add to CKAN. del.icio.us tag: ckan toadd network. Material to process: <http://cdg.columbia.edu/uploads/datasets/celegans_raw_data>; Canada Geospatial Data Infrastructure Roads inventory content, all in GML (reference info: <http://www.ogcnetwork.net/node/225>). Listings: ...
The collaborative, 3D encyclopedia of proteins and other molecules
Offsite — From the email excerpted on [Peter Murray-Rust’s blog](http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=990): “Hi Dr. Murray-Rust, I’m a student in Joel Sussman’s lab at the Weizmann Institute of Science. Joel, Jaime Prilusky and I have developed Proteopedia, a new online tool/database with the overall goal of making structural biology clearer for ...”
Python Cheese Shop : Shakespeare 0.4
Offsite
i2b2: Informatics for Integrating Biology & the Bedside
Offsite
Word List - 350,000+ Simple English Words (with some Definitions, Excel format)
Free Download — Over 354,000 single words, excluding proper names, acronyms, and compound words and phrases. Some, but not all, of the words have definitions. This list does not exclude archaic words or significant variant spellings.
Word List - 1,000 Most Frequently Used English Words by Frequency (with Definitions, Excel format)
Free Download — This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.
Normal Daily Mean Temperature for Select U.S. Cities - 1971-2000
Free Download — All information is airport data except as noted and is based on a standard 30-year period, 1971 through 2000. Temperatures are in degrees Fahrenheit. The sources are the U.S. National Oceanic and Atmospheric Administration's National Environmental Satellite, Data, and Information Service (NESDIS) and the National Climatic Data Center (NCDC), Temperature Extremes and Drought. ...
Cloudiness, Wind Speed, Heating/Cooling Days, and Relative Humidity for Select Cities - 1971-2000
Free Download — All information is airport data, except as noted. The data covers a period of record through 2005, except for heating and cooling normals, which cover the period 1971-2000. Temperatures are in degrees Fahrenheit. The source is the U.S. National Oceanic and ...
Word List - 1000 Most Frequent Words from an Internet Corpus
Free Download — This file consists of the 1,000 most frequently used English words as used on the Internet in 1992.
NuDat - Nuclear Structure and Decay Data
Offsite — Nuclear atomic data from the National Nuclear Data Center for radioactive isotopes, including detailed decay schemes along with associated decay probabilities. Evaluated (recommended) nuclear structure and decay information for 3,175 nuclides, about 160,210 levels, 240,608 gamma-rays, etc. Obtained from ENSDF (Evaluated Nuclear Structure Data File) and Nuclear Wallet ...