-
Offsite
—
With the Article Search API, you can search New York Times articles from 1981 to today, retrieving headlines, abstracts, lead paragraphs, links to associated multimedia and other article metadata. Along with standard keyword searching, the API also offers faceted searching. The available facets include Times-specific fields such as sections, taxonomic classifiers and ...
-
Offsite
—
9/11 tragedy pager intercepts. More than half a million national US pager intercepts released by wikileaks.org. They cover the September 11 tragedy from 3am on the same day (Tuesday) until 3am the following day, a 24-hour period surrounding the attacks in New York and Washington. The fields presented are: Date Time Pager-Network Pager-number ...
-
Offsite
—
A massive dataset of emails recovered from discovery documents in the Enron trials, from the CALO Project at Carnegie Mellon University. This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a ...
-
Free Download
—
A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically ordered Excel format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of words: ...
-
Free Download
—
Over 64,000 common dictionary words — A list of words in common with two or more published dictionaries. This gives the developer of a custom spelling checker a good beginning pool of relatively common words.
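As a rough illustration of how a list like this can seed a custom spelling checker, the sketch below loads words into a set and reports tokens that are not in it. It assumes a plain-text file with one word per line, which the listing does not confirm; the function names and the three-word sample are hypothetical.

```python
import io
import re
from typing import List, Set, TextIO


def load_words(stream: TextIO) -> Set[str]:
    """Read one word per line (assumed format) into a lowercase set."""
    return {line.strip().lower() for line in stream if line.strip()}


def unknown_words(text: str, dictionary: Set[str]) -> List[str]:
    """Return the tokens in text that are not present in the dictionary."""
    tokens = re.findall(r"[A-Za-z']+", text)
    return [token for token in tokens if token.lower() not in dictionary]


# Hypothetical stand-in for the downloaded word list.
sample_list = io.StringIO("apple\nbanana\ncherry\n")
words = load_words(sample_list)
misspelled = unknown_words("Apple bananna cherry", words)
```

With a real download, `open(path, encoding="utf-8")` would replace the in-memory sample. A set is used because membership checks stay fast even with tens of thousands of entries.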
-
Free Download
—
A list of more than 10,000 U.S. place names. This list is available in a simple, alphabetically ordered .txt format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers looking to build a custom location tool or database. The entries represent a sampling of U.S. place names: 10,196 places in total.
-
Free Download
—
A list of over 80,000 words officially permitted in crossword games like Scrabble™, with some but not all of their definitions. The words are compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes word variants (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary. It is a reference ...
-
Free Download
—
A list of 113,809 words officially permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes all word forms (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary.
-
Free Download
—
Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.
-
Offsite
—
About TalkBank: The goal of TalkBank is to foster fundamental research in the study of human and animal communication. It will construct sample databases within each of the subfields studying communication. It will use these databases to advance the development of standards and tools for creating, sharing, searching, and commenting upon primary materials via ...
-
Offsite
—
About VoxForge: VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac). We will make available all submitted audio files under the GPL license, and then ‘compile’ them into acoustic models for use with Open Source speech recognition engines such as Sphinx, ISIP, Julius and HTK (note: HTK ...
-
Offsite
—
Bulk.resource.org is a service of Public.Resource.Org. The system contains unsupported, as-is copies of selected U.S. government archives. These resources pertain to court information, with topics such as fiches and scans, cases, Courthouse News Service, the Federal Judicial Center, the JURIS database, requests for clarification, and video proceedings.
-
Free Download
—
A data set of over a million syllabi gathered by Dan Cohen’s Syllabus Finder tool from 2002 to 2009. It could be the largest collection of syllabi ever gathered, by several orders of magnitude.
See a more detailed description on Dan Cohen’s blog.
Format
Data are formatted as JSON records separated by newlines.
Caution: this data is messy and comes with no warranty.
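Because the records are newline-separated JSON and the data is described as messy, a reader will probably want to parse line by line and tolerate bad lines. The following is a minimal sketch under that assumption; the `title` field and the sample lines are hypothetical, since the actual record schema is not given here.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, skipping lines that fail to parse."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed line: skip it rather than abort the whole read
        if isinstance(record, dict):
            yield record


# Hypothetical sample standing in for the downloaded file.
sample = io.StringIO('{"title": "Intro"}\nnot json\n\n{"title": "Advanced"}\n')
records = list(iter_json_records(sample))
```

Reading lazily with a generator avoids loading the whole file into memory, which matters for a collection of over a million records.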