Tag

lexicon

11 datasets
  • Word List - 100,000+ Official Crossword Words (Excel readable)

    Free Download — A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. The list comes in a simple, alphabetically ordered Excel format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers building a custom spelling dictionary. The entries include variants of words: ...
  • Word List - 80,000+ Official Crossword Words (with most definitions, Excel format)

    Free Download — A list of over 80,000 words officially permitted in crossword games like Scrabble™, with some but not all of their definitions. The words are compatible with the first edition of the Official Scrabble Players Dictionary™. Because the list includes inflected variants of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary. It is a reference ...
  • Word List - 100,000+ official crossword words (Excel readable)

    Free Download — 113,809 official crosswords. A list of words permitted in crossword games such as Scrabble™, compatible with the first edition of the Official Scrabble Players Dictionary™. Because the list includes all inflected forms of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary.
  • WordNet

    Offsite — WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts ...
  • Word List - 1,000 Most Frequent Words from an Internet Corpus

    Free Download — This file consists of the 1,000 English words most frequently used on the Internet computer network in 1992.
  • Word List - 1,000 Most Frequently Used English Words by Frequency (with Definitions, Excel format)

    Free Download — This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.
  • Linguistic Data Consortium (LDC) - Collection of Linguistic Corpora and Datasets

    Offsite — The Linguistic Data Consortium is an open consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes. The University of Pennsylvania is the LDC’s host institution. The LDC was founded in 1992 with a grant from the Advanced ...
  • Word List - 1,000+ Most Frequent Words in King James Bible

    Free Download — 1,185 King James Version frequent substrings (KJVfreq.txt). The 1,185 most frequently occurring substrings in the King James Version Bible, ranked and counted in order of frequency.
  • Word List - Official Scrabble (TM) Player's Dictionary (OSPD) 2nd ed (with Definitions, Excel format)

    Free Download — 4,160 official crosswords delta (crswd-d.txt). When combined with the 113,809 crosswords file, it produces the official crossword list compatible with the second edition of the Official Scrabble Players Dictionary. (Scrabble is a registered ...
  • Word List - Official Scrabble (TM) Player's Dictionary (OSPD) 2nd ed

    Free Download — 4,160 official crosswords delta (crswd-d.txt). When combined with the 113,809 crosswords file, it produces the official crossword list compatible with the second edition of the Official Scrabble Players Dictionary. (Scrabble is a registered trademark of Milton Bradley licensed to Merriam-Webster.)
  • Google Labs - Books Ngram Viewer

    Offsite — Here are the datasets backing the Google Books Ngram Viewer. These datasets were generated in July 2009; we will update these datasets as our book scanning continues, and the updated versions will have distinct and persistent version identifiers (20090715 for the current set). Each of the links below will directly download a fragment of the given corpus. For instance, ...
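
The OSPD 2nd ed entries above say that the 4,160-word delta (crswd-d.txt) is meant to be combined with the 113,809-word crosswords file. A minimal sketch of that merge follows, assuming both files are plain text with one word per line and that the delta contains only additions. The base file name used in the usage note, and the function names, are hypothetical and not taken from the listing.

```python
from itertools import chain
from typing import Iterable, List


def merge_word_lists(base: Iterable[str], delta: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated union of two word lists.

    Assumes one word per line; blank lines are skipped and words are
    lower-cased so that case differences do not create duplicates.
    """
    words = set()
    for line in chain(base, delta):
        word = line.strip().lower()
        if word:
            words.add(word)
    return sorted(words)


def merge_word_files(base_path: str, delta_path: str, out_path: str) -> int:
    """Merge two word-list files into out_path; return the word count.

    The paths are supplied by the caller; nothing here depends on the
    actual names of the downloaded files.
    """
    with open(base_path, encoding="utf-8") as base, \
         open(delta_path, encoding="utf-8") as delta:
        merged = merge_word_lists(base, delta)
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("\n".join(merged) + "\n")
    return len(merged)
```

A call such as `merge_word_files("crosswd.txt", "crswd-d.txt", "ospd2.txt")` would write the combined list, where `crosswd.txt` and `ospd2.txt` are placeholder names. If the two files share no words, the result should have 113,809 + 4,160 entries; a smaller count would indicate overlap between the lists.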