Tag: words

Showing 1 - 20 out of 35 datasets
  • Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (monthly)

    Free Download — Twitter data from millions of tweets! This is a download of Twitter data from March 2006 to November 2009. The data set consists of “tokens,” which are hashtags (#data), URLs, or emoticons (Twitter smileys or other “faces” created using keyboard characters). The data comes from analysis on the full set of tweets during that time period, which is 35 million users, over ...
  • Twitter Census - Conversation Metrics: One Year of URLs, Hashtags, Smileys Usage (by Hour)

    Free Download — Twitter data from millions of tweets! This is a download of Twitter data from March 2006 to November 2009. The data set consists of “tokens,” which are hashtags (#data), URLs, or emoticons (Twitter smileys or other “faces” created using keyboard characters). The data comes from analysis on the full set of tweets during that time period, which is 40 million users, 1.6 ...
  • Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (Smiley Counts)

    Free Download — Twitter smiley data from millions of tweets! This is a free download of Twitter data from March 2006 to November 2009. The smiley data comes from analysis on the full set of tweets during that time period, which is 35 million users, over 500 ...
  • Word List - 100,000 + Official Crossword Words (Excel readable)

    Free Download — A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically-ordered Excel format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of words: ...
  • Word List of 64,000+ Common English Dictionary Words (most with definitions, Excel format)

    Free Download — Over 64,000 common dictionary words: a list of words common to two or more published dictionaries. This gives the developer of a custom spelling checker a good starting pool of relatively common words.
  • Word List - 10,000+ Common Place Names

    Free Download — More than 10,000 U.S. place names. This U.S. place name list is available in a simple, alphabetically-ordered .txt format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers looking to build a custom location tool or database. The entries represent a sampling of U.S. place names: 10,196 places in total.
  • Word List - 100,000+ official crossword words (Excel readable)

    Free Download — 113,809 official crossword words: a list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Because this list includes all word forms (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary.
  • WordNet

    Offsite — WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts ...
  • A list of all 22,802 words in the Scribblenauts dictionary.

    Free Download — List of summonable objects from the Nintendo DS game Scribblenauts, from AARDVARK, ABOMINABLE SNOWMAN and ABSCONDER to ZOMBIE, ZUNICERATOPS and ZYGOTE. From the Scribblenauts Wikipedia entry: Scribblenauts is an emergent puzzle action video game with the tagline “Write Anything, Solve Everything”. Its objective is to complete puzzles by summoning any object (from a ...
  • Word List - 350,000+ Simple English Words (Excel readable)

    Free Download — Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.
  • AOL Search Data

    Free Download — The AOL Search Data is a collection of real query log data that is based on real users. The data set consists of 20M web queries collected from 650k users over three months. These private searches are perfect for research and mining. The data is sorted by anonymous user ID and sequentially arranged. The collection can be used for personalization, query reformulation or ...
  • List of Dirty, Obscene, Banned and otherwise unacceptable words

    Free Download — A banned word list representing a collection of many lists from around the web of words considered socially unacceptable for one reason or another. What to do with a banned word list? Use this dirty word list to screen for spammers and griefers; to censor dissidents; to better understand the semiotic role of taboo signifiers in an online modality; to monitor user ...
  • 80 Million Tiny Images

    Offsite — Visual dictionary presents a visualization of all the nouns in the English language arranged by semantic meaning. Each of the tiles in the mosaic is an arithmetic average of images relating to one of 53,464 nouns. The images for each word were obtained using Google’s Image Search and other engines. A total of 7,527,697 images were used, each tile being the average of 140 ...
  • A Million Syllabi

    Free Download — A data set of over a million syllabi gathered by Dan Cohen’s Syllabus Finder tool from 2002 to 2009. It could be the largest collection of syllabi ever gathered by several orders of magnitude. See a more detailed description on Dan Cohen’s blog. Format: data are formatted as JSON records separated by newlines. Caution: this data is messy and comes with no warranty.
  • Password Dictionary

    Offsite — A list of 1,717,680 passwords. Useful for verifying whether or not users are displaying good password hygiene.
  • Word List - 250,000+ Hyphenated, Capitalized and Compound English words

    Free Download — A common word list with over 250,000 entries of hyphenated, capitalized and compound English words. The download consists of entries containing more than one word, as well as capitalized words and acronyms. Phrases are considered “common” if they or variations of them occur in a standard dictionary or thesaurus. This word list is available in a simple, ...
  • Word Lists Collection

    Offsite — The data is a smorgasbord of word lists, including spell check oriented word lists, an inflection database, parts of speech word list, jargon file word lists, the contents from Ispell, spell check dictionaries, tables that convert between American, British and Canadian spellings, and links to several other word lists.
  • Word List - 1000 Most Frequent Words from an Internet Corpus

    Free Download — This file consists of the 1,000 English words most frequently used on the Internet in 1992.
  • Word List - 350,000+ Simple English Words (with some Definitions, Excel format)

    Free Download — Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. Some, but not all, of the words have definitions. This list does not exclude archaic words or significant variant spellings.
  • Word List - 1,000 Most Frequently Used English Words by Frequency (with Definitions, Excel format)

    Free Download — This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.
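
The A Million Syllabi entry above describes its data as JSON records separated by newlines and warns that the data is messy. The following is a minimal sketch of one way to read such a file while tolerating bad lines. The field names in the sample (`title`, `year`) are invented for illustration and are not taken from the actual data set, and skipping unparseable lines is an assumption about how one might want to handle the mess, not a documented recommendation.

```python
import json
from typing import Iterable, Iterator


def iter_json_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line, skipping bad lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # messy data: skip the line rather than abort
        if isinstance(record, dict):
            yield record


# Hypothetical sample lines; the real records may look quite different.
sample = [
    '{"title": "Intro to Statistics", "year": 2005}',
    "not json at all",
    "",
    '{"title": "Medieval History", "year": 2008}',
]
records = list(iter_json_records(sample))
```

With a real download, the same function can be given an open file handle instead of a list, for example `open(path, encoding="utf-8", errors="replace")`, where `path` is wherever the file was saved.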