Tag

language

  • Word List - 100,000+ Official Crossword Words (Excel readable)

    Free Download — A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically ordered Excel format, making it convenient for reference, spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of words: ...
  • Word List of 64,000+ Common English Dictionary Words (most with definitions, Excel format)

    Free Download — Over 64,000 common dictionary words: a list of words that appear in two or more published dictionaries. This gives the developer of a custom spelling checker a good starting pool of relatively common words.
  • Word List - 10,000+ Common Place Names

    Free Download — More than 10,000 U.S. place names. This U.S. place name list is available in a simple, alphabetically ordered .txt format, making it convenient for reference, spell-checking, or, in more sophisticated applications, for developers looking to build a custom location tool or database. The entries represent a sampling of U.S. place names: 10,196 places in total.
  • Word List 80,000+ Official Crossword Words (with most definitions, Excel format)

    Free Download — A list of over 80,000 words officially permitted in crossword games like Scrabble™, with some but not all of their definitions. The words are compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list includes variants of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary. It is a reference ...
  • Children Who Speak a Language Other Than English at Home: 2000 to 2004

    Free Download — The Statistical Abstract files are distributed by the U.S. Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • Word List - 100,000+ official crossword words (Excel readable)

    Free Download — 113,809 official crossword words. A list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Since this list has all forms of words (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary.
  • Language Spoken at Home - Cities of 100,000 or More: 2005

    Free Download — The Statistical Abstract files are distributed by the U.S. Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • Languages Spoken at Home by Language: 2005

    Free Download — The Statistical Abstract files are distributed by the U.S. Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • Word List - 350,000+ Simple English Words (Excel readable)

    Free Download — Over 354,000 single words, excluding proper names, acronyms, or compound words and phrases. This list does not exclude archaic words or significant variant spellings.
  • Foreign Language Enrollments in Public High Schools by Type of Language: 1970 to 2000

    Free Download — The Statistical Abstract files are distributed by the U.S. Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • Language Spoken at Home by State: 2005

    Free Download — The Statistical Abstract files are distributed by the U.S. Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • TalkBank

    Offsite — About TalkBank: The goal of TalkBank is to foster fundamental research in the study of human and animal communication. It will construct sample databases within each of the subfields studying communication. It will use these databases to advance the development of standards and tools for creating, sharing, searching, and commenting upon primary materials via ...
  • The Kids Open Dictionary Builder

    Offsite — From the creators: The purpose of this project is to create a free, open simple dictionary for students to use. The words in the dictionary will be reviewed for quality and appropriateness and ultimately “frozen” for export into a variety of formats, including text, PDF, ebooks, wikis, web, etc., for use on a variety of platforms. The site also includes a ...
  • FSI Language Courses

    Offsite — From website: Welcome to fsi-language-courses.com, the home for language courses developed by the Foreign Service Institute. These courses were developed by the United States government and are in the public domain. This site is dedicated to making these language courses freely available in an electronic format. This site is not affiliated in any way with ...
  • Dict.cc - English German Dictionary

    Offsite — From [about page](http://www.dict.cc/?s=about%3A): dict.cc is not only an online dictionary. It’s an attempt to create a platform where users from all over the world can share their knowledge in the field of translations. Every visitor can suggest new translations and correct or confirm other users’ suggestions. The challenging and most important part of the ...
  • The Speech Accent Archive

    Offsite — From website: The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read the same paragraph and are carefully transcribed. The archive is used by people who wish to compare and analyze the accents of different English speakers. On [about ...
  • ISO language, territory, currency codes and their translations

    Offsite — This is a set of ISO codes, including those for country and currency, collected together into a useful package by the Debian project. From the package page: This package provides the ISO-639 Language code list, the ISO-4217 currency list, the ISO-3166 Territory code list, and ISO-3166-2 sub-territory lists. It also (more importantly) provides their ...
  • Statistical Machine Translation - Europarl Parallel Corpus

    Offsite — Overview: The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 11 European languages: Romanic (French, Italian, Spanish, Portuguese), Germanic (English, Dutch, German, Danish, Swedish), Greek and Finnish. The goal of the extraction and processing was to generate sentence aligned text for ...
  • The DGT Multilingual Translation Memory of the Acquis Communautaire

    Offsite — As of November 2007, the European Commission’s Directorate-General for Translation (DGT) made publicly accessible its multilingual Translation Memory for the Acquis Communautaire (the body of EU law) – a collection of parallel texts (texts and their translation, also referred to as bi-texts) in 22 languages. This is a page for technical users, where you will find a ...
  • MOCHA-TIMIT

    Offsite — Authors: Alan Wrench, Queen Margaret University College. Funded by: Engineering and Physical Sciences Research Council. When created: November 1999. Purpose: Phonetically balanced dataset for training an automatic speech recognition system. Availability: English speakers available here free for non-commercial use and may be distributed on CDROM for a ...