Dataset

Given Name Frequency Project

Added By Infochimps

Quite a bit of data is available for download, but only as individual files (not in a single file). According to the project's web page, the following are available:

> * GINAP – code to standardize given names and correct common problems in name samples. Such standardization is an important step in analysis of given names.
> * Popular given names, US 1801 to 1999 – a collection of sets of standardized female and male names by decade, with counts of occurrences for names with more than 10 occurrences in the samples.
> * Samples of names from England before 1800 – name samples from a diverse set of sources, with raw and standardized names available.
> * Popularity of the name Mary over the past 800 years. For discussion of related sixteenth-century English history, Shakespeare family history, and Maria and Olivia in Twelfth Night, see Sense in Communication (Section IV, pp. 82-112).
> * Sample of cotton workers in Manchester, 1818-19 – relatively rich dataset used to study the development of the early factory workforce. Also a source of personal names.
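
Because the per-decade name sets are only available as separate downloads, it may help to merge them into one table before analysis. The sketch below is illustrative only. It assumes each downloaded file is a two-column `name,count` CSV, which the project page does not confirm. The function name `combine_decade_samples` and the sample values are placeholders invented for this example, not project data.

```python
import csv
import io


def combine_decade_samples(samples):
    """Merge per-decade name counts into one list of rows.

    `samples` maps (decade, sex) to CSV text with `name,count` rows.
    This layout is an assumption, not the project's documented format.
    """
    rows = []
    for (decade, sex), text in sorted(samples.items()):
        for record in csv.reader(io.StringIO(text)):
            if len(record) != 2:
                continue  # skip blank or malformed lines
            name, count = record[0].strip(), record[1].strip()
            if not count.isdigit():
                continue  # skip a header row such as "name,count"
            rows.append(
                {"decade": decade, "sex": sex, "name": name, "count": int(count)}
            )
    return rows


# Placeholder input standing in for the text of two downloaded files.
example = {
    (1880, "F"): "name,count\nMary,120\nAnna,45\n",
    (1880, "M"): "John,98\n",
}
combined = combine_decade_samples(example)
```

Each row carries its decade and sex, so the combined list can be written out as a single file or filtered by name across decades.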