7 datasets
  • Retrosheet: Game Logs (box scores) for Major League Baseball Games

    Offsite — Retrosheet baseball data for Major League games played from 1871 to 2008. Retrosheet provides a listing of the date and score of each game. Records may also include team statistics, winning and losing pitchers, line scores, attendance, starting pitchers, umpires and more. There are 161 fields in each record, described in more detail in the Guide to Retrosheet ...
  • AOL Search Data

    Free Download — The AOL Search Data is a collection of real query logs from real users. The data set consists of 20M web queries collected from 650k users over three months. These private searches are suited to research and data mining. The data is sorted by anonymous user ID and arranged sequentially. The collection can be used for personalization, query reformulation or ...
  • PigTutorial - Pig Wiki

    Offsite — Apache Pig is a platform for analyzing large data sets. Pig’s language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing. Pig Latin ...
  • Retrosheet MLB Park IDs

    Free Download — Most of the Retrosheet data uses a Park ID in place of the name of the field. This dataset resolves the park ID to a field name. Format: column headers are in the first row (PARKID|NAME|CITY|STATE|START DATE|END DATE|LEAGUE). License: the information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at ...
  • WSCD09: Workshop on Web Search Click Data 2009

  • Instructions for Obtaining Search Engine Transaction Logs

  • Log and Sine Table

    Free Download — This is a stylized Excel spreadsheet that gives the result of various mathematical functions applied to given real numbers (n) – such as the log (base 10), the natural log, π*n, n^2, or n^3 – as well as the sine, cosine and tangent of an angle (θ) in degrees and radians, and miscellaneous other functions, with the intention of being used for reference. This data is one ...
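
For the Retrosheet MLB Park IDs entry above, the listed layout (pipe-delimited, column headers in the first row) can be read with Python's standard `csv` module. This is a minimal sketch, not an official loader: the function name `load_park_names`, the sample row, and its date format are made-up placeholders for illustration, not real Retrosheet data.

```python
import csv
import io

# Hypothetical sample in the listed layout. The data row is a placeholder,
# not a real Retrosheet record, and the date format is an assumption.
SAMPLE = (
    "PARKID|NAME|CITY|STATE|START DATE|END DATE|LEAGUE\n"
    "XXX01|Example Park|Example City|ZZ|01/01/1900|12/31/1901|NL\n"
)


def load_park_names(fileobj):
    """Map each PARKID to its NAME, using the header row for column names."""
    reader = csv.DictReader(fileobj, delimiter="|")
    return {row["PARKID"]: row["NAME"] for row in reader}


parks = load_park_names(io.StringIO(SAMPLE))
print(parks["XXX01"])
```

With a downloaded copy of the file, the same function could be given an open file handle instead of the in-memory sample.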