Category

Technology

  • Digital Element IP Intelligence Geolocation

    No Data — The service for this API is available to customers of the Infochimps Cloud, the fastest way to develop and deploy Big Data applications in public, virtual private and private clouds. About Digital Element: Digital Element is the premier supplier for IP geolocation data. Their data is used by a number of large internet companies, including Ask.com, AOL, and ...
  • Digital Element IP Intelligence Domains

    No Data — The service for this API is available to customers of the Infochimps Cloud, the fastest way to develop and deploy Big Data applications in public, virtual private and private clouds. About Digital Element: Digital Element is the premier supplier for IP geolocation data. Their data is used by a number of large internet companies, including Ask.com, AOL, and ...
  • Digital Element IP Intelligence Demographics

    No Data — The service for this API is available to customers of the Infochimps Cloud, the fastest way to develop and deploy Big Data applications in public, virtual private and private clouds. About Digital Element: Digital Element is the premier supplier for IP geolocation data. Their data is used by a number of large internet companies, including Ask.com, AOL, and ...
  • Twitter Census :: Developer Tools - Mapping from Twitter User Search ID to Twitter API IDs

    Free Download — Twitter data from millions of tweets! This is a download of Twitter data from March 2006 to November 2009. The data comes from analysis on the full set of tweets during that time period, which is 35 million users, over 500 million tweets, and more than 1 billion relationships between users. This dataset maps Twitter screen names to a user’s corresponding Twitter API ID ...
  • Enron Email Dataset

    Offsite — From the CALO Project at Carnegie Mellon University: a massive dataset of emails recovered from discovery documents in the Enron trials. About: This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a ...
  • Twitter Census - Conversation Metrics: One Year of URLs, Hashtags, Smileys Usage (by Hour)

    Free Download — Twitter data from millions of tweets! This is a download of Twitter data from March 2006 to November 2009. The data set consists of “tokens,” which are hashtags (#data), URLs, or emoticons (Twitter smileys or other “faces” created using keyboard characters). The data comes from analysis on the full set of tweets during that time period, which is 40 million users, 1.6 ...
  • Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (Smiley Counts)

    Free Download — Twitter smiley data from millions of tweets! This is a free download of Twitter data from March 2006 to November 2009. The smiley data comes from analysis on the full set of tweets during that time period, which is 35 million users, over 500 ...
  • Twitter Census - Conversation Metrics: One year of URLs, Hashtags, Smileys usage (monthly)

    Free Download — Twitter data from millions of tweets! This is a download of Twitter data from March 2006 to November 2009. The data set consists of “tokens,” which are hashtags (#data), URLs, or emoticons (Twitter smileys or other “faces” created using keyboard characters). The data comes from analysis on the full set of tweets during that time period, which is 35 million users, over ...
  • Word List - 100,000 + Official Crossword Words (Excel readable)

    Free Download — A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically ordered Excel format, making it convenient for reference, spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of words: ...
  • Document Metadata Based on a Sample of Web Documents from the Open Directory

    Offsite — DMOZ100k06 is a large research data set about document metadata based on a random sample of 100,000 web documents from the Open Directory combined with data retrieved from the social bookmarking service delicious.com, the content rating system ICRA, and the search engine Google. The data set is freely available for other research. Michael G. Noll
  • AOL Search Data

    Free Download — The AOL Search Data is a collection of real query log data that is based on real users. The data set consists of 20M web queries collected from 650k users over three months. These private searches are perfect for research and mining. The data is sorted by anonymous user ID and sequentially arranged. The collection can be used for personalization, query reformulation or ...
  • AOL Search Data Mirrors

    Offsite — This collection consists of ~20M web queries collected from ~650k users over three months. The data is sorted by anonymous user ID and sequentially arranged. The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research. From AOL’s original Read-Me ...
  • A list of all 22,802 words in the Scribblenauts dictionary.

    Free Download — List of summonable objects from the Nintendo DS game Scribblenauts, from AARDVARK, ABOMINABLE SNOWMAN and ABSCONDER to ZOMBIE, ZUNICERATOPS and ZYGOTE. Via the Scribblenauts Wikipedia entry: Scribblenauts is an emergent puzzle action video game with the tagline “Write Anything, Solve Everything”. Its objective is to complete puzzles by summoning any object (from a ...
  • Twitter Census: Trst Rank

    Free Download — The service for this API has ceased. Our apologies for the inconvenience this may cause. You can find a download of the data set for this API on this page. Twitter influence metrics with the click of a button! Trstrank measures Twitter user reputation, importance and influence in a way far more robust than counting the number of followers. It is a sophisticated measure ...
  • Marvel Universe Social Graph

    Free Download — A fun Marvel Comics character collaboration graph constructed by Cesc Rosselló, Ricardo Alberich, and Joe Miro from the University of the Balearic Islands. The Marvel Universe, that is, the artificial world that takes place in the universe of the Marvel comic books, is an example of a social collaboration network. They compare the characteristics of this universe to ...
  • Twitter Census: Influence Metrics

    No Data — The service for this API has ceased. Our apologies for the inconvenience this may cause. Twitter influence – how to measure it? Let us count the ways: enthusiasm, feedness, sway, follow churn, trstrank, followers, outflux, interesting, chattiness, follow rate, influx… How Does The Twitter Influence API Work? The Twitter Influence API performs a “scrape” of the social ...
  • Adult Computer and Adult Internet Users, by Selected Characteristics: 1995 to 2006

    Free Download — The Statistical Abstract files are distributed by the US Census Bureau as Microsoft Excel files. These files have data mixed with notes and references, multiple tables per sheet, and, worst of all, the table headers are not easily matched to their rows and columns. A few files had extraneous characters in the title. These were corrected to be consistent. A few files ...
  • U.S. Government Photos and Graphics

    Offsite — About: Collection of links to different US image collections in the public domain. Openness: Most of these images and graphics are available for use in the public domain, and they may be used and reproduced without permission or fee. However, some images may be protected by license. We strongly recommend you thoroughly read the disclaimers on each site before use. ...
  • Measuring Worth: Interest Rates - US, UK, China, Japan

    Offsite — The mission of the site is to make available to the public the highest quality and most reliable historical data on important economic aggregates, with particular emphasis on nominal measures. The data have been created using the highest standards of the fields of economics and history and are rigorously refereed by the most distinguished researchers in the fields. ...
  • Disasters worldwide from 1900-2008

    Free Download — Disaster data from 1900 – 2008, organized by start and end date, country (and sub-location), disaster type (and sub-type), disaster name, cost, and persons killed and affected by the disaster. Create disaster data trend reporting, based on geography, frequency, date or nature of the event. Design a visualization or time lapse illustrating disaster events around the ...