Tag: retrieval

2 datasets
  • Document Metadata Based on a Sample of Web Documents from the Open Directory

    Offsite — DMOZ100k06 is a large research data set of document metadata, based on a random sample of 100,000 web documents from the Open Directory combined with data retrieved from the social bookmarking service delicious.com, the content rating system ICRA, and the search engine Google. The data set is freely available for other research. Author: Michael G. Noll
  • TREC-9 Filtering Track Collections - MEDLINE Extract with Relevance Measures

    Offsite — The OHSUMED test collection is a set of 348,566 references from MEDLINE, the on-line medical information database, consisting of titles and/or abstracts from 270 medical journals over a five-year period (1987-1991). The available fields are title, abstract, MeSH indexing terms, author, source, and publication type. License: The National Library of Medicine has agreed to ...