Tag

duplicate

4 datasets
  • Connexions

    Offsite — Is this not a duplicate of the existing connexions package? “Connexions is a place to view and share educational material made of small knowledge chunks called modules that can be organized as courses, books, reports, etc.” Creative Commons Attribution License.
  • LSDIS : SwetoDblp

    Offsite — SwetoDblp is a large ontology (a spin-off of the SWETO ontology) focused on bibliographic data for Computer Science publications, with DBLP (Digital Bibliography & Library Project) as its main data source. SwetoDblp was created from a large XML document available at DBLP’s website and other datasets that are used to add relationships to other entities such as Publishers, ...
  • Vaccines: IIS/Tech/Deduplication Test Cases

    Offsite — NIP (now called the National Center for Immunization and Respiratory Diseases) developed a toolkit to assist immunization information systems (IIS) in the evaluation of their deduplication algorithms. This toolkit helps registries assess their system’s ability to prevent/remove duplicate records. The data and procedures in this toolkit can help identify strengths and ...
  • Duplicate Detection, Record Linkage, and Identity Uncertainty: Datasets

    Offsite — The following datasets have been provided for evaluating duplicate detection, record linkage, and identity uncertainty systems. Several of these are not yet available for downloading; please contact the authors. The datasets include a segmented citation dataset based on the Cora research paper search engine, a collection of 864 restaurant records from the Fodor’s and ...