4 datasets
  • The arXiv on your harddrive

  • The arXiv in your pocket - Downloadable Physics Pre-Print Archive

    Offsite — The arXiv Physics pre-print publishing corpus
  • Arxiv

    Free Download — This dataset consists of a collection of Infoboxes from Wikipedia on the topic of arXiv.
  • Tweets linking to scientific papers - Jul 2011

    Free Download — This dataset lists the ~58k tweets that mentioned a scientific article (broadly speaking, anything with a DOI, PMID or arXiv ID) between the 1st and 31st of July 2011. Recall isn't 100%: my best estimate is that it's missing another ~6k tweets where the article couldn't be identified, the link was malformed, or the journal involved is new or gets very low traffic. ...