Offsite — The arXiv Physics pre-print publishing corpus
Free Download — This dataset is a collection of Wikipedia infoboxes on the topic of arXiv.
Free Download — This dataset lists the ~58k tweets that mentioned a scientific article (broadly speaking, anything with a DOI, PMID or arXiv ID) between the 1st and 31st of July 2011. Recall isn’t 100%: my best estimate is that it’s missing another ~6k tweets where the article couldn’t be identified, the link was malformed, or the journal involved is new or gets very low traffic. ...