This dataset lists the ~58k tweets that mentioned a scientific article (broadly speaking, anything with a DOI, PMID or arXiv ID) between the 1st and 31st of July 2011.

Recall isn’t 100%: my best estimate is that it’s missing another ~6k tweets where the article couldn’t be identified, the link was malformed, or the journal involved was new or got very low traffic.

Twitter’s TOS prohibit re-distribution of the tweets themselves but the dataset contains the extracted links, the tweet ID and some information about the tweeter (screen name, country & lat/lng derived from their location using Yahoo! Placemaker).

The links, pmids, dois and arxiv_ids columns can contain more than one value and are pipe (|) delimited.
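
As a rough illustration of handling those pipe-delimited columns, here is a minimal Python sketch. It assumes the data is a CSV file with a header row, which isn’t stated above; the tweet_id column name, the helper names (split_multi, read_rows) and the sample row are all made up for the example, and only the four multi-value column names come from the description.

```python
import csv
import io

# Column names taken from the dataset description above.
MULTI_VALUE_COLUMNS = ("links", "pmids", "dois", "arxiv_ids")


def split_multi(value):
    """Split a pipe-delimited cell into a list, dropping empty pieces."""
    return [part.strip() for part in (value or "").split("|") if part.strip()]


def read_rows(handle):
    """Yield each row as a dict, with the multi-value columns turned into lists."""
    for row in csv.DictReader(handle):
        for column in MULTI_VALUE_COLUMNS:
            if column in row:
                row[column] = split_multi(row[column])
        yield row


# Made-up sample data: the real file layout is an assumption (CSV with a header row).
sample = io.StringIO(
    "tweet_id,links,pmids,dois,arxiv_ids\n"
    "1,http://example.org/a|http://example.org/b,,10.1000/x1|10.1000/x2,\n"
)

rows = list(read_rows(sample))
for row in rows:
    print(row["tweet_id"], row["dois"])
```

For the real file you would swap the in-memory sample for something like open(path, newline="", encoding="utf-8"). Empty cells come back as empty lists, so tweets with no DOI or PMID don’t need special-casing.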

If you use this dataset please credit somewhere – doesn’t need to be a prominent link or a graphic or anything, some text tucked away on an about page will do!


Creative Commons BY

