Authorship in the Era of Big Data

A debate featuring some scientific heavy-hitters may shape the way we assign authorship for meta-analysis papers that include data from published and unpublished literature.


  1. Edit: Part II now available
  2. What do you have to do to be an author on a scientific paper? This question is critically important to our discipline. Authorship is the currency of academia. It determines whether we get jobs or grants, and serves as the primary means by which we communicate our accomplishments to the scientific community.

    However, it is sometimes difficult to determine who should be listed as an author on a given paper. In this era of Big Data, synthesis and meta-analysis papers can integrate the work of hundreds or thousands of individuals into a single study. These papers are important; they are often highly cited, and enable researchers to draw broad conclusions that would be impossible when relying on their own data alone.

    The optimal way to attribute credit for the data that go into a meta-analysis is controversial. Many meta-analysis papers list a small number of authors, usually just the people who synthesized the data, analyzed it, and wrote the paper itself. This can be frustrating for field biologists whose data appear in syntheses conducted by other researchers.

    Should those who actually collect the data in the field be listed as authors in meta-analyses? A controversy on this topic emerged on Twitter over the past few days among a slew of prolific scientists who study biodiversity and conservation. The story begins below:
  3. The mention of meta-analysis as a "gray area" quickly attracted the attention of a multitude of other researchers who weighed in with their opinions on the subject. 
  4. It was clear that there was a huge variety of opinions on this issue. While the conversation was (mostly) civil, there were stark differences between individuals' interpretations of how to attribute credit for work contributed to a project. People in the conversation identified a few problem factors. First, it is impractical to include all contributors to a meta-analysis as co-authors, simply because they could number in the hundreds or thousands depending on the size of the paper. Second, some journals prevent citation of all data sources due to space requirements. Dr. Bruno had a strong opinion on that.
  5. A third issue raised was that citations in meta-analysis papers sometimes get relegated to supplementary materials, where they do not count towards an author's citation count. A fourth was the distinction between published and unpublished data: Is there more obligation to offer authorship if the data incorporated in the analysis were not published elsewhere?

    As the conversation progressed, people started to bring together policies that already exist about authorship, data sharing, and attribution of credit in large projects.