The path to scientific discovery: Distribution of labor, productivity and innovation in open science

The Polymath project is the first complete example of an original research paradigm defined by M. Nielsen as “Networked science”. Launched in 2009, the Polymath project is organized around a series of blogs, each of which starts with the proposition of a research question and continues through a discussion to which anybody can contribute simply by registering on the platform. The aim of this experiment in collaborative mathematics was to showcase a different way to conduct research, not only through cooperation, but through a massive and completely open collaboration between hundreds of people, ranging from renowned mathematicians to amateurs. Our paper investigates the collaboration and innovation dynamics of this unique mathematical experience.
Out of the 16 blogs launched so far, we collected and analyzed the posts of the five Polymath projects that concluded with a collective peer-reviewed publication. As in most other collaborative projects (like GitHub, Linux, Wikipedia), our study of Polymath reveals a clear labor hierarchy, with the users’ activity distributed according to a steep power law (with extremely similar slopes in the five projects) and high Gini indices. Secondly, we analyzed the productivity of the collaboration using multiple metrics. Figure 1, for example, shows that the number of posts grows super-linearly (with exponent 1.46) with the number of active users, in line with the old saying that 1+1>2. In collaborative science, the sheer number of participants, independently of their actual activity, is a driving force for the increase in global productivity.
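As an illustration of the two quantities mentioned above, the following minimal sketch computes a Gini index from per-user post counts and estimates a scaling exponent from a log-log least-squares fit. It is not the authors' pipeline: the function names and the sample numbers are hypothetical, and a plain least-squares fit is only one of several ways such an exponent could be estimated.

```python
import math


def gini(values):
    """Gini index of non-negative activity counts (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. b in y ~ a * x**b."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var


if __name__ == "__main__":
    # Hypothetical posts-per-user counts, not taken from the Polymath data.
    posts_per_user = [120, 45, 10, 3, 1, 1, 1]
    print("Gini index:", round(gini(posts_per_user), 3))

    # Hypothetical (active users, posts) pairs per time window.
    active_users = [2, 4, 8, 16, 32]
    posts = [5, 14, 40, 110, 300]
    print("Scaling exponent:", round(loglog_slope(active_users, posts), 2))
```

An exponent above 1 in such a fit corresponds to super-linear growth: doubling the number of active users more than doubles the number of posts.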
Considering the content of the posts in terms of the set of mathematical concepts they contain, we confirmed that all the projects follow the typical laws of texts: Zipf’s law and Heaps’ law. The exponents are the same across the different projects, suggesting the presence of a universal growth mechanism of the debate.
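To make the two laws concrete, the sketch below builds the raw curves on which they are usually checked: the rank-frequency list of concepts (Zipf) and the growth of the number of distinct concepts with the total number of concept occurrences (Heaps). It assumes each post has already been reduced to a list of concept labels; that representation, the function names and the toy data are assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def zipf_ranking(posts):
    """Return [(rank, frequency)] for concepts, most frequent first."""
    counts = Counter(concept for post in posts for concept in post)
    return [
        (rank, freq)
        for rank, (_, freq) in enumerate(counts.most_common(), start=1)
    ]


def heaps_curve(posts):
    """Return [(occurrences_so_far, distinct_concepts_so_far)] after each post."""
    seen = set()
    total = 0
    curve = []
    for post in posts:
        total += len(post)
        seen.update(post)
        curve.append((total, len(seen)))
    return curve


if __name__ == "__main__":
    # Toy posts in chronological order; the concept labels are made up.
    toy_posts = [
        ["prime", "gap"],
        ["prime", "sieve"],
        ["prime"],
    ]
    print(zipf_ranking(toy_posts))
    print(heaps_curve(toy_posts))
```

Zipf's law corresponds to frequency decaying as a power of rank, and Heaps' law to the number of distinct concepts growing as a sub-linear power of the total number of occurrences; both exponents can then be read off a log-log fit of these curves.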
Finally, we introduce a measure of innovation for the posts and show again that the number of participants in the debate is important to trigger innovation. Even if, in most of the projects, the largest part of innovation is produced by a handful of hyperactive users, the presence of “supporters” in the tail of the distribution is an important determinant of the overall scientific productivity.
Although it is focused on the very specific topic of number theory, the traceability of the Polymath project makes its blogs a unique terrain for studying the development of scientific discovery, and this work constitutes a first dive into the potential of this dataset.

Session: 
Authors: 
Floriana Gargiulo, Maria Castaldo and Tommaso Venturini
Room: 
4
Date: 
Friday, December 11, 2020 - 13:50 to 14:05
