The thing that I was particularly interested in was looking at differences among countries. For instance, do editors in some countries choose reviewers from a broader geographic region than others? Do editors choose reviewers from their hemisphere?
To start digging into this, we first need to do a bit of data munging. The data as provided by Publons is in an XML-like format, and since I am going to be working in R, the first thing I wanted to do was convert this to a matrix. The matrix is set up so that there is a row for every country with at least one handling editor and a column for every country that has performed at least one review. I should note that I have made a couple of changes from the original dataset: 1) the countries of the United Kingdom have been combined, 2) data associated with Hong Kong or Macau have likewise been combined with that for China, and 3) I've dropped a couple of countries that had very little data and are not widely recognized as independent states. Here is a link to the processed data: publons.csv.
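As a minimal sketch (assuming the processed CSV has editor countries as row names and reviewer countries as column names), loading it into a matrix looks like:

```r
# Assumed layout: rows = editor countries, columns = reviewer countries,
# cells = number of reviews for that editor-reviewer country pair
pub <- as.matrix(read.csv("publons.csv", row.names = 1, check.names = FALSE))
str(pub)
```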
Although we could look at this just by country, I wanted a few more dimensions to explore. Specifically, I was interested in the distance between pairs of countries. To get this, I constructed a 123x123 matrix that represents the distance between all countries. The diagonal of this matrix is the average distance within a country. Briefly, this value is estimated as the expected distance between two randomly drawn points in a circle with the same area as the country in question, which is 128r/(45π) for a circle of radius r. This assumption may bias us towards smaller within-country distances, since any non-circular country would have a larger expected distance. On the other hand, reviewers and editors are concentrated in cities, so perhaps these effects will balance out a bit. Here is a link to the country distance matrix: distances.csv.
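To make the diagonal concrete, here is a small helper computing that expected distance from a country's area (the area values themselves would come from an outside source):

```r
# Expected distance between two points drawn uniformly from a disk
# with the same area as the country: 128 * r / (45 * pi), r = sqrt(A / pi)
within_country_dist <- function(area) {
  r <- sqrt(area / pi)
  128 * r / (45 * pi)
}
within_country_dist(5e5)  # a mid-sized country of 500,000 km^2 -> ~361 km
```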
In addition to the distance matrix, I also wanted some measure of the number of articles being produced by scientists in each country. I found this data on the natureindex.com website. Specifically, I will be using the weighted fractional count (WFC). This metric takes into account the number of authors and the availability of data from different STEM fields. Although I would prefer the actual number of potential reviewers in each country, the WFC should be highly correlated with it and was easy to find. Here is a link to the WFC data, formatted to match the Publons data: wfc.csv. Next, we might simply check whether we are right that, in general, editors get more reviews from countries that are close by and have lots of scientists producing articles. To do this, let's divide the distance between each editor-reviewer pair by the WFC of the reviewing country and then plot this against the number of reviews produced.
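Hedged against the actual layout used in publons.R, that calculation might look something like this (the names here are my own; pub is the review matrix from above, and I assume the distance matrix shares country labels with it and that wfc.csv is ordered to match the Publons columns):

```r
dists <- as.matrix(read.csv("distances.csv", row.names = 1, check.names = FALSE))
wfc   <- read.csv("wfc.csv", row.names = 1)[, 1]  # assumed: one WFC per country

# scale each editor-reviewer distance by the reviewing country's WFC
scaled <- sweep(dists[rownames(pub), colnames(pub)], 2, wfc, "/")

nonzero <- as.vector(pub) > 0  # keep pairs with at least one review
plot(as.vector(scaled)[nonzero], as.vector(pub)[nonzero], log = "xy",
     xlab = "distance / reviewer WFC", ylab = "reviews")
```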
Naively, we might imagine that the number of editors in a country should predict the number of reviews coming from that country. Graphically, we can get at this by plotting the row sums vs. column sums of our Publons data matrix:
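A sketch of that plot (I add one to each count so that the zero-editor countries still appear on the log scale; publons.R may handle this differently):

```r
editors <- rowSums(pub)   # reviews initiated by editors in each country
reviews <- colSums(pub)   # reviews produced by each country

x <- editors[colnames(pub)]  # align on reviewer countries...
x[is.na(x)] <- 0             # ...countries with no editors get a zero

plot(x + 1, reviews + 1, log = "xy",
     xlab = "editor initiations + 1", ylab = "reviews produced + 1")
abline(0, 1)  # the one-to-one diagonal
```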
Countries that fall below the diagonal have more editor initiations, while those above it produce more reviews. It is important to note that the data are plotted on a log scale, so deviations from this line are arguably more striking as we move to the upper right of the plot. One interesting thing we can note right away is the cluster of countries on the far left of the graph. These are countries that had zero handling editors but are still producing reviews. In case you're curious, those countries are: UAE, Qatar, Georgia, Yemen, Tanzania, Slovakia, Nepal, Cameroon, Vietnam, Venezuela, Botswana, Zimbabwe, Philippines, Latvia, Bosnia & Herzegovina, Senegal, Papua New Guinea, Bahrain, Cambodia, Belarus, Zambia, Malawi, Kyrgyzstan, Armenia, and Sudan. The highest of these is Nepal, which is responsible for 48 reviews but no handling editors.
So my initial interest in this dataset was really about the countries that are producing lots of science but fall relatively far above or below this line. To look at this, the first thing we will do is slim the data down to countries with at least 100 editor-initiated reviews. Next, I've plotted the difference: reviews produced minus editor initiations.
So who are the countries at the ends of this distribution? At the negative end we find the U.S., followed by the U.K., Canada, and China. Meanwhile, the positive end is formed by Germany, Portugal, Italy, and Spain. These residuals might be a bit misleading, though. For instance, a 1% imbalance between reviewers and editors will be much larger for the U.K. than for New Zealand, simply due to the difference in the size of their research communities. One way we might control for this is by standardizing these residuals relative to Nature's WFC.
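A sketch of both residual calculations, reusing the counts from above (and again assuming the WFC vector lines up with the Publons columns and that every editor country also appears as a reviewer country):

```r
names(wfc) <- colnames(pub)             # assumed ordering from wfc.csv
busy <- names(editors)[editors >= 100]  # countries with >= 100 editor initiations

resid_raw <- reviews[busy] - editors[busy]  # reviews minus editor initiations
resid_std <- resid_raw / wfc[busy]          # standardized by research output
sort(resid_std)
```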
This certainly causes changes in the ordering of countries, but interestingly they aren't dramatic. For instance, the U.S., Canada, and the U.K. are still in the bottom 5. Likewise, Portugal, Italy, and Spain are in the top 6 under this ranking. These two plots suggest to me that there might be some real differences in the balance between handling editors and review production across countries.
If the differences are significant, then we should be able to recover something with an ANCOVA. My approach here is to fit a linear model:
reviews ~ distance + WFC
The ANCOVA lets us ask whether the slope associated with distance differs among editor countries (the dist:editor interaction below). If editors from a country are strongly biased towards choosing reviewers close to them, that country will have a more negative slope. On the other hand, if editors are biased towards picking reviewers from some distant country for some reason, the slope could even be positive (though the very first plot suggests this is unlikely). Here are the results from running the ANCOVA in R:
                Sum Sq   Df  F value   Pr(>F)
dist             41645    1   4.2957  0.03837
WFC             783958    1  80.8652  < 2e-16
dist:editor     723619   85   0.8781  0.77678
Residuals     15026658 1550
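For reference, here is roughly how a table like this could be produced, assuming a long-format data frame with one row per editor-reviewer country pair (the column names reviews, dist, WFC, and editor are my own guesses; Anova() from the car package gives the type-II layout above):

```r
library(car)  # for Anova()

# dist:editor lets the distance slope vary by editor country
fit <- lm(reviews ~ dist + WFC + dist:editor, data = pair_df)
Anova(fit)
```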
So this reveals that the distance slope does not appear to differ significantly among editor countries. One thing that might remain is that some countries could deviate strongly from our model due to some odd historical contingency. The one example I could think of to look at here is Commonwealth states. A good comparison might be the U.S. and Canada: what proportion of the reviews that editors in these countries receive come from the U.K. and Australia? As the plot below shows, we don't see any indication that this historical contingency has biased editors' choices of reviewers. In fact, the two are strikingly similar.
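The underlying proportions can be pulled straight from the matrix (the country labels here are guesses at those used in publons.csv):

```r
cw  <- c("United Kingdom", "Australia")  # assumed labels
eds <- c("United States", "Canada")      # assumed labels

# share of each editor country's reviews coming from the UK + Australia
rowSums(pub[eds, cw]) / rowSums(pub[eds, ])
```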
Obviously, everything that I did here could be affected by the uptake of the Publons peer review tracking site. If it has a systematic bias with regard to native language or geographic region it may have led me astray. Regardless, thanks to the kind people at Publons for allowing me to play with their data.
The code used to produce all of these plots and tests is here: publons.R.