
Experiments on the Zooniverse

Occasionally we run studies in collaboration with external researchers in order to better understand our community and improve our platform. These can involve methods such as A/B splits, where we show a slightly different version of the site to one group of volunteers and measure how it affects their participation; for example, does it influence how many classifications they make, or how likely they are to return to the project for subsequent sessions?
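To make the mechanics concrete, here is a minimal sketch of how volunteers might be assigned to cohorts in an A/B split. This is an illustration only, not the Zooniverse platform's actual implementation; the function and parameter names are hypothetical.

```python
import hashlib

def ab_cohort(user_id: str, experiment: str = "messaging-test") -> str:
    """Deterministically assign a volunteer to an A/B cohort.

    Hashing the user id together with an experiment name yields a
    stable, roughly 50/50 split without storing any extra state:
    the same volunteer always sees the same version of the site.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

print(ab_cohort("volunteer-42"))  # always the same cohort for this id
```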

One example of such a study was the messaging experiment we ran on Galaxy Zoo. We worked with researchers from Ben-Gurion University and Microsoft Research to test whether the specific content and timing of messages presented in the classification interface could help alleviate the issue of volunteers disengaging from the project. You can read more about that experiment and its results in this Galaxy Zoo blog post: https://blog.galaxyzoo.org/2018/07/12/galaxy-zoo-messaging-experiment-results/.

As the Zooniverse has teams based at different institutions in the UK and the USA, the procedure for ethics approval differs depending on who is leading the study. After recent discussions with staff at the University of Oxford ethics board, held to check that our procedure was up to date, our Oxford-based team will be changing the way in which we gain approval for, and report the completion of, these types of studies. All future study designs that involve Oxford staff in the analysis will be submitted to CUREC (the University's Central University Research Ethics Committee), something we have been doing for the last few years. From now on, once the data-gathering stage of a study has been run, we will provide all volunteers involved with a debrief message.

The debrief will explain to our volunteers that they have been involved in a study, along with providing information about the exact set-up of the study and what the research goals were. The most significant change is that, before the data analysis is conducted, we will contact all volunteers involved in the study and allow a period of time for them to state that they would like to withdraw their consent to the use of their data. We will then remove all data associated with any volunteer who does not wish to be involved before the data is analysed and the findings are presented. The debrief will also contain contact details for the researchers in the event of any concerns or complaints. You can see an example of such a debrief in our original post about the Galaxy Zoo messaging experiment here: https://blog.galaxyzoo.org/2015/08/10/messaging-test/.
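In practice, honouring a withdrawal amounts to filtering the study dataset before any analysis runs. A minimal sketch, with hypothetical record and field names:

```python
def remove_withdrawn(classifications, withdrawn_ids):
    """Return only the records from volunteers who did not withdraw.

    `classifications` is an iterable of dicts with a hypothetical
    "user_id" field; `withdrawn_ids` lists the volunteers who opted
    out during the debrief period. This filtering happens before any
    analysis is performed or findings are presented.
    """
    withdrawn = set(withdrawn_ids)
    return [c for c in classifications if c["user_id"] not in withdrawn]
```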

As always, our primary focus is the research enabled by our volunteer community on our individual projects. We run experiments like these in order to better understand how to create a more efficient and productive platform that benefits both our volunteers and the researchers we support. Every classification made by our volunteers contributes to the science outcomes of our projects, whether or not it was made as part of an A/B split experiment. We strive never to waste any volunteer's time or effort.

We thank you for all that you do, and for helping us learn how to build a better Zooniverse.

Studying the Impact of the Zooniverse

Below is a guest post from a researcher who has been studying the Zooniverse and who just published a paper called ‘Crowdsourced Science: Sociotechnical epistemology in the e-research paradigm’. That being a bit of a mouthful, I asked him to introduce himself and explain – Chris.

My name is David Watson and I’m a data scientist at Queen Mary University of London’s Centre for Translational Bioinformatics. As an MSc student at the Oxford Internet Institute back in 2015, I wrote my thesis on crowdsourcing in the natural sciences. I got in touch with several members of the Zooniverse team, who were kind enough to answer all my questions (I had quite a lot!) and even provide me with an invaluable dataset of aggregated transaction logs from 2014. Combining this information with publication data from a variety of sources, I examined the impact of crowdsourcing on knowledge production across the sciences.

Last week, the philosophy journal Synthese published a (significantly) revised version of my thesis, co-authored by my advisor Prof. Luciano Floridi. We found that Zooniverse projects not only processed far more observations than comparable studies conducted via more traditional methods—about an order of magnitude more data per study on average—but that the resultant papers vastly outperformed others by researchers using conventional means. Employing the formal tools of Bayesian confirmation theory along with statistical evidence from and about Zooniverse, we concluded that crowdsourced science is more reliable, scalable, and connective than alternative methods when certain common criteria are met.
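For readers unfamiliar with Bayesian confirmation theory, the core idea can be sketched in a couple of lines. The difference measure below is one standard way of quantifying confirmation; it is offered as an illustration, not necessarily the exact measure used in the paper:

```latex
% Evidence E confirms hypothesis H exactly when it raises H's probability:
%   E confirms H  iff  P(H | E) > P(H).
% A standard measure of the degree of confirmation is the difference
% measure, with the posterior P(H | E) given by Bayes' theorem:
\[
  c(H, E) = P(H \mid E) - P(H),
  \qquad
  P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}.
\]
```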

In a sense, this shouldn't really be news. We've known for over 200 years that groups are usually better than individuals at making accurate judgments (thanks, Marie Jean Antoine Nicolas de Caritat, aka Marquis de Condorcet!). The wisdom of crowds has been responsible for major breakthroughs in software development, event forecasting, and knowledge aggregation. Modern science has become increasingly dominated by large-scale projects that pool the labour and expertise of vast numbers of researchers.
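Condorcet's jury theorem is the classic formal statement of that intuition. In its simplest form (a sketch, assuming independent voters and an odd group size n):

```latex
% Requires amsmath for \binom.
% Condorcet's jury theorem: n independent voters, each correct with
% probability p > 1/2, decide a binary question by majority vote.
% The probability that the majority is correct is
\[
  P_n = \sum_{k=(n+1)/2}^{n} \binom{n}{k}\, p^{k} (1 - p)^{n-k},
\]
% and P_n increases with n, with P_n tending to 1 as n grows.
```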

We were surprised by several things in our research, however. First, the disparity in performance between publications by Zooniverse and those by other labs was greater than we expected. The plot below shows the distribution of citation percentiles, by year and data source, for articles from both groups. Statistical tests confirm what your eyes already suspect: it ain't even close.

[Figure: Influence of Zooniverse Articles — distribution of citation percentiles by year and data source.]
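As one example of the kind of test that can back up such a visual impression (the paper's own analysis is more thorough, and the numbers below are made up purely for illustration), a nonparametric comparison of two citation-percentile samples might look like this:

```python
from scipy.stats import mannwhitneyu

# Hypothetical citation percentiles, for illustration only.
zooniverse_pctl = [98.0, 91.5, 99.2, 87.4, 95.1, 93.8]
comparison_pctl = [62.3, 48.9, 71.0, 55.6, 66.2, 59.4]

# One-sided Mann-Whitney U test: are Zooniverse articles cited
# more highly than the comparison set?
stat, p_value = mannwhitneyu(zooniverse_pctl, comparison_pctl,
                             alternative="greater")
print(f"U = {stat:.1f}, p = {p_value:.4f}")
```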

We were also impressed by the networks that appear in Zooniverse projects, which allow users to confer with one another and direct expert attention toward particularly anomalous observations. In several instances this design has resulted in patterns of discovery, in which users flag rare data that go on to become the topic of new projects. This structural innovation indicates a difference not just of degree but of kind between so-called “big science” and crowdsourced e-research.

If you’re curious to learn more about our study of Zooniverse and the site’s implications for sociotechnical epistemology, check out our complete article.