Researchers working to improve participant learning through Zooniverse

Our research group at Syracuse University spends a lot of time trying to understand how participants master tasks given the constraints they face. We conducted two studies as part of a U.S. National Science Foundation grant to build Gravity Spy, one of the most advanced citizen science projects to date. We started with two questions: 1) How best to guide participants through learning many classes? 2) What types of interactions do participants have that lead to enhanced learning? Our goal was to improve experiences on the project. Like most internet sites, Zooniverse periodically tries different versions of the site or task and monitors how participants do.

We conducted two Gravity Spy experiments (the results were published via open access: article 1 and article 2). As in other Zooniverse projects, Gravity Spy participants supply judgments on an image subject, noting which class the subject belongs to. Participants also have access to learning resources such as the field guide, about pages, and ‘Talk’ discussion forums. In Gravity Spy, we ask participants to review spectrograms to determine whether a glitch (i.e., noise) is present. The participant classifications are supplied to astrophysicists who are searching for gravitational waves. The classifications help isolate glitches from valid gravitational-wave signals.

Gravity Spy combines human and machine learning components to help astrophysicists search for gravitational waves. Gravity Spy uses machine learning algorithms to determine the likelihood of a glitch belonging to a particular glitch class (currently, 22 known glitches appear in the data stream); the output is a percentage likelihood of being in each category.

Figure 1. The classification interface for a high level in Gravity Spy

Gradual introduction to tasks increases accuracy and retention

The literature on human learning is unclear about how many classes people can learn at once. Showing too many glitch class options might discourage participants because the task may seem too daunting, so we wanted to develop training that still allowed them to make useful contributions. We decided to implement and test leveling, where participants gradually learn to identify glitch classes across different workflows. In Level 1, participants see only 2 glitch class options; in Level 2, they see 6; in Level 3, they see 10; and in Level 4, they see all 22. We also used the machine learning results to route more straightforward glitches to lower levels and more ambiguous subjects to higher levels. Participants in Level 1 therefore saw only subjects that the algorithm was confident a participant could categorize accurately. When the percentage likelihood was low (meaning the classification task was more difficult), we routed the subjects to higher levels.
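The routing idea can be sketched in a few lines of Python. This is a minimal illustration, not the Gravity Spy implementation: the confidence cut-offs in `MIN_CONFIDENCE` and the function name `route_to_level` are hypothetical, and only the number of class options per level comes from the description above.

```python
from typing import Mapping

# Number of glitch class options shown at each level (from the text above).
CLASSES_PER_LEVEL = {1: 2, 2: 6, 3: 10, 4: 22}

# Hypothetical cut-offs, for illustration only: a subject is sent to the
# lowest level whose cut-off is met by its highest machine learning score.
MIN_CONFIDENCE = {1: 0.95, 2: 0.85, 3: 0.70, 4: 0.0}


def route_to_level(class_scores: Mapping[str, float]) -> int:
    """Return the level a subject is routed to, given its per-class likelihoods."""
    top_score = max(class_scores.values())
    for level in sorted(MIN_CONFIDENCE):
        if top_score >= MIN_CONFIDENCE[level]:
            return level
    return max(MIN_CONFIDENCE)  # unreachable while Level 4 accepts everything


# A confident prediction goes to Level 1, an ambiguous one to Level 4.
easy = route_to_level({"Blip": 0.98, "Whistle": 0.02})
hard = route_to_level({"Blip": 0.50, "Whistle": 0.30, "Koi Fish": 0.20})
```

With these assumed cut-offs, the most ambiguous subjects always land in Level 4, where participants see all 22 options.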

We ran an experiment to determine what this gradual introduction to the classification task meant for participants. One group of participants was funneled through the training described above (we called it machine learning guided training, or MLGT); another group was given all 22 classes at once. Here’s what we found:

  • Participants who completed MLGT were more accurate than participants who did not receive the MLGT (90% vs. 54%).  
  • Participants who completed MLGT executed more classifications than participants who did not receive the MLGT (228 vs. 121 classifications).
  • Participants who completed MLGT had more sessions than participants who did not receive the MLGT (2.5 vs. 2 sessions). 

The usefulness of resources changes as tasks become more challenging

Anecdotally, we know that participants contribute valuable information on the discussion boards, which is beneficial for learning. We were curious about how participants navigated all the information resources on the site and whether those resources improved their classification accuracy. Our goal was to (1) identify learning engagements, and (2) determine whether those learning engagements led to increased accuracy. We turned on analytics data collection and mined these data to determine which types of interactions (e.g., posting comments, opening the field guide, creating collections) improved accuracy. We conducted a quasi-experiment at each workflow, isolating the gold standard data (i.e., the subjects with a known glitch class). We looked at each occasion on which a participant classified a gold standard subject incorrectly (Classification A) and identified the actions the participant took between that classification and their next classification of the same glitch class (Classification B). We then ran a statistical analysis of which actions were associated with improved accuracy, and the results were striking. Here’s what we found:

  • In Level 1, no learning actions were significant. We suspect this is because the tutorial and other materials created by the science team are comprehensive, and most people are accurate in workflow 1 (~97%).
  • In Level 2 and Level 3, collections, favoriting subjects, and the search function were most valuable for improving accuracy. Here, participants’ agency seems to help them learn. Anecdotally, we know people collect and learn from ambiguous subjects.
  • In Level 4, we found that actions such as posting comments and viewing the collections created by other participants were most valuable for improving accuracy. Since the most challenging glitches are administered in workflow 4, participants seek feedback from others.

The one-line summary of this experiment is that when tasks are more straightforward, learning resources created by the science teams are most valuable; however, as tasks become more challenging, learning is better supported by the community of participants through the discussion boards and collections. Our next challenge is making these types of learning engagements visible to participants.
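The mining step in this experiment (finding what a participant did between an incorrect gold standard classification and their next classification of the same glitch class) can be sketched as follows. This is an illustration under assumptions, not our analysis code: the `Event` record, its field names, the action labels, and the function `learning_windows` are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    """One logged event for a single participant (hypothetical schema)."""
    time: int                          # ordering key, e.g. a timestamp
    kind: str                          # "classification" or an action, e.g. "comment"
    gold_label: Optional[str] = None   # known glitch class (gold standard only)
    answer: Optional[str] = None       # class the participant chose


def learning_windows(events: List[Event]) -> List[dict]:
    """Pair each incorrect gold standard classification (A) with the next gold
    standard classification of the same glitch class (B), and list the
    non-classification actions that happened in between."""
    ordered = sorted(events, key=lambda e: e.time)
    windows = []
    for i, a in enumerate(ordered):
        is_gold = a.kind == "classification" and a.gold_label is not None
        if not is_gold or a.answer == a.gold_label:
            continue  # only incorrect gold standard classifications start a window
        actions = []
        for b in ordered[i + 1:]:
            if b.kind != "classification":
                actions.append(b.kind)
            elif b.gold_label == a.gold_label:
                windows.append({
                    "glitch_class": a.gold_label,
                    "actions": actions,
                    "improved": b.answer == b.gold_label,
                })
                break  # windows with no Classification B are dropped
    return windows


example = [
    Event(1, "classification", gold_label="Blip", answer="Whistle"),   # A: wrong
    Event(2, "field_guide"),
    Event(3, "classification", answer="Koi Fish"),                     # not gold
    Event(4, "comment"),
    Event(5, "classification", gold_label="Blip", answer="Blip"),      # B: right
]
windows = learning_windows(example)
```

Each resulting window records which actions sat between Classification A and Classification B and whether B was correct, which is the shape of data a statistical test of "which actions go with improvement" would need.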

Note: We would like to thank the thousands of Gravity Spy participants without whom this research would not be possible. This work was supported by U.S. National Science Foundation grants No. 1713424 and 1547880. Check out Citizen Science Research at Syracuse for more about our work.

Supernova Hunters and Nine Lessons for Curious People

At the weekend, a bunch of us had fun with a timely challenge – trying to find and follow up supernovae with Supernova Hunters as part of the Nine Lessons and Carols for Curious People 24-hour science/music/comedy show organised by Robin Ince and the Cosmic Shambles Network in support of various good causes. Robin and Brian Cox normally run a huge show at the Hammersmith Apollo theatre at this time of year, but this socially distant, marathon show was a suitable replacement.

Robin and musician Steve Pretty somewhere in the middle of the 24 and a bit hour long show – they were on stage throughout! Credit:

In the run-up to the show there was some concern that poor weather in Hawai’i – where the PanSTARRS telescope that provides data for Supernova Hunters is located – might prevent us getting enough data, but in the event skies were clear. Very clear. That caused a problem, as the extra data took a while to get to the servers at Queen’s University Belfast and from there to us, but thanks to heroic efforts from the Supernova Hunters team, I was able to zoom into the show early on and point viewers to the site, and classifications started to flow in.

Supernova hunting is a competitive sport these days, and though the early results from volunteers were encouraging, most of what we found was either too faint to make follow-up easy with the telescopes we had on standby or had already been identified by other surveys (including the Zooniverse’s friends at ZTF). A brief reappearance on the Nine Lessons big screen (and an email to existing volunteers asking for help) later and we finally had a set of good candidates.

Liverpool Telescope in the Canary Islands, which was responsible for our first follow-up observations. Credit: Liverpool Telescope.

The team – especially Ken Smith and Darryl Wright – worked overnight to arrange follow-up. When I emerged from a few hours’ sleep, observers at the Liverpool Telescope had checked out our most promising candidate – but it turned out to be not a supernova but a less extreme cosmic explosion known as a cataclysmic variable. I marvelled at the fact Robin was still awake – and was coherently interviewing cosmologists, brain scientists and the odd astronaut – and gave an update.

Just after I finished, Belfast’s Ken Smith popped up with the news that observers in Hawai’i using the SNIFS instrument had followed up other targets – and one of them was a real supernova! Better, it was a type 1a – the kind of supernova that can be used to measure the expansion rate of the Universe. Admittedly it was a type 1a-91bg, a rarer type of supernova which is fainter than a normal type 1a, but still useful, and this gave us a payoff for the show.

Spectrum confirming our candidate is a SN1a-91bg associated with a galaxy at redshift z=0.061 – light from an explosion that happened nearly a billion years ago.

Using only that supernova, a bit of maths on the back of an envelope and a few fairly shaky assumptions, we calculated that the Universe was 12.8 billion years old, about a billion short of the commonly accepted value. I wouldn’t throw out the careful systematic analysis of populations of supernovae for this simple calculation – but we did get to announce to a bleary-eyed comedian that the Universe might be (a little bit) younger than expected.
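For the curious, here is one way such an envelope calculation hangs together. It is a sketch under stated assumptions and not necessarily the sums done during the show: it assumes the age of the Universe is simply 1/H0 (constant expansion) and that the low-redshift Hubble law d = cz/H0 holds. Only the 12.8 billion years and the redshift z = 0.061 come from this post; the function names are hypothetical.

```python
C_KM_S = 299_792.458           # speed of light in km/s
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16   # seconds in a billion (Julian) years


def hubble_time_gyr(h0: float) -> float:
    """Age estimate 1/H0 in billions of years, for H0 in km/s/Mpc.
    Assumes the expansion rate has been constant, which it has not."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR


def h0_from_age(age_gyr: float) -> float:
    """Invert the 1/H0 estimate: the H0 (km/s/Mpc) implied by a given age."""
    return KM_PER_MPC / (age_gyr * SECONDS_PER_GYR)


def hubble_distance_mpc(z: float, h0: float) -> float:
    """Low-redshift Hubble law distance d = c * z / H0, in megaparsecs."""
    return C_KM_S * z / h0


AGE_GYR = 12.8   # value quoted in the post
Z = 0.061        # host galaxy redshift quoted in the post

h0 = h0_from_age(AGE_GYR)                 # roughly 76 km/s/Mpc
distance_mpc = hubble_distance_mpc(Z, h0) # roughly 240 Mpc
light_travel_gyr = Z * AGE_GYR            # z / H0, valid only for small z
```

Under these assumptions an age of 12.8 billion years corresponds to an expansion rate of roughly 76 km/s/Mpc, and a light travel time of about 0.8 billion years from the host galaxy.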

Just as I went on air a message from Mark Huber, the observer providing data from Hawai’i, confirmed a second supernova – this one a type II, an exploding massive star. It might even be of the same type as the famous 1987A, which was spotted in a satellite galaxy of the Milky Way, the Large Magellanic Cloud. Trying to take this in and convey what was happening quickly was a bit much for my sleep-deprived brain, but hopefully people realised we had confirmed a second supernova!

More importantly, we’ve recorded the results of all of our discoveries in an AstroNote published on the Transient Name Server website (the worldwide clearing house for such discoveries). You can read the result of a Supernova Hunters weekend here – and rejoice in the fact that Robin Ince and some of the Cosmic Shambles team are now coauthors on a scientific publication!

I’ll post links to clips from the show when they’re available too, and if you fancy supernova hunting yourself there will be more data on the site soon!


PS Thanks a million to the Supernova Hunters volunteers, and to the team that made it happen – Brooke Simmons (Lancaster), Ken Smith (Belfast), Darryl Wright (Mayo Clinic), Coleman Krawczyk (Portsmouth) and Grant Miller and Belinda Nicholson (Oxford). Michael Fulton and Shubham Srivastav from QUB took the Liverpool Telescope observations, and Michael also led the publication of our AstroNote.

PPS This gives Robin Ince an Erdős number of, I think, no higher than 5. His Bacon number (according to the Infinite Monkey Cage) is no higher than 3, so, adding the two, this gives him a Bacon-Erdős number of no more than 8! More importantly, as he’s performed music on stage, he must have a Sabbath number, though finding out what it is requires further work – making him one of the rare individuals with EBS numbers. A suitable reward for 24 hours of effort.