Panoptes CLI 1.0, a command-line interface for managing projects

Following on from the release of Panoptes Client 1.0 for Python, we’ve just released version 1.0 of the Panoptes CLI. This is a command-line client for managing your projects, because some things are just easier in a terminal! The CLI lets you do common project management tasks, such as activating workflows, linking subject sets, downloading data exports, and uploading subjects. Let’s jump in with a few examples.

First, downloading a classification export (obviously you’d insert your own project ID and a filename of your choice):

panoptes project download 764 Downloads/pulsar-hunters-classifications.csv


This command will optionally generate a new export and wait for it to be ready before downloading. No more waiting for the notification email!

New subjects can be uploaded to a new subject set like so (again, inserting your own IDs):

panoptes subject-set create 7 "November 2017 subjects"
panoptes subject-set upload-subjects 16401 manifest.csv


You can also pipe the output from the CLI into other standard commands to do more powerful things, such as linking every subject set in your project to a workflow using the xargs command (where 1234 and 5678 are your project ID and workflow ID respectively):

panoptes subject-set ls -q -p 1234 | xargs panoptes workflow add-subject-sets 5678

Visit GitHub to get started with the CLI today!


A Late Night at the Museum

At the end of October, the Zooniverse team was invited to the Natural History Museum in London to be part of the Museum’s monthly Lates event program.

(Photos courtesy of the Etch A Cell team)

The event was organised by the ConSciCom team who have partnered with the Zooniverse to create two very successful projects – Science Gossip and Orchid Observers. The theme for the evening was to explore the role images, such as illustrations and photographs, have played within natural history and scientific research.

From studying animal behaviour using photos taken by camera traps, to advancing our understanding of cell biology with photos from microscopes, many Zooniverse projects improve our understanding of the world around us through the help of citizen scientist volunteers.

Teams from multiple Zooniverse projects, including BashTheBug, Etch A Cell, Notes from Nature, Orchid Observers, Science Gossip and Seabird Watch, attended the event and spent the evening speaking to people about their projects, and showing how anyone can contribute to real research through citizen science.

(Photos courtesy of the Etch A Cell team and Jim O’Donnell)

Illustrator Dr Makayla Lewis led a live gallery drawing event, asking visitors to pick up a pencil and spend 15 minutes sketching their favourite exhibits.


(Photos courtesy of Jim O’Donnell)

Thanks to everyone who got involved, including Fiona (Penguin Watch), Freddie (University of Oxford), Jim (Zooniverse Developer), Makayla (Illustrator), Martin (Etch A Cell), Nathan (University of Oxford) and Phil (BashTheBug), and especially all our volunteers who attended the event!


Introducing Panoptes Client 1.0 for Python

I’m happy to announce that the Panoptes Client package for Python has finally reached version 1.0, after nearly a year and a half of development. With this package, you can automate the management of your projects, including uploading subjects, managing subject sets, and downloading data exports.

There’s still more work to do – I have lots of additional features and improvements planned for version 1.1 – but with the release of version 1.0, the Client has a stable set of core features which are useful for managing projects (both large and small).

I know a lot of people have already been using the 0.x versions while we’ve been working on them, so thanks to everyone who submitted feature requests, bug reports, and pull requests on GitHub. Please do upgrade to the latest version to make sure you have the latest bug fixes, and keep the requests and bug reports coming!

You can find installation and upgrade instructions on GitHub, and full documentation on Read the Docs.

Six months of bashing bugs

Below is a guest blog post from Dr Philip Fowler, lead researcher on our award-winning biomedical research project Bash the Bug. Read on to find out more about this project and how you can get involved!

– Helen


Our bug-squishing project, BashTheBug, turned six months old this month. Since launching on 7th April 2017, over seven thousand Zooniverse volunteers have contributed nearly half a million classifications between them, an average of 58 classifications per person.

The bug our volunteers have been bashing is Mycobacterium tuberculosis, the bacterium responsible for tuberculosis (TB). Many people think of TB as a disease of the past, to be found only in the books of Charles Dickens. However, the reality is quite different: TB is now responsible for more deaths each year than HIV/AIDS, and in 2015 it killed 1.8 million people. To make matters worse, like all other bacterial diseases, TB is evolving resistance to the antibiotics used to treat it. It is this problem that inspired the BashTheBug project, which aims to improve both the diagnosis and treatment of TB.

At the heart of this project is the simple idea that, in order to find out which antibiotics are effective at killing a particular TB strain, we have to try growing that strain in the presence of a range of antibiotics at different doses. If an antibiotic stops the bacterium growing at a dose that can be used safely within the human body, then bingo! That antibiotic can be used to treat that strain. To make doing this simpler, the CRyPTIC project (an international consortium of TB research institutions) has designed a 96-well plate with 14 different anti-TB drugs freeze-dried to the bottom of its wells, each drug at a range of doses.


Figure 1. A 96-well microtitre plate

These plates are common in science and are about the size of a large mobile phone. When a patient comes into clinic with TB, a sample of the bacterium they are infected with is taken, grown for a couple of weeks and then some is added to each of the 96 wells. The plate is then incubated for two weeks, and then examined to see which wells have TB growing in them and which do not. As each antibiotic is included on the plate at different doses, it is possible to work out the minimum concentration of antibiotic that stops the bug from growing.
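As a rough illustration of how a minimum inhibitory dose could be read off one drug's wells, here is a minimal Python sketch. The function name, the dose values and the growth readings are all hypothetical, and the rule used (the lowest dose at and above which every well is clear) is an assumption rather than the CRyPTIC project's actual procedure.

```python
from typing import Optional, Sequence


def minimum_inhibitory_dose(
    doses: Sequence[float], growth: Sequence[bool]
) -> Optional[float]:
    """Return the lowest dose at and above which no growth is seen.

    Returns None when even the highest dose shows growth.
    """
    if len(doses) != len(growth):
        raise ValueError("doses and growth must have the same length")

    # Order the wells by increasing dose, whatever order they were given in.
    wells = sorted(zip(doses, growth))

    mic = None
    # Walk down from the highest dose for as long as the wells stay clear.
    for dose, grew in reversed(wells):
        if grew:
            break
        mic = dose
    return mic


# Hypothetical readings for one drug (True means growth was seen).
doses = [0.25, 0.5, 1.0, 2.0, 4.0]
growth = [True, True, False, False, False]
print(minimum_inhibitory_dose(doses, growth))  # 1.0
```

Walking down from the highest dose means that a single clear well below a well with growth is not mistaken for the answer.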

But why are we doing this? Well, the genome of each TB sample will also be sequenced. This will allow us to build two large datasets; one of the mutations in the TB genome and another listing which antibiotics work for each sample (and which do not). Using these two datasets, we will then be able to infer which genetic mutations are responsible for resistance to specific antibiotics. With me still? Good. This will give researchers a large and accurate catalogue that would allow anyone to predict which antibiotics would work on any TB infection, simply by sequencing its genome. This is particularly important for the diagnosis and treatment of TB; currently used approaches are notoriously slow, taking up to eight weeks to identify which antibiotics can be used for effective treatment. If you were a clinician would you want to wait two months before starting your patient on treatment? Of course not.


Figure 2. A photograph of M. tuberculosis that has been growing on a plate for two weeks.

You might scoff at this point and say, pah, using genetics like this in hospitals will never happen. Well, it already is. Since March 2017, all routine testing for tuberculosis in England has been done by sequencing the genome of each sample that is sent to either of the two Public Health England reference laboratories. A report is returned to the clinician in around 9 days. Surprisingly, this costs less than the old, traditional methods for TB diagnosis and treatment. Sequencing TB samples also provides other valuable information at no extra cost: for example, you can compare the genomes of different infections to determine if an outbreak is underway.

So far, so good. The main challenge for this project, though, is size. We will be collecting around 100,000 samples from people with TB from around the world between now and 2020. Every single sample will have its genome sequenced and its susceptibility to different antibiotics tested on our 96-well plates. Each of these plates then needs to be looked at, and any errors or inconsistencies in how this huge number of 96-well plates is read could lead to false conclusions about which mutations confer resistance, and which don’t.

This problem is why we need your help! You might not be clinical microbiologists (although a few of you no doubt are!) but there are many, many more of you than we have experienced and trained scientists. In fact, each plate will only be looked at by one, maybe two, scientists, and so it is highly likely that, without the help of volunteers, our final dataset will be riven with differences due to how different people in different labs have read the plates. The inconvenient truth, however much we’d like to think otherwise, is that staring at a small white circle and deciding whether there is any M. tuberculosis growing or not is a highly subjective task. Take a look at the strip of wells below: the two wells in the top left have no antibiotic at all, so they give you an idea of how this strain of TB grows normally.


Figure 3. Is there a dose above which the bacteria don’t grow?

In the BashTheBug project, you are asked if there is a dose of antibiotic above which the bacteria don’t grow. If you think there is, you are then asked the number of the first well that doesn’t have any TB growing. For the example image above, I might be cautious and say, well, I can see that there appears to be less and less growth as we go to the right and the dosage increases, but it never entirely goes away; there is a very, very faint dot in well #8. So I’m going to say that actually I think there is bacterial growth in all eight wells. You might be optimistic (or even just in a good mood) and disagree with me and say, yes, but by the time you get to well #6, that dot is so small compared to the growth in the control wells that either the antibiotic is doing its job, or, you know what, I’m not convinced that the dot isn’t some sediment or something else entirely.

There is no correct answer. We are probably both right to some extent; there IS something in well #8, but maybe this antibiotic would still be an effective treatment, as it would kill enough of the bacteria for your immune system to then be able to kill off the remainder of the infection. Therefore, the aim of BashTheBug is to identify the dose that multiple people agree is the one above which the bacteria no longer grow. Our result from this project is the consensus we get from showing each image to multiple people. Yes, the volunteers might, on average, take a slightly different view to an experienced clinical microbiologist, but that doesn’t matter as they will, on average, be consistent across all the plates, which is vital if we are to uncover which genetic mutations confer resistance to antibiotics.
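To make the idea of a consensus concrete, here is a minimal Python sketch that combines several volunteers' answers for one strip of wells. The post does not say how BashTheBug actually aggregates classifications, so using the median is purely an assumption for illustration, as are the function name and the example answers.

```python
from statistics import median_low
from typing import Optional, Sequence


def consensus_well(
    answers: Sequence[Optional[int]], n_wells: int = 8
) -> Optional[int]:
    """Combine volunteers' answers for one strip of wells.

    Each answer is the number of the first well with no growth, or None
    if the volunteer saw growth in every well. Returns the (low) median
    answer, or None if the median volunteer saw growth everywhere.
    """
    if not answers:
        raise ValueError("need at least one classification")

    # Rank "growth in every well" just beyond the last well so that it
    # can be ordered alongside the numeric answers.
    beyond_last = n_wells + 1
    ranked = [beyond_last if a is None else a for a in answers]

    middle = median_low(ranked)
    return None if middle == beyond_last else middle


# Five hypothetical classifications of the same image.
print(consensus_well([6, None, 6, 7, 5]))  # 6
```

A median is less sensitive than a mean to one unusually cautious or optimistic volunteer, which is why it is used in this sketch.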

None of this would be possible without the hard work of all our volunteers. So, if you’ve done any classifications, thank you for all your help. Here’s to another six months, many more classifications, and the first results from the hard work done by the many volunteers who have taken part in the project to date.

Find out more:

  • Contribute to the project here
  • Read the official BashTheBug blog here
  • Follow @BashTheBug on Twitter here
  • BashTheBug won the Online Community Award of the NIHR Let’s Get Digital Competition; read more here

Check out other coverage of BashTheBug:

We took it offline and you can too! A night of Zooniverse fun at the Adler Planetarium

Our inaugural Chicago-area meetup was great fun! Zooniverse volunteers came to the Adler Planetarium, home base for our Chicago team members, to meet some of the Adler Zooniverse web development team and talk to Chicago-area researchers about their Zooniverse projects.

Laura Trouille, co-I for Zooniverse and Senior Director for Citizen Science at the Adler Planetarium

Presenters:

  • Zooniverse Highlights and Thank You! (Laura Trouille, co-I for Zooniverse and Senior Director for Citizen Science at the Adler Planetarium)
  • Chicago Wildlife Watch (Liza Lehrer, Assistant Director, Urban Wildlife Institute, Lincoln Park Zoo)
  • Gravity Spy (Sarah Allen, Zooniverse developer, supporting the Northwestern University LIGO team)
  • Microplants (Matt Von Konrat, Head of Botanical Collections, Field Museum)
  • Steelpan Vibrations (Andrew Morrison, Physics Professor, Joliet Junior College)
  • Wikipedia Gender Bias (Emily Temple Wood, medical student, Wikipedia Editor, Zooniverse volunteer)
  • In-Person Zooniverse Volunteer Opportunities at the Adler Planetarium (Becky Rother, Zooniverse designer)

Researchers spoke briefly about their projects and how they use the data and ideas generated by our amazing Zooniverse volunteers in their work. Emily spoke of her efforts addressing gender bias in Wikipedia. We then took questions from the audience and folks chatted in small groups afterwards.

The event coincided with Adler Planetarium’s biennial Member’s Night, so Zooniverse volunteers were able to take advantage of the museum’s “Spooky Space” themed activities at the same time, which included exploring the Adler’s spookiest collection pieces, making your own spooky space music, and other fun. A few of the Zooniverse project leads also led activities: playing Andrew’s steel pan drum, interacting with the Chicago Wildlife Watch’s camera traps and other materials, and engaging guests in classifying across the many Zooniverse projects. There was also a scavenger hunt that led Zooniverse members and Adler guests through the museum, playing on themes within the exhibit spaces relating to projects within the Zooniverse mobile app (iOS and Android).

We really enjoyed meeting our volunteers and seeing the conversation flow between volunteers and researchers. We feel so lucky to be part of this community and supporting the efforts of such passionate, interesting people who are trying to do good in the world. Thank you!

Have you hosted a Zooniverse meetup in your town? Would you like to? Let us know!

The Zooniverse responds to the Caribbean Hurricanes of 2017

The following post is by Dr Brooke Simmons, who has been leading the Zooniverse efforts to help in the aftermath of the recent Caribbean storms.

This year has seen a particularly devastating storm season. As Hurricane Irma was picking up steam and moving towards the Caribbean, we spoke to our disaster relief partners at Rescue Global and in the Machine Learning Research Group at Oxford and decided to activate the Planetary Response Network. We had previously worked with the same partners for our responses to the Nepal and Ecuador earthquakes in 2015 and 2016, and this time Rescue Global had many of the same needs: maps of expected and observed damage, and identifications of temporary settlements where displaced people might be sheltering.

The Planetary Response Network is a partnership with many people and organizations and which uses many sources of data; the Zooniverse volunteers are at its heart. The first cloud-free data available following the storm was of Guadeloupe, and our community examined pre-storm and post-storm images, marking building damage, flooding, impassable roads and signs of temporary structures. The response to our newsletter was so strong that the first set of data was classified in just 2 hours! And as more imaging has become available, we’ve processed it and released it on the project. By the time Hurricane Maria arrived in the Caribbean, Zooniverse volunteers had classified 9 different image sets from all over the Caribbean, additionally including Turks and Caicos, the Virgin Islands (US and British), and Antigua & Barbuda. That’s about 1.5 years’ worth of effort, if it was 1 person searching through these images as a full-time job. Even with a team of satellite experts it would still take much longer to analyze what the Zooniverse volunteers collectively have in just days. And there’s still more imaging: the storms aren’t over yet.

We’ve been checking in every day with Rescue Global and our Machine Learning collaborators to get feedback on how our classifications are being used and to refresh the priority list for the next set of image targets. As an example of one of those adjustments, yesterday we paused the Antigua & Barbuda dataset in order to get a rapid estimate of building density in Puerto Rico from images taken just before Irma and Maria’s arrival. We needed those because, while the algorithms used to produce the expected damage maps do incorporate external data like Census population counts and building information from OpenStreetMap, some of that data can be incomplete or out of date (like the Census, which is an excellent resource but which is many years old now). Our volunteers collectively provided an urgently needed, uniformly-assessed and up-to-date estimate across the whole island in a matter of hours, and that data is now being used to make expected damage maps that will be delivered to Rescue Global before the post-Maria clouds have fully cleared.

Even though the project is still ongoing and we don’t have full results yet, I wanted to share some early results of the full process and the feedback we’ve been getting from responders on the ground. One of our earliest priorities was St. Thomas in the USVI, because we anticipated it would be damaged but other crowdsourcing efforts weren’t yet covering that area. From your classifications we made a raw map of damage markings. Here’s structural damage:

[Image: raw map of structural damage markings on St. Thomas]

The gray stripe was an area of clouds and some artifacts. You can get an idea from this of where there is significant damage, but it’s raw and still needs further processing. For example, in the above map, damage marked as “catastrophic” is more opaque so will look redder, but more individual markings of damage in the same place will also stack to look redder, so it’s hard to tell the difference in this visualization between 1 building out of 100 that’s destroyed and 100 buildings that all have less severe damage. The areas that had clouds and artifacts also weren’t completely unclassifiable, so there are still some markings in there that we can use to estimate what damage might be lurking under the clouds. Our Machine Learning partners incorporate these classifications and the building counts provided by our project as well as by OpenStreetMap into code that produces a “heat map” of structural damage that helps responders understand the probability and proportion of damage in a given area as well as how bad the damage is:

[Image: heat map of structural damage on St. Thomas]

In the heat map, the green areas are where some damage was marked, but at a low level compared to how many buildings are in the area. In the red areas, over 60% of the buildings present were marked as damaged. (Pink areas are intermediate between these.)
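The colour bands described above can be sketched as a simple rule on the fraction of buildings marked as damaged in each map cell. This is only an illustration and not the Machine Learning group's actual code: the 60% threshold for red comes from the text, while the 20% boundary between green and pink, the function name and the cell data are assumptions.

```python
def damage_level(damaged: int, buildings: int,
                 low: float = 0.2, high: float = 0.6) -> str:
    """Classify one map cell by its fraction of damaged buildings.

    `high` (60%) is the red threshold quoted in the text; `low` is an
    assumed boundary between the green and pink bands.
    """
    if buildings <= 0:
        return "no buildings"
    fraction = damaged / buildings
    if fraction == 0:
        return "none marked"
    if fraction > high:
        return "red"
    if fraction < low:
        return "green"
    return "pink"


# Hypothetical cells: (buildings marked as damaged, buildings present).
cells = {"A1": (1, 50), "A2": (12, 30), "B1": (40, 50)}
for name, (damaged, buildings) in cells.items():
    print(name, damage_level(damaged, buildings))
```

Dividing by the number of buildings present is what separates 1 destroyed building out of 100 from an area where most buildings are damaged, which the raw click map cannot do.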

With volunteer classifications as inputs, we were able to deliver maps like this (and similar versions for flooding, road blockage, and temporary shelters) for every island we classified. We also incorporated other efforts like those of Tomnod to map additional islands, so that we could keep our focus on areas that hadn’t yet been covered while still providing as much accurate information to responders as possible.

Feedback from the ground has been excellent. Rescue Global has been using the maps to help inform their resource allocation, ranging from where to deliver aid packages to where to fly aerial reconnaissance missions (fuel for flights is a precious commodity, so it’s critical to know in advance which areas most need the extra follow-up). They have also shared the heat maps with other organizations providing response and aid in the area, so Zooniverse volunteers’ classifications are having an extended positive effect on efforts in the whole region. And there has been some specific feedback, too. This message came several days ago from Rebekah Yore at Rescue Global:

In addition to supplying an NGO with satellite communications on St Thomas island, the team also evacuated a small number of patients with critical healthcare needs (including a pregnant lady) to San Juan. Both missions were aided by the heat maps.

To me, this illustrates what we can all do together. Everyone has different roles to play here, from those who have a few minutes a day to contribute to those spending hours clicking and analyzing data, and certainly including those spending hours huddled over a laptop in a temporary base camp answering our emailed questions about project design and priorities while the rescue and response effort goes on around them. Without all of them, none of this would be possible.

We’re still going, now processing images taken following Hurricane Maria. But we know it’s important that our community be able to share the feedback we’ve been receiving, so even though we aren’t finished yet, we still wanted to show you this and say: thank you.

Update:

Now that the project’s active response phase has completed, we have written a further description of how the maps our volunteers helped generate were used on the project’s Results page. Additionally, every registered volunteer who contributed at least 1 classification to the project during its active phase is credited on our Team page. Together we contributed nearly 3 years’ worth of full-time effort to the response, in only 3 weeks.

Further Acknowledgments

The Planetary Response Network has been nurtured and developed by many partners and is enabled by the availability of pre- and post-event imagery. We would like to acknowledge them:

  • Firstly, our brilliant volunteers. To date on this project we have had contributions from about 10,000 unique IP addresses, of which about half are from registered Zooniverse accounts.
  • The PRN has been supported by Imperative Space and European Space Agency as part of the Crowd4Sat programme. Any views expressed on this website shall in no way be taken to represent the official opinion of ESA.
  • The development of the current Zooniverse platform has been supported by a Google Global Impact award and the Alfred P. Sloan Foundation.
  • We are grateful to Patrick Meier and QCRI for their partnership in the development of PRN.
  • We are grateful to those whose counsel (and data!) we have been fortunate to receive over the years: the Humanitarian OpenStreetMap Team, the Standby Task Force, Tomnod.
  • We are grateful to our imagery providers:
    • Planet has graciously provided images to the PRN in each of our projects. (Planet Team 2017 Planet Application Program Interface: In Space For Life on Earth. San Francisco, CA. https://api.planet.com, License: CC-BY-SA)
    • DigitalGlobe provides high-resolution imagery as part of their Open Data Program (Creative Commons Attribution Non Commercial 4.0).
    • Thanks to the USGS for making Landsat 8 images publicly available.
    • Thanks to ESA for making Sentinel-2 images publicly available.
  • Thanks to Amazon Web Services’ Open Data program for hosting Sentinel-2 and Landsat 8 images, both of which were used in this project (and sourced via AWS’ image browser and servers);
  • We’d also like to thank several individuals:
    • Everyone at Rescue Global, but particularly Hannah Pathak and Rebekah Yore for patiently answering our questions and always keeping the lines of communication open;
    • Steve Reece in Oxford’s ML group for burning the midnight oil;
    • The Zooniverse team members, who are absolute stars for jumping in and helping out at a moment’s notice.

The Universe Inside Our Cells

Below is the first in a series of guest blog posts from researchers working on one of our recently launched biomedical projects, Etch A Cell.

Read on to let Dr Martin Jones tell you about the work they’re doing to further understanding of the universe inside our cells!

– Helen


Having trained as a physicist, with many friends working in astronomy, I’ve been aware of Galaxy Zoo and the Zooniverse from the very early days. My early research career was in quantum mechanics, unfortunately not an area where people’s intuitions are much use! However, since I found myself working in biology labs, now at the Francis Crick Institute in London, I have been working in various aspects of microscopy – a much more visual enterprise and one where human analysis is still the gold standard. This is particularly true in electron microscopy, where the busy nature of the images means that many regions inside a cell look very similar. In order to make sense of the images, a person is able to assimilate a whole range of extra context and previous knowledge in a way that computers, for the most part, are simply unable to do. This makes it a slow and labour-intensive process. As if this wasn’t already a hard enough problem, in recent years it has been compounded by new technologies that mean the microscopes now capture images around 100 times faster than before.

Focused ion beam scanning electron microscope


Ten years ago it was more or less possible to manually analyse the images at the same rate as they were acquired, keeping the in-tray and out-tray nicely balanced. Now, however, that’s not the case. To illustrate that, here’s an example of a slice through a group of cancer cells, known as HeLa cells:

[Image: slice through a group of HeLa cells]

We capture an image like this and then remove a very thin layer – sometimes as thin as 5 nanometres (one nanometre is a billionth of a metre) – and then repeat… a lot! Building up enormous stacks of these images can help us understand the 3D nature of the cells and the structures inside them. For a sense of scale, this whole image is about the width of a human hair, around 80 millionths of a metre.

Zooming in to one of the cells, you can see many different structures, all of which are of interest to study in biomedical research. For this project, however, we’re just focusing on the nucleus for now. This is the large mostly empty region in the middle, where the DNA – the instruction set for building the whole body – is contained.

[Image: close-up of a single cell]

By manually drawing lines around the nucleus on each slice, we can build up a 3D model that allows us to make comparisons between cells, for example understanding whether a treatment for a disease is able to stop its progression by disrupting the cells’ ability to pass on their genetic information.


Animated gif of 3D model of a nucleus

However, images are now being generated so rapidly that the in-tray is filling too quickly for the standard “single expert” method – one sample can produce up to a terabyte of data, made up of more than a thousand 64 megapixel images captured overnight. We need new tricks!


Why citizen science?

With all of the advances in software that are becoming available, you might think that automating image analysis of this kind would be quite straightforward for a computer. After all, people can do it relatively easily. Even pigeons can be trained in certain image analysis tasks (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141357)! However, there is a long history of underestimating just how hard it is to automate image analysis with a computer. Back in the very early days of artificial intelligence in 1966 at MIT, Marvin Minsky (who also invented the confocal microscope) and his colleague Seymour Papert set the “summer vision project”, which they saw as a simple problem to keep their undergraduate students busy over the holidays. Many decades later we’ve discovered it’s not that easy!

[Image: xkcd comic]

(from https://www.xkcd.com/1425/)

Our project, Etch A Cell, is designed to allow citizen scientists to draw segmentations directly onto our images in the Zooniverse web interface. The first task we have set is to mark the nuclear envelope that separates the nucleus from the rest of the cell, a vital structure where defects can cause serious problems. These segmentations are extremely useful in their own right for helping us understand the structures, but citizen science offers something beyond the already lofty goal of matching the output of an expert. By allowing several people to annotate each image, we can see how the lines vary from user to user. This variability gives insight into the certainty that a given pixel or region belongs to a particular object, information that simply isn’t available from a single line drawn by one person. Differences between experts are not unheard of, unfortunately!
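One simple way to turn several people's annotations into a per-pixel certainty is to average their filled-in masks. The Python sketch below assumes each annotation has already been converted to a binary mask (1 inside the nucleus, 0 outside); that conversion, the function name and the tiny example masks are hypothetical, and the project's real aggregation may differ.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel marked as nucleus, 0 = not marked


def agreement_map(masks: List[Mask]) -> List[List[float]]:
    """Return, for every pixel, the fraction of annotators who marked it."""
    if not masks:
        raise ValueError("need at least one mask")
    rows, cols = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != rows or any(len(row) != cols for row in mask):
            raise ValueError("all masks must have the same shape")

    n = len(masks)
    return [
        [sum(mask[r][c] for mask in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


# Three hypothetical annotators marking the same 2 x 3 pixel patch.
masks = [
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 1]],
]
for row in agreement_map(masks):
    print(row)
```

Values close to 1 or 0 mark pixels where everyone agrees, while values in between show where the boundary is uncertain.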

The images below show preliminary results with the expert analysis on the left and a combination of 5 citizen scientists’ segmentations on the right.

Example of expert vs. citizen scientist annotation

In fact, we can go even further to maximise the value of our citizen scientists’ work. The field of machine learning, in particular deep learning, has burst onto the scene in several sectors in recent years, revolutionising many computational tasks. This new generation of image analysis techniques is much more closely aligned with how animal vision works. The catch, however, is that the “learning” part of machine learning often requires enormous amounts of time and resources (remember you’ve had a lifetime to train your brain!). To train such a system, you need a huge supply of so-called “ground truth” data, i.e. something that an expert has pre-analysed and can provide the correct answer against which the computer’s attempts are compared. Picture it as the kind of supervised learning that you did at school: perhaps working through several old exam papers in preparation for your finals. If the computer is wrong, you tweak the setup a bit and try again. By presenting thousands or even millions of images and ensuring your computer makes the same decision as the expert, you can become increasingly confident that it will make the correct decision when it sees a new piece of data. Using the power of citizen science will allow us to collect the huge amounts of data that we need to train these deep learning systems, something that would be impossible by virtually any other means.
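The "compare with the expert, tweak, try again" loop can be shown with a deliberately tiny example. The sketch below trains a perceptron, a far simpler model than the deep learning systems mentioned above, on four made-up examples with two invented numeric features each; the labels stand in for an expert's ground truth. None of this reflects the actual Etch A Cell pipeline.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, expert label 0 or 1)


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> int:
    """Return 1 if the weighted sum of the features is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples: List[Example], learning_rate: float = 1.0,
          max_epochs: int = 100) -> Tuple[List[float], float]:
    """Adjust the weights whenever the prediction disagrees with the expert."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, expert_label in examples:
            error = expert_label - predict(weights, bias, features)
            if error != 0:
                # Wrong answer: nudge the setup towards the expert's label.
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training example now matches the expert
    return weights, bias


# Made-up ground truth: label 1 only when both features are present.
ground_truth = [((0.0, 0.0), 0), ((0.0, 1.0), 0),
                ((1.0, 0.0), 0), ((1.0, 1.0), 1)]
weights, bias = train(ground_truth)
print([predict(weights, bias, features) for features, _ in ground_truth])
```

Real image models have millions of adjustable numbers rather than three, which is why they need so many labelled examples to train.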

We are now busily capturing images that we plan to upload to Etch A Cell to allow us to analyse data from a range of experiments. Differences in cell type, sub-cellular organelle, microscope, sample preparation and other factors mean the images can look different across experiments, so analysing cells from a range of different conditions will allow us to build an atlas of information about sub-cellular structure. The results from Etch A Cell will mean that whenever new data arrives, we can quickly extract information that will help us work towards treatments and cures for many different diseases.

Stargazing Live 2017 Recap

We recently had a very successful (and longer than usual) Stargazing Live. I wanted to talk a little about the work that our team did in the weeks leading up to this and also recap what actually happened behind the scenes during the two weeks of events.

If you’re not familiar with it, Stargazing Live is an annual astronomy TV show on BBC Two in the UK, which is broadcast live on three consecutive nights. Each year we launch a project in collaboration with the show, and this always proves to be the busiest time of our year. This year, for the first time there was a second week of shows for ABC Australia, so this time we launched two projects instead of one: Planet 9 and Exoplanet Explorers.

A lot of work went into making sure that our site stayed up for this year’s shows. In previous years we’ve had issues that have resulted in either a brief outage or reduced performance for at least some of the time during the show. This year everything worked perfectly and we actually found ourselves reducing our capacity (scaling down) much sooner than we anticipated. The prep work fell into three areas:

  • Optimisations to the frontend to reduce the number of API calls made by the site while people were using it. This involved a combination of refactoring, fixing bugs, and modifying the backend to return frequently requested data without it having to be requested separately (e.g. when checking if the user has favourited a subject).
  • Reducing the load on our databases. We reduced the number of requests that result in database queries through caching in the backend (with memcache), and we started using a new microservice (called Designator) to keep track of what each user has seen and serve them new subjects. We also separated some services onto a read replica rather than having them query the primary database.
  • Adding feature flags so that we could turn off anything non-essential, and so that we could shut down any features that were causing problems, using the Flipper Ruby gem.
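
As a rough illustration of the second and third points above, the sketch below shows the general idea of caching and feature flags in Python. It is not our actual code: the real backend uses memcache and the Flipper Ruby gem, and every name here (`TTLCache`, `FEATURE_FLAGS`, `load_project_from_db`, the flag name) is hypothetical.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a shared cache such as memcache."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry time, value)

    def fetch(self, key, compute):
        """Return the cached value for key, calling compute() on a miss."""
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


# Hypothetical flag; a real system would keep flags in shared storage so
# they can be flipped without redeploying.
FEATURE_FLAGS = {"recent_activity_feed": True}

db_queries = 0  # counts how often the (pretend) database is hit
cache = TTLCache(ttl_seconds=60)


def load_project_from_db(project_id):
    """Pretend database query."""
    global db_queries
    db_queries += 1
    return {"id": project_id, "name": "Example project"}


def handle_request(project_id):
    project = cache.fetch(("project", project_id),
                          lambda: load_project_from_db(project_id))
    response = {"project": project}
    if FEATURE_FLAGS["recent_activity_feed"]:
        response["recent_activity"] = []  # placeholder for non-essential work
    return response


for _ in range(1000):
    handle_request(1234)
print(db_queries)  # 1: only the first request reached the database

FEATURE_FLAGS["recent_activity_feed"] = False  # switch the extra off under load
print("recent_activity" in handle_request(1234))  # False
```

The point of both techniques is the same: under a sudden spike, most requests should be answerable without touching the primary database, and anything optional should be removable with a switch rather than a deploy.
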
The Oxford team gathers in the office to watch the show.

On the first night of the BBC show it was all hands on deck. Our teams in the US and the UK were in our offices, despite it being evening in the UK, and in Oxford we gathered around the TV expectantly awaiting the moment when Chris would announce the project’s URL on air. That moment is usually a bit frantic, as several thousand people all turn up on the site at once and start clicking around, registering, logging in, and submitting classifications. We’re always closely watching our monitoring systems, keeping an eye on various performance metrics, watching for any early signs of problems that might affect the performance of the site. This year when that moment came the number of visitors on site shot up to over 5,000, and then… everything just kept running smoothly.

On the first night of the BBC show we peaked at about 0.9 million requests per hour, rising to 1.1 million per hour on the second night.

Requests to Zooniverse.org during BBC Stargazing Live 2017.

We scaled our API service to 50 of EC2’s m3.medium instances the first night and the average CPU utilisation of these instances reached about 30% at peak traffic. The next two nights we reduced the number of instances to 40. In hindsight we could have gone even lower, but from past experience the amount of traffic we receive on the second and third nights can be difficult to predict, so we decided to play it safe.

API scaling and CPU utilisation during BBC Stargazing Live 2017.
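
For a sense of scale, the peak figures above can be turned into per-second and per-instance rates. This is back-of-the-envelope arithmetic only: it assumes the peak hour's traffic was spread evenly over the hour, that all of those requests reached the API, and that load was shared equally between instances, none of which is exactly true in practice.

```python
SECONDS_PER_HOUR = 3600


def per_second(requests_per_hour):
    """Average requests per second, assuming traffic is even across the hour."""
    return requests_per_hour / SECONDS_PER_HOUR


night1 = per_second(0.9e6)  # first BBC night, 50 API instances
night2 = per_second(1.1e6)  # second BBC night, 40 API instances

print(round(night1), round(night2))                # 250 306
print(round(night1 / 50, 1), round(night2 / 40, 1))  # 5.0 7.6
```

In other words, each instance was handling only a handful of requests per second on average, which fits with the modest CPU utilisation we saw.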

Traffic during the ABC show was lower than during the BBC show (Australia has a smaller population than the UK, so this was as expected). That week we scaled the API to 40 instances the first night, and 20 instances for the second and third nights.

In the past we’ve had problems with running out of available connections in PostgreSQL. The connection limit depends on available memory, and we find this to be more of a problem than CPU or network constraints. During the shows we scaled the PostgreSQL instance for our main API to RDS’s m4.10xlarge and our Talk/microservices database to m4.2xlarge, primarily to give us enough leeway to avoid the connection limit. In the future we’d like to implement connection pooling to avoid this.
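
For readers unfamiliar with the idea, connection pooling means opening a small, fixed number of database connections and sharing them between requests, instead of opening one per request. The sketch below shows the principle in Python; it is not what we run or plan to run, `FakeConnection` is a hypothetical stand-in for a real database connection, and a real deployment would more likely rely on an existing pooling layer than on hand-rolled code.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    opened = 0  # how many connections have ever been created

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql):
        return f"ran: {sql}"


class ConnectionPool:
    """Hands out at most `size` connections; callers wait when all are busy."""

    def __init__(self, size, connect=FakeConnection):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


pool = ConnectionPool(size=3)
for _ in range(100):
    with pool.connection() as conn:
        conn.execute("SELECT 1")

print(FakeConnection.opened)  # 3: a hundred queries, but only three connections
```

With a pool in place, the number of connections the database sees is capped by the pool size rather than by the number of simultaneous visitors, so the connection limit stops being the reason to scale up.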

This was all a big improvement on previous years. While before we found ourselves extremely busy fighting fires and fixing bugs between shows, this time we had time to just relax and watch the show. We have more work to do on optimisations, because we did still have to scale up our capacity more than we’d like, but overall we’re very happy with how well things went this year.

 

Stargazing Live 2017: Thank you all!

Breaking news… Zooniverse volunteers on Exoplanet Explorers have discovered a new 4-planet system!

Computer animation of the 4-planet system. Planet orbits are to scale and planet sizes are to scale with each other, but not with the star or the size of the orbits. Credit: Simone Duca.

Congratulations to all* who directly classified the light curves for this system, bringing it to the attention of the research team. And an enormous *thank you* to the 14,000+ volunteers who provided over 2 million classifications in just three days to make this discovery possible. This is equivalent to 3.4 years of full-time effort. I *heart* people-powered research! It’s also amazing how quickly we were able to get these data to the eyes of the public: the Kepler Space Telescope observed this star between December 15, 2016, and March 4, 2017. The data arrived on Earth on March 7 and Zooniverse volunteers classified them on April 3-5, 2017. I *heart* Zooniverse.

ExoplanetExplorers.org was the featured project for our inaugural ABC Australia Stargazing Live 3-day, prime-time TV event, which just ended yesterday and through which this discovery was made. Over the years, we’ve partnered with the BBC as part of their Stargazing Live event in the UK. On night 1, Chris Lintott, our intrepid leader, invites the million+ viewers to participate in that year’s featured Zooniverse project, on night 2 he highlights interesting potential results coming through the pipeline, and on night 3, if science nods in our favor, he has the pleasure of announcing exciting discoveries you all, our volunteers, have made (for example, last year’s pulsar discovery and the supernova discovery from a couple years back). 

This year we partnered with both the UK’s BBC and Australia’s ABC TV networks to run two Stargazing Live series in two weeks. We’re exhausted and exhilarated from the experience! We can imagine you all are as well (hats off to one of our volunteers who provided over 15,000 classifications in the first two days)!

Stargazing Live epitomizes many of our favorite aspects of being a member of the Zooniverse team – it’s a huge rush, filled with the highs and lows of keeping a site up when thousands of people are suddenly providing ~7000 classifications a minute at peak. We’re so proud of our web development team and their amazing effort, smart solutions, quick thinking, and teamwork. The best part is that we collectively get to experience the joy, wonder, and discovery of the process of science right alongside the researchers. Each year the research teams leading each project have what is likely among the most inspiring (and intense) 3 days of their careers, carrying out the detective work of following up each potential discovery at breakneck speed.

Over 2 million classifications in just 1 day on planetninesearch.org!


Brad Tucker and his team, who lead PlanetNineSearch.org (the project featured in this year’s BBC Stargazing Live event), checked and rechecked the orbital parameters of dozens of Planet 9 candidates and compared them against catalogs of known objects, making sure no stone was left unturned. We were bolstered throughout by re-discoveries of known objects, including many known asteroids and Chiron, a minor planet in the outer Solar System, orbiting the Sun between Saturn and Uranus.

The red, green, and blue dots in the lower left quadrant show Chiron as it moved across the Australian night sky during the SkyMapper telescope observations for planetninesearch.org.

Even though Planet 9 hasn’t been discovered yet, it’s huge progress for that field of research to have completed a thorough search through this SkyMapper dataset, which allows us to probe out to certain distances and sizes of objects across a huge swath of the sky. Stay tuned for progress at planetninesearch.org and through the related BackyardWorlds.org project, which is searching a different parameter space for Planet 9 in WISE data.

Also, and very importantly, the BBC Stargazing Live shows gave the world an essential new member of the Twitterverse:

Understanding this inside joke alone makes it worth watching the show!

The Exoplanet Explorers team, led by Ian Crossfield, Jessie Christiansen, Geert Barentsen, Tom Barclay, and others, was also up through much of each night of the event this week, churning through the results. Because the Kepler Space Telescope K2 dataset is so rich, there were dozens of potential candidates to triple-check in just 3 days. Not only did our volunteers discover the 4-planet system shown above, they also found 90 new and true candidate exoplanets! That’s truly an amazing start to a project.

Chris Lintott shows Brian Cox and Julia Zemiro the possible planets we’ve found so far, using the nearby town’s entire stock of gumballs. 

Once you all, our amazing community, have classified all the images in this project and the related PlanetHunters.org, the researchers will be able to measure the occurrence rates of different types of planets orbiting different types of stars. They’ll use this information to answer questions like — Are small planets (like Venus) more common than big ones (like Saturn)? Are short-period planets (like Mercury) more common than those on long orbits (like Mars)? Do planets more commonly occur around stars like the Sun, or around the more numerous, cooler, smaller “red dwarfs”?

There’s also so much to learn about the 4-planet system itself. It’s particularly interesting because it’s such a compact system (all orbits are well within Mercury’s distance to our Sun) of potentially rocky planets. If these characteristics hold true, we expect they will put planet formation theories to the test.

A fun part of our effort for the show was to create visualizations for this newly discovered system. Simone, one of our developers, used http://codepen.io/anon/pen/RpOYRw to create the simulation shown above. We welcome all to try their hand using this tool or others to create their favorite visualization of the system. Do post your effort in the comments below. To set you on the right path, here are our best estimates for the system so far:

Fun facts:

  • In 2372 years, on July 9, 4388AD, all four planets will transit at the same time.
  • If you’re standing on planet e, the nearest planet would appear bigger than the full moon in the sky. The apparent sizes of the other planets as seen from e are 10, 16, and 32 arcmin.
  • If you’re on planet e, the star barely appears to rotate: you see the same side of it for many “years,” because the star rotates just as quickly as planet “e” goes around it.
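
If you’d like to check apparent sizes like these for your own visualization, the angular diameter of a sphere follows from its diameter and distance. The sketch below is a generic formula, not the calculation the research team used, and it is checked here against our own Moon (approximate mean values) rather than the new system.

```python
import math


def angular_diameter_arcmin(diameter, distance):
    """Apparent angular diameter, in arcminutes, of a sphere of the given
    diameter seen from the given distance (same length unit for both)."""
    radians = 2 * math.atan(diameter / (2 * distance))
    return math.degrees(radians) * 60


# Sanity check with the Moon: approximate mean diameter and distance in km.
MOON_DIAMETER_KM = 3474.8
MOON_DISTANCE_KM = 384_400

moon_arcmin = angular_diameter_arcmin(MOON_DIAMETER_KM, MOON_DISTANCE_KM)
print(round(moon_arcmin, 1))  # about 31 arcmin
```

So the full moon spans roughly 31 arcmin, which is why a neighbour spanning 32 arcmin counts as bigger than the full moon.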

This post wouldn’t be complete without a thank you to Edward Gomez for following up candidates with the Las Cumbres Observatory Robotic Telescope Network. Not only is LCO a great research tool, but it provides amazing access to telescopes and quality curricular materials for students around the world.

*And a special thanks to the following volunteers who correctly identified at least one of the planets in the newly discovered 4-planet system:
Joshua Kusch
Edward Heaps
Ivan Terentev
TimothyCatron
James Richmond
Alan Patricio Zetina Floresmarhx
sankalp mohan
seamonkeyluv
traumeule
B Butler
Nicholas Sloan
Kerrie Ryan
Huskynator
Lee Mason
Trudy Frankensteiner
Alan Goldsmith
Gavin Condon
Simon Wilde
Sharon McGuire
helenatgzoo
Melina Thévenot
Niamh Claydon-Mullins
ellieoban
Anastasios D. Papanastasiou
AndyGrey
Angela Crow
Dave Williams
Throbulator
Tim Smith
Erin Thomas
Valentina Saavedra
Carole Riley
sidy2001
bn3
ilgiz
Antonio Pasqua
Peter Bergvall
Stephen Hippisley
Michael Sarich

Studying the Impact of the Zooniverse

Below is a guest post from a researcher who has been studying the Zooniverse and who just published a paper called ‘Crowdsourced Science: Sociotechnical epistemology in the e-research paradigm’. That being a bit of a mouthful, I asked him to introduce himself and explain – Chris.

My name is David Watson and I’m a data scientist at Queen Mary University of London’s Centre for Translational Bioinformatics. As an MSc student at the Oxford Internet Institute back in 2015, I wrote my thesis on crowdsourcing in the natural sciences. I got in touch with several members of the Zooniverse team, who were kind enough to answer all my questions (I had quite a lot!) and even provide me with an invaluable dataset of aggregated transaction logs from 2014. Combining this information with publication data from a variety of sources, I examined the impact of crowdsourcing on knowledge production across the sciences.

Last week, the philosophy journal Synthese published a (significantly) revised version of my thesis, co-authored by my advisor Prof. Luciano Floridi. We found that Zooniverse projects not only processed far more observations than comparable studies conducted via more traditional methods—about an order of magnitude more data per study on average—but that the resultant papers vastly outperformed others by researchers using conventional means. Employing the formal tools of Bayesian confirmation theory along with statistical evidence from and about Zooniverse, we concluded that crowdsourced science is more reliable, scalable, and connective than alternative methods when certain common criteria are met.

In a sense, this shouldn’t really be news. We’ve known for over 200 years that groups are usually better than individuals at making accurate judgments (thanks, Marie Jean Antoine Nicolas de Caritat, aka Marquis de Condorcet!). The wisdom of crowds has been responsible for major breakthroughs in software development, event forecasting, and knowledge aggregation. Modern science has become increasingly dominated by large scale projects that pool the labour and expertise of vast numbers of researchers.
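
Condorcet's observation, usually called the jury theorem, is easy to check numerically. The sketch below is an illustration only and not part of our paper's analysis: it assumes every voter decides independently and is right with the same probability, which real crowds only approximate.

```python
from math import comb


def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is right,
    when each voter is right with probability p_correct."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest number of votes that is a majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


# Each voter is only modestly better than a coin flip (60% accurate).
for n in (1, 3, 11, 101):
    print(n, round(majority_correct_probability(n, 0.6), 3))
```

With individually mediocre voters the majority verdict becomes steadily more reliable as the group grows. The same formula also shows the flip side: if individuals are right less than half the time, adding more of them makes the majority worse.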

We were surprised by several things in our research, however. First, the significance of the disparity between the performance of publications by Zooniverse and those by other labs was greater than expected. This plot represents the distribution of citation percentiles by year and data source for articles by both groups. Statistical tests confirm what your eyes already suspect—it ain’t even close.

Influence of Zooniverse Articles

We were also impressed by the networks that appear in Zooniverse projects, which allow users to confer with one another and direct expert attention toward particularly anomalous observations. In several instances this design has resulted in patterns of discovery, in which users flag rare data that go on to become the topic of new projects. This structural innovation indicates a difference not just of degree but of kind between so-called “big science” and crowdsourced e-research.

If you’re curious to learn more about our study of Zooniverse and the site’s implications for sociotechnical epistemology, check out our complete article.