Hi all, I am Coleman Krawczyk and for the past year I have been working on tools to help Zooniverse research teams work with their data exports. The current version of the code (v1.3.0) supports data aggregation for nearly all the project builder task types, and support will be added for the remaining task types in the coming months.
What does this code do?
This code provides tools that allow research teams to process and aggregate the classifications made on their project; in other words, it calculates the consensus answer for a given subject based on the volunteer classifications.
The code is written in python, but it can be run completely using three command line scripts (no python knowledge needed) and a project’s data exports.
The first script uses a project’s workflow data export to auto-configure which extractors and reducers (see below) should be run for each task in the workflow. It produces a series of `yaml` configuration files with reasonable default values selected.
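For illustration only, a generated configuration file for a question task might look roughly like the snippet below. The field names here (`workflow_id`, `extractor_config`, and so on) are hypothetical stand-ins, not the exact schema the script emits:

```yaml
# Hypothetical sketch of an auto-generated configuration file;
# the real files may use different field names and layout.
workflow_id: 1234
workflow_version: "25.12"
extractor_config:
  question_extractor:
    - task: T0
```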
Next the extraction script takes the classification data export and flattens it into a series of `csv` files, one for each unique task type, that only contain the data needed for the reduction process. Although the code tries its best to produce completely “flat” data tables, this is not always possible, so more complex tasks (e.g. drawing tasks) have structured data for some columns.
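As a rough sketch of the extraction idea (not the package’s actual code; the column names and nesting here are simplified stand-ins for the real export format), flattening a nested classification record into per-task rows might look like this:

```python
import json

# A raw classification export stores annotations as nested JSON;
# this structure is a simplified stand-in for the real export.
raw_annotations = json.loads(
    '[{"task": "T0", "value": "Yes"},'
    ' {"task": "T1", "value": [{"x": 10.2, "y": 34.7, "tool": 0}]}]'
)

def flatten(classification_id, subject_id, annotations):
    """Yield one flat row per task; complex values stay as JSON strings."""
    for ann in annotations:
        value = ann["value"]
        yield {
            "classification_id": classification_id,
            "subject_id": subject_id,
            "task": ann["task"],
            # drawing-style values cannot be fully flattened, so they are
            # kept as structured data in a single column
            "value": value if isinstance(value, str) else json.dumps(value),
        }

rows = list(flatten(1001, 555, raw_annotations))
```

The question answer lands in a plain text column, while the drawing marks stay as a structured JSON string, mirroring the partially “flat” tables described above.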
The final script takes the results of the data extraction and combines them into a single consensus result for each subject and each task (e.g. vote counts, clustered shapes, etc.). For more complex tasks (e.g. drawing tasks) the reducer’s configuration file accepts parameters that help tune the aggregation algorithms to best work with the data at hand.
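For a simple question task, the reduction step amounts to counting votes per subject. Here is a minimal, self-contained sketch of that idea, using toy data rather than the package’s real extract format:

```python
from collections import Counter

# Extracted question-task answers for one subject, one row per
# classification (toy data; real extracts come from the csv files
# described above).
extracted = [
    {"subject_id": 555, "task": "T0", "value": "Yes"},
    {"subject_id": 555, "task": "T0", "value": "Yes"},
    {"subject_id": 555, "task": "T0", "value": "No"},
]

def reduce_votes(rows):
    """Count votes and report the consensus answer for one subject/task."""
    counts = Counter(row["value"] for row in rows)
    answer, votes = counts.most_common(1)[0]
    return {
        "vote_counts": dict(counts),
        "consensus": answer,
        "agreement": votes / sum(counts.values()),
    }

result = reduce_votes(extracted)
```

With two “Yes” votes out of three, the consensus is “Yes” with an agreement fraction of two thirds.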
At the moment this code is provided in its “offline” form, but we are testing ways for this aggregation to be run “live” on a Zooniverse project. When that system is finished, a research team will be able to enter their configuration parameters directly in the project builder, a server will run the aggregation code, and the extracted or reduced `csv` files will be made available for download.
Occasionally we run studies in collaboration with external researchers in order to better understand our community and improve our platform. These can involve methods such as A/B splits, where we show a slightly different version of the site to one group of volunteers and measure how it affects their participation, e.g. does it influence how many classifications they make or their likelihood to return to the project for subsequent sessions?
One example of such a study was the messaging experiment we ran on Galaxy Zoo. We worked with researchers from Ben Gurion University and Microsoft research to test if the specific content and timing of messages presented in the classification interface could help alleviate the issue of volunteers disengaging from the project. You can read more about that experiment and its results in this Galaxy Zoo blog post https://blog.galaxyzoo.org/2018/07/12/galaxy-zoo-messaging-experiment-results/.
As the Zooniverse has teams based at different institutions in the UK and the USA, the procedures for ethics approval differ depending on who is leading the study. After recent discussions with staff at the University of Oxford ethics board, held to check that our procedure was up to date, our Oxford-based team will be changing the way in which we gain approval for, and report the completion of, these types of studies. All future study designs that involve Oxford staff taking part in the analysis will be submitted to CUREC, something we’ve been doing for the last few years. From now on, once the data-gathering stage of a study has been run, we will provide all volunteers involved with a debrief message.
The debrief will explain to our volunteers that they have been involved in a study, along with providing information about the exact set-up of the study and what the research goals were. The most significant change is that, before the data analysis is conducted, we will contact all volunteers involved in the study and allow a period of time for them to state that they would like to withdraw their consent to the use of their data. We will then remove all data associated with any volunteer who would not like to be involved before the data is analysed and the findings are presented. The debrief will also contain contact details for the researchers in the event of any concerns or complaints. You can see an example of such a debrief in our original post about the Galaxy Zoo messaging experiment here https://blog.galaxyzoo.org/2015/08/10/messaging-test/.
As always, our primary focus is the research being enabled by our volunteer community on our individual projects. We run experiments like these in order to better understand how to create a more efficient and productive platform that benefits both our volunteers and the researchers we support. All clicks that are made by our volunteers are used in the science outcomes from our projects no matter whether they are part of an A/B split experiment or not. We still strive never to waste any volunteer time or effort.
We thank you for all that you do, and for helping us learn how to build a better Zooniverse.
Part three in a multi-part series exploring the visual and UX changes to the Zooniverse classify interface
Today we’ll be going over a couple of visual changes to familiar elements of the classify interface and new additions we’re excited to premiere. These updates haven’t been implemented yet, so nothing is set in stone. Please use this survey to send me feedback about these or any of the other updates to the Zooniverse.
Many respondents to my 2017 design survey requested that they be able to use the keyboard to make classifications rather than having to click so many buttons. One volunteer actually called the classifier “a carpal-tunnel torturing device”. As a designer, that’s hard to hear – it’s never the goal to actively injure our volunteers.
We actually do support keyboard shortcuts! This survey helped us realize that we need to be better at sharing some of the tools our developers have built. The image above shows a newly designed Keyboard Shortcut information modal. This modal (or “popup”) is a great example of how the new modals we’re building will behave – you can leave it open and drag it around the interface while you work, so you’ll be able to quickly refer to it whenever you need to.
This behavior will be mirrored in a few of the modals that are currently available to you:
Add to Favorites
Add to Collection / Create a New Collection
It will also be applied to a few new ones, including…
Another major finding from the design survey was that users did not have a clear idea where to go when they needed help with a task (see chart below).
We know research teams often put a lot of effort into their help texts, and we wanted to be sure that work was reaching the largest possible audience. Hence, we moved the Field Guide from a small button on the right-hand side of the screen – a place that can become obscured by the browser’s scrollbar – and created a larger, more prominent button in the updated toolbar:
By placing the Field Guide button in a more prominent position and allowing the modal to stay open during classifications, we hope this tool will be taken advantage of more than it currently is.
The layout was based on the audit of every live project that I conducted in spring 2017:
Mode item count
Mode label word count
Min item count
Min label word count
Max item count
Max label word count
Using the mode gave me the basis on which to design; however, there’s quite a disparity between min and max amounts. Because of this disparity, we’ll be giving project owners with currently active projects a lot of warning before switching to the new layout, and they’ll have the option to continue to use the current Field Guide design if they’d prefer.
Another major resource Zooniverse offers its research teams and volunteers is the Tutorial. Often used to explain project goals, welcome new volunteers to the project, and point out what to look for in an image, the current tutorial is often a challenge because of its absolute positioning on top of the subject image.
In this iteration of the classify interface, the tutorial opens once as a modal, just as it does now, and then lives in a tab in the task area where it’s much more easily accessible. You’ll be able to switch to the Tutorial tab in order to compare the example images and information with the subject image you’re looking at, rather than opening and closing the tutorial box many times.
A brand-new statistics section
Another major comment from the survey was that volunteers wanted more ways to interact with the Zooniverse. Thus, you’ll be able to scroll down to find a brand-new section! Features we’re adding will include:
Your previous classifications with Add to Favorites or Add to Collection buttons
Interesting stats, like the number of classifications you’ve done and the number your community has done
Links to similar projects you might be interested in
Links to the project’s blog and social media to help you feel more connected to the research team
Links to the project’s Talk boards, for a similar purpose
Possibly: A way to indicate that you’re finished for the day, giving you the option to share your experience on social media or find another project you’re interested in.
The statistics we chose were directly related to the responses from the survey:
Respondents were able to choose more than one response; when asked to rank them in order of importance, project-wide statistics were chosen hands-down:
We also heard that volunteers sometimes felt disconnected from research teams and the project’s accomplishments:
“In general there is too less information about the achievement of completed projects. Even simple facts could cause a bit of a success-feeling… how many pictures in this project over all have been classified? How much time did it take? How many hours were invested by all participating citizens? Were there any surprising things for the scientists? Things like that could be reported long before the task of a project is completely fullfilled.”
Research teams often spend hours engaged in dialog with volunteers on Talk, but not everyone who volunteers on Zooniverse is aware of or active on Talk. Adding a module to the classify page showing recent Talk posts will bring more awareness to this amazing resource and hopefully encourage more engagement from volunteers.
Templates for different image sizes and dimensions
When the project builder was created, we couldn’t have predicted the variety of disparate topics that would become Zooniverse projects. Originally, the subject viewer was designed for one common image size, roughly 2×3, and other sizes have since been shoehorned in to fit as well as they can.
Now, we’d like to make it easier for subjects with extreme dimensions, multimedia subjects, and multi-image subjects to fit better within the project builder. By specifically designing templates and allowing project owners to choose the one that best fits their subjects, volunteers and project owners alike will have a better experience.
Very wide subjects will see their toolbar moved to the bottom of the image rather than on the right, to give the image as much horizontal space as possible. Tall subjects will be about the same width as they have been, but the task/tutorial box will stay fixed on the screen as the image scrolls, eliminating the need to scroll up and down as often when looking at the bottom of the subject.
Let’s get started!
I’m so excited for the opportunity to share a preview of these changes with you. Zooniverse is a collaborative project, so if there’s anything you’d like us to address as we implement this update, please use this survey to share your thoughts and suggestions. Since we’re rolling these out in pieces, it will be much easier for us to iterate, test, and make changes.
We estimate that the updates will be mostly in place by early 2019, so there’s plenty of time to make sure we’re creating the best possible experience for everyone.
Thank you so much for your patience and understanding as we move forward. In the future, we’ll be as open and transparent as possible about this process.
Part two in a multi-part series exploring the visual and UX changes to the Zooniverse classify interface
Today and in the next post, we’ll take a look at the reasoning behind specific changes to the classifier that we’ve already started to roll out over the past few months. We’ve had good discussions on Talk about many of the updates, but I wanted to reiterate those conversations here so there’s just one source of information to refer back to in the future.
In case you missed it, the first blog post in this series previews the complete new classify layout.
As a reminder, if you have feedback about these changes or anything else on the site you’d like to see addressed, please use this survey link.
We started with a rethinking of each project’s navigation bar. The new design features cleaner typography, a more prominent project title, and visual distinction from the sitewide navigation. It also includes the project’s home page background image, giving the project visual distinction while keeping the classify interface itself clean and legible. It’s also responsive: on smaller screen heights, the height of the navigation bar adjusts accordingly.
The most important problem this change solves is separating the project navigation from the site navigation. During my initial site research, and in talking to colleagues and volunteers, many found it difficult to distinguish between the two navigation bars. Adding a background and a distinct font style, and moving the options to the right side of the page, accomplishes this goal.
In conjunction with adding the background image to the navigation bar, the background image was removed from the main classify interface. It was replaced with a cool light grey, followed quickly by the dark grey of the Dark Theme.
Legibility is one of the main goals of any web designer, and it was the focus of this update. By moving to clean greys, all of the focus is now on the subject and task. There are some really striking subject images on Zooniverse, from images of the surface of Mars to zebras in their natural habitat. We want to make sure these images are front and center rather than getting lost within the background image.
The Dark Theme was a suggestion from a Zooniverse researcher – they pointed out that some subject images are similar in tone to the light grey, so a darker theme was added to make sure contrast would be enough to make the image “pop”. We love suggestions like this! While the team strives to be familiar with every Zooniverse project, the task is sometimes beyond us, so we rely on our researchers and volunteers to point out anomalies like this. If you find something like this, you can use this survey to bring it to my attention.
Another great suggestion from a Zooniverse volunteer was the addition of the project name on the left side of the screen. This hasn’t been implemented yet, but it’s a great way to help with wayfinding if the interface is scrolled to below the navigation bar.
Updated task section
By enclosing the task and its responses in a box rather than leaving it floating in space, the interface gives a volunteer an obvious place to look for the task across every project. Adjusting the typography elevates the interface and helps it feel more professional.
One of the most frequent comments we heard in the 2017 survey was that the interface had far too much scrolling – either the subject image or the task area was too tall. The subject image height will be addressed at a later date, but this new task area was designed specifically with scrolling in mind.
I used the averages I found in my initial project audit and the average screen height (643 px) based on Google Analytics data from the same time period to design a task area that would comfortably fit on screen without scrolling. It’s important to note that there are always outliers in large-scale sites like Zooniverse. While using averages is the best way to design for most projects, we know we can’t provide the most optimal experience for every use case.
You’ll also notice the secondary “Tutorial” tab to the right of the “Task” label. This is a feature that’s yet to be implemented, and I’ll talk more about it in the next post.
And more to come
The next installments in this series will address the additional updates we have planned, like updated modals and a whole new stats section.
Part one in a multi-part series exploring the visual and UX changes to the Zooniverse classify interface
First, an introduction.
Zooniverse began in 2007, with a galaxy-classifying project called Galaxy Zoo. The project was wildly successful, and one of the lead researchers, Chris Lintott, saw an opportunity to help other researchers accomplish similar goals. He assembled a team of developers and set to work building custom projects just like Galaxy Zoo for researchers around the world.
And things were good.
But the team started to wonder: How can we improve the process to empower researchers to build their own Zooniverse projects, rather than relying on the team’s limited resources to build their projects for them?
In the project builder’s first year, the number of projects available to citizen scientist volunteers nearly doubled. Popularity spread, the team grew, and things seemed to be going well.
That’s where I come in. * Record scratch *
Three years after the project builder’s debut, I was hired as the Zooniverse designer. With eight years’ experience in a variety of design roles from newspaper page design to user experience for mobile apps to web design, I approached the new project builder-built projects with fresh eyes, taking a hard look at what was working and what areas could be improved.
Over the next week, I’ll be breaking down my findings and observations, and talking through the design changes we’re making, shedding more light on the aims and intentions behind these changes and how they will affect your experience on the Zooniverse platform.
If you take one thing away from this series, let it be that this design update, in keeping with the ethos of Zooniverse, is an iterative, collaborative process. These posts represent where we are now, in June 2018, but the final product, after testing and hearing your input, may be different. We’re learning as we go, and your input is hugely beneficial as we move forward.
Here’s a link to an open survey in case you’d like to share thoughts, experiences, or opinions at any point.
Let’s dive in.
Part one: Research
My first few weeks on the job were spent exploring Zooniverse, learning about the amazing world of citizen science, and examining projects with similar task types from across the internet.
I did a large-scale analysis of the site in general, going through every page in each section and identifying areas with inconsistent visual styles or confusing user experiences.
After my initial site analysis, I created a list of potential pages or sections that were good candidates for a redesign. The classify interface stood out as the best place to start, so I got to work.
Visual design research
First, I identified areas of the interface that could use visual updates. My main concerns were legibility, accessibility, and varying screen sizes. With an audience reaching into the tens of thousands per week, the demographic diversity makes for an interesting design challenge.
Next, I conducted a comprehensive audit of every project that existed on the Zooniverse in March 2017 (79 in total, including custom projects like Galaxy Zoo), counting question/task word count, the max number of answers, subject image dimensions, field guide content, and a host of other data points. That way, I could accurately design for the medians rather than choosing arbitrarily. When working on this scale, it’s important to use data like these to ensure that the largest possible group is well designed for.
Here are some selected data:
Task type: Drawing
Average number of possible answers
Answer average max word count
Answer max max word count
Answer min max word count
Answer median max word count
Number with thumbnail images
Task type: Question
Average number of possible answers
Answer average max word count
Answer max max word count
Answer min max word count
Answer median max word count
Number with thumbnail images
Task type: Survey
Average number of possible answers
Answer average max word count
Answer max max word count
Answer min max word count
Answer median max word count
Number with thumbnail images
Even More Research
Next, I focused on usability. To ensure that I understood issues from as many perspectives as possible, I sent a design survey to our beta testers mailing list, comprising about 100,000 volunteers (if you’re not already on the list, you can opt in via your Zooniverse email settings). Almost 1,200 people responded, and those responses informed the decisions I made and helped prioritize areas of improvement.
Here are the major findings from that survey:
No consensus on where to go when you’re not sure how to complete a task.
Many different destinations after finishing a task.
Too much scrolling and mouse movement.
Lack of keyboard shortcuts.
Would like the ability to view previous classifications.
Translations to more languages.
Need for feedback when doing classifications.
Finding new projects that might also be interesting.
In the next few blog posts, I’ll be breaking down specific features of the update and showing how these survey findings help inform the creation of many of the new features.
Without further ado
Some of these updates will look familiar, as we’ve already started to implement style and layout adjustments. I’ll go into more detail in subsequent posts, but at a high level, these changes seek to improve your overall experience classifying on the site no matter where you are, what browser you’re using, or what type of project you’re working on.
Visually, the site is cleaner and more professional, a reflection of Zooniverse’s standing in the citizen science community and of the real scientific research that’s being done. Studies have shown that good, thoughtful design influences a visitor’s perceptions of a website or product, sometimes obviously, sometimes at a subliminal level. By making thoughtful choices in the design of our site, we can seek to positively affect audience perceptions about Zooniverse, giving volunteers and researchers even more of a reason to feel proud of the projects they’re passionate about.
It’s important to note that this image is a reflection of our current thought, in June 2018, but as we continue to test and get feedback on the updates, the final design may change. One benefit to rolling updates out in pieces is the ability to quickly iterate ideas until the best solution is found.
We estimate that the updates will be mostly in place by early 2019.
This is due in part to the size of our team. At most, there are about three people working on these updates while also maintaining our commitments to other grant-funded projects and additional internal projects. The simple truth is that we just don’t have the resources to be able to devote anyone full-time to this update.
The timeline is also influenced in a large part by the other half of this update: A complete overhaul of the infrastructure of the classifier. These changes aren’t as visible, but you’ll notice an improvement in speed and functionality that is just as important as the “facelift” portion of the update.
We’ve seen your feedback on Talk, via email, and on Github, and we’re happy to keep a dialog going about subsequent updates. To streamline everything and make sure your comments don’t get missed, please only use this survey link to post thoughts moving forward.
Our inaugural Chicago-area meetup was great fun! Zooniverse volunteers came to the Adler Planetarium, home base for our Chicago team members, to meet some of the Adler Zooniverse web development team and talk to Chicago-area researchers about their Zooniverse projects.
Zooniverse Highlights and Thank You! (Laura Trouille, co-I for Zooniverse and Senior Director for Citizen Science at the Adler Planetarium)
In-Person Zooniverse Volunteer Opportunities at the Adler Planetarium (Becky Rother, Zooniverse designer)
Researchers spoke briefly about their projects and how they use the data and ideas generated by our amazing Zooniverse volunteers in their work. Emily spoke of her efforts addressing gender bias in Wikipedia. We then took questions from the audience and folks chatted in small groups afterwards.
Zooniverse (Laura Trouille)
Chicago Wildlife Watch (Liza Lehrer, Assistant Director, Urban Wildlife Institute, Lincoln Park Zoo)
Gravity Spy (Sarah Allen, Zooniverse developer, supporting the Northwestern University LIGO team)
Microplants (Matt Von Konrat, Head of Botanical Collections, Field Museum)
Steelpan Vibrations (Andrew Morrison, Physics Professor, Joliet Junior College)
Wikipedia Gender Bias (Emily Temple Wood, medical student, Wikipedia Editor, Zooniverse volunteer)
The event coincided with Adler Planetarium’s biennial Member’s Night, so Zooniverse volunteers were able to take advantage of the museum’s “Spooky Space” themed activities at the same time, which included exploring the Adler’s spookiest collection pieces, making your own spooky space music, and other fun. A few of the Zooniverse project leads also led activities: playing Andrew’s steel pan drum, interacting with the Chicago Wildlife Watch’s camera traps and other materials, and engaging guests in classifying across the many Zooniverse projects. There was also a scavenger hunt that led Zooniverse members and Adler guests through the museum, playing on themes within the exhibit spaces relating to projects within the Zooniverse mobile app (iOS and Android).
We really enjoyed meeting our volunteers and seeing the conversation flow between volunteers and researchers. We feel so lucky to be part of this community and supporting the efforts of such passionate, interesting people who are trying to do good in the world. Thank you!
Have you hosted a Zooniverse meetup in your town? Would you like to? Let us know!
The following post is by Dr Brooke Simmons, who has been leading the Zooniverse efforts to help in the aftermath of the recent Caribbean storms.
This year has seen a particularly devastating storm season. As Hurricane Irma was picking up steam and moving towards the Caribbean, we spoke to our disaster relief partners at Rescue Global and in the Machine Learning Research Group at Oxford and decided to activate the Planetary Response Network. We had previously worked with the same partners for our responses to the Nepal and Ecuador earthquakes in 2015 and 2016, and this time Rescue Global had many of the same needs: maps of expected and observed damage, and identifications of temporary settlements where displaced people might be sheltering.
The Planetary Response Network is a partnership among many people and organizations, drawing on many sources of data; the Zooniverse volunteers are at its heart. The first cloud-free data available following the storm was of Guadeloupe, and our community examined pre-storm and post-storm images, marking building damage, flooding, impassable roads and signs of temporary structures. The response to our newsletter was so strong that the first set of data was classified in just 2 hours! And as more imaging has become available, we’ve processed it and released it on the project. By the time Hurricane Maria arrived in the Caribbean, Zooniverse volunteers had classified 9 different image sets from all over the Caribbean, including Turks and Caicos, the Virgin Islands (US and British), and Antigua & Barbuda. That’s about 1.5 years’ worth of effort, if it were one person searching through these images as a full-time job. Even with a team of satellite experts, it would take much longer to analyze what the Zooniverse volunteers collectively did in just days. And there’s still more imaging: the storms aren’t over yet.
We’ve been checking in every day with Rescue Global and our Machine Learning collaborators to get feedback on how our classifications are being used and to refresh the priority list for the next set of image targets. As an example of one of those adjustments, yesterday we paused the Antigua & Barbuda dataset in order to get a rapid estimate of building density in Puerto Rico from images taken just before Irma and Maria’s arrival. We needed those because, while the algorithms used to produce the expected damage maps do incorporate external data like Census population counts and building information from OpenStreetMaps, some of that data can be incomplete or out of date (like the Census, which is an excellent resource but which is many years old now). Our volunteers collectively provided an urgently needed, uniformly-assessed and up-to-date estimate across the whole island in a matter of hours — and that data is now being used to make expected damage maps that will be delivered to Rescue Global before the post-Maria clouds have fully cleared.
Even though the project is still ongoing and we don’t have full results yet, I wanted to share some early results of the full process and the feedback we’ve been getting from responders on the ground. One of our earliest priorities was St. Thomas in the USVI, because we anticipated it would be damaged but other crowdsourcing efforts weren’t yet covering that area. From your classifications we made a raw map of damage markings. Here’s structural damage:
The gray stripe was an area of clouds and some artifacts. You can get an idea from this of where there is significant damage, but it’s raw and still needs further processing. For example, in the above map, damage marked as “catastrophic” is more opaque so will look redder, but more individual markings of damage in the same place will also stack to look redder, so it’s hard to tell the difference in this visualization between 1 building out of 100 that’s destroyed and 100 buildings that all have less severe damage. The areas that had clouds and artifacts also weren’t completely unclassifiable, so there are still some markings in there that we can use to estimate what damage might be lurking under the clouds. Our Machine Learning partners incorporate these classifications and the building counts provided by our project as well as by OpenStreetMaps into a code that produces a “heat map” of structural damage that helps responders understand the probability and proportion of damage in a given area as well as how bad the damage is:
In the heat map, the green areas are where some damage was marked, but at a low level compared to how many buildings are in the area. In the red areas, over 60% of the buildings present were marked as damaged. (Pink areas are intermediate between these.)
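To make the colour-coding concrete, here is a small sketch of how an area’s damage fraction could map to the heat-map categories described above. The 60% cut-off for red comes from the text; the boundary between green and pink (20% here) is an assumed value for illustration only:

```python
def heat_map_colour(buildings_marked_damaged, buildings_total,
                    green_threshold=0.2):
    """Classify an area for the damage heat map.

    The 60% cut-off for red is described in the text; the green/pink
    boundary (20% by default) is an assumed value for illustration.
    """
    fraction = buildings_marked_damaged / buildings_total
    if fraction > 0.6:
        return "red"    # over 60% of buildings marked damaged
    if fraction <= green_threshold:
        return "green"  # some damage, but low relative to building count
    return "pink"       # intermediate between green and red
```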
With volunteer classifications as inputs, we were able to deliver maps like this (and similar versions for flooding, road blockage, and temporary shelters) for every island we classified. We also incorporated other efforts like those of Tomnod to map additional islands, so that we could keep our focus on areas that hadn’t yet been covered while still providing as much accurate information to responders as possible.
Feedback from the ground has been excellent. Rescue Global has been using the maps to help inform their resource allocation, ranging from where to deliver aid packages to where to fly aerial reconnaissance missions (fuel for flights is a precious commodity, so it’s critical to know in advance which areas most need the extra follow-up). They have also shared the heat maps with other organizations providing response and aid in the area, so Zooniverse volunteers’ classifications are having an extended positive effect on efforts in the whole region. And there has been some specific feedback, too. This message came several days ago from Rebekah Yore at Rescue Global:
In addition to supplying an NGO with satellite communications on St Thomas island, the team also evacuated a small number of patients with critical healthcare needs (including a pregnant lady) to San Juan. Both missions were aided by the heat maps.
To me, this illustrates what we can all do together. Everyone has different roles to play here, from those who have a few minutes a day to contribute to those spending hours clicking and analyzing data, and certainly including those spending hours huddled over a laptop in a temporary base camp answering our emailed questions about project design and priorities while the rescue and response effort goes on around them. Without all of them, none of this would be possible.
We’re still going, now processing images taken following Hurricane Maria. But we know it’s important that our community be able to share the feedback we’ve been receiving, so even though we aren’t finished yet, we still wanted to show you this and say: thank you.
Now that the project’s active response phase has completed, we have written a further description of how the maps our volunteers helped generate were used on the project’s Results page. Additionally, every registered volunteer who contributed at least 1 classification to the project during its active phase is credited on our Team page. Together we contributed nearly 3 years’ worth of full-time effort to the response, in only 3 weeks.
The Planetary Response Network has been nurtured and developed by many partners and is enabled by the availability of pre- and post-event imagery. We would like to acknowledge them:
Firstly, our brilliant volunteers. To date on this project we have had contributions from about 10,000 unique IP addresses, of which about half are from registered Zooniverse accounts.
Planet has graciously provided images to the PRN in each of our projects. (Planet Team 2017 Planet Application Program Interface: In Space For Life on Earth. San Francisco, CA. https://api.planet.com, License: CC-BY-SA)
DigitalGlobe provides high-resolution imagery as part of their Open Data Program (Creative Commons Attribution Non Commercial 4.0).
Thanks to the USGS for making Landsat 8 images publicly available.
Thanks to ESA for making Sentinel-2 images publicly available.
Below is the first in a series of guest blog posts from researchers working on one of our recently launched biomedical projects, Etch A Cell.
Read on to let Dr Martin Jones tell you about the work they’re doing to further understanding of the universe inside our cells!
Having trained as a physicist, with many friends working in astronomy, I’ve been aware of Galaxy Zoo and the Zooniverse from the very early days. My early research career was in quantum mechanics, unfortunately not an area where people’s intuitions are much use! However, since I found myself working in biology labs, now at the Francis Crick Institute in London, I have been working on various aspects of microscopy – a much more visual enterprise and one where human analysis is still the gold standard. This is particularly true in electron microscopy, where the busy nature of the images means that many regions inside a cell look very similar. In order to make sense of the images, a person is able to assimilate a whole range of extra context and previous knowledge in a way that computers, for the most part, are simply unable to do. This makes it a slow and labour-intensive process. As if this wasn’t already a hard enough problem, in recent years it has been compounded by new technologies that mean the microscopes now capture images around 100 times faster than before.
Focused ion beam scanning electron microscope
Ten years ago it was more or less possible to manually analyse the images at the same rate as they were acquired, keeping the in-tray and out-tray nicely balanced. Now, however, that’s not the case. To illustrate that, here’s an example of a slice through a group of cancer cells, known as HeLa cells:
We capture an image like this and then remove a very thin layer – sometimes as thin as 5 nanometres (one nanometre is a billionth of a metre) – and then repeat… a lot! Building up enormous stacks of these images can help us understand the 3D nature of the cells and the structures inside them. For a sense of scale, this whole image is about the width of a human hair, around 80 millionths of a metre.
Zooming in to one of the cells, you can see many different structures, all of which are of interest to study in biomedical research. For this project, however, we’re just focusing on the nucleus for now. This is the large mostly empty region in the middle, where the DNA – the instruction set for building the whole body – is contained.
By manually drawing lines around the nucleus on each slice, we can build up a 3D model that allows us to make comparisons between cells – for example, understanding whether a treatment for a disease is able to stop its progression by disrupting the cell’s ability to pass on its genetic information.
Animated gif of 3D model of a nucleus
However, images are now being generated so rapidly that the in-tray is filling too quickly for the standard “single expert” method – one sample can produce up to a terabyte of data, made up of more than a thousand 64 megapixel images captured overnight. We need new tricks!
Why citizen science?
With all of the advances in software that are becoming available you might think that automating image analysis of this kind would be quite straightforward for a computer. After all, people can do it relatively easily. Even pigeons can be trained in certain image analysis tasks! (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141357). However, there is a long history of underestimating just how hard it is to automate image analysis with a computer. Back in the very early days of artificial intelligence in 1966 at MIT, Marvin Minsky (who also invented the confocal microscope) and his colleague Seymour Papert set the “summer vision project”, which they saw as a simple problem to keep their undergraduate students busy over the holidays. Many decades later we’ve discovered it’s not that easy!
Our project, Etch A Cell, is designed to allow citizen scientists to draw segmentations directly onto our images in the Zooniverse web interface. The first task we have set is to mark the nuclear envelope that separates the nucleus from the rest of the cell – a vital structure where defects can cause serious problems. These segmentations are extremely useful in their own right for helping us understand the structures, but citizen science offers something beyond the already lofty goal of matching the output of an expert. By allowing several people to annotate each image, we can see how the lines vary from user to user. This variability gives insight into the certainty that a given pixel or region belongs to a particular object, information that simply isn’t available from a single line drawn by one person. Disagreement between experts is not unheard of, unfortunately!
The images below show preliminary results with the expert analysis on the left and a combination of 5 citizen scientists’ segmentations on the right.
Example of expert vs. citizen scientist annotation
In fact, we can go even further to maximise the value of our citizen scientists’ work. The field of machine learning, in particular deep learning, has burst onto the scene in several sectors in recent years, revolutionising many computational tasks. This new generation of image analysis techniques is much more closely aligned with how animal vision works. The catch, however, is that the “learning” part of machine learning often requires enormous amounts of time and resources (remember you’ve had a lifetime to train your brain!). To train such a system, you need a huge supply of so-called “ground truth” data, i.e. something that an expert has pre-analysed and can provide the correct answer against which the computer’s attempts are compared. Picture it as the kind of supervised learning that you did at school: perhaps working through several old exam papers in preparation for your finals. If the computer is wrong, you tweak the setup a bit and try again. By presenting thousands or even millions of images and ensuring your computer makes the same decision as the expert, you can become increasingly confident that it will make the correct decision when it sees a new piece of data. Using the power of citizen science will allow us to collect the huge amounts of data that we need to train these deep learning systems, something that would be impossible by virtually any other means.
We are now busily capturing images that we plan to upload to Etch A Cell to allow us to analyse data from a range of experiments. Differences in cell type, sub-cellular organelle, microscope, sample preparation and other factors mean the images can look different across experiments, so analysing cells from a range of different conditions will allow us to build an atlas of information about sub-cellular structure. The results from Etch A Cell will mean that whenever new data arrives, we can quickly extract information that will help us work towards treatments and cures for many different diseases.
Breaking news… Zooniverse volunteers on Exoplanet Explorers have discovered a new 4-planet system!
Congratulations to all* who directly classified the light curves for this system, bringing it to the attention of the research team. And an enormous *thank you* to the 14,000+ volunteers who provided over 2 million classifications in just three days to make this discovery possible. This is equivalent to 3.4 years of full-time effort. I *heart* people-powered research! It’s also amazing how quickly we were able to get these data to the eyes of the public — the Kepler Space Telescope observed this star between December 15, 2016 and March 4, 2017. Data arrived on Earth on March 7th and Zooniverse volunteers classified it April 3-5, 2017. I *heart* Zooniverse.
ExoplanetExplorers.org was the featured project for our inaugural ABC Australia Stargazing Live 3-day, prime-time TV event, which just ended yesterday and through which this discovery was made. Over the years, we’ve partnered with the BBC as part of their Stargazing Live event in the UK. On night 1, Chris Lintott, our intrepid leader, invites the million+ viewers to participate in that year’s featured Zooniverse project, on night 2 he highlights interesting potential results coming through the pipeline, and on night 3, if science nods in our favor, he has the pleasure of announcing exciting discoveries you all, our volunteers, have made (for example, last year’s pulsar discovery and the supernova discovery from a couple years back).
This year we partnered with both the UK’s BBC and Australia’s ABC TV networks to run two Stargazing Live series in two weeks. We’re exhausted and exhilarated from the experience! We can imagine you all are as well (hats off to one of our volunteers who provided over 15,000 classifications in the first two days)!
Stargazing Live epitomizes many of our favorite aspects of being a member of the Zooniverse team – it’s a huge rush, filled with the highs and lows of keeping a site up when thousands of people are suddenly providing ~7000 classifications a minute at peak. We’re so proud of our web development team and their amazing effort: their smart solutions, quick thinking, and teamwork. The best part is that we collectively get to experience the joy, wonder, and discovery of the process of science right alongside the researchers. Each year the research teams leading each project have what is likely among the most inspiring (and intense) 3 days of their careers, carrying out the detective work of following up each potential discovery at breakneck speed.
Brad Tucker and his team leading PlanetNineSearch.org, featured in the BBC Stargazing Live event this year, checked and rechecked dozens of Planet 9 candidates’ orbital parameters against known-object catalogs, making sure no stone was left unturned. We were bolstered throughout by re-discoveries of known objects, including many known asteroids and Chiron, a minor planet in the outer Solar System orbiting the Sun between Saturn and Uranus.
Even though Planet 9 hasn’t been discovered yet, it’s huge progress for that field of research to have completed a thorough search through this SkyMapper dataset, which allows us to probe out to certain distances and sizes of objects across a huge swath of the sky. Stay tuned for progress at planetninesearch.org and through the related BackyardWorlds.org project, searching a different parameter space for Planet 9 in WISE data.
Also, and very importantly, the BBC Stargazing Live shows gave the world an essential new member of the Twitterverse:
The Exoplanet Explorers team, led by Ian Crossfield, Jessie Christiansen, Geert Barentsen, Tom Barclay, and others, was also up through much of each night of the event this week, churning through the results. Because the Kepler Space Telescope K2 dataset is so rich, there were dozens of potential candidates to triple-check in just 3 days. Not only did our volunteers discover the 4-planet system shown above, but also 90 new candidate exoplanets! That’s truly an amazing start to a project.
Once you all, our amazing community, have classified all the images in this project and the related PlanetHunters.org, the researchers will be able to measure the occurrence rates of different types of planets orbiting different types of stars. They’ll use this information to answer questions like — Are small planets (like Venus) more common than big ones (like Saturn)? Are short-period planets (like Mercury) more common than those on long orbits (like Mars)? Do planets more commonly occur around stars like the Sun, or around the more numerous, cooler, smaller “red dwarfs”?
There’s also so much to learn about the 4-planet system itself. It’s particularly interesting because it’s such a compact system (all orbits are well within Mercury’s distance to our Sun) of potentially rocky planets. If these characteristics hold true, we expect they will put planet formation theories to the test.
A fun part of our effort for the show was to create visualizations for this newly discovered system. Simone, one of our developers, used http://codepen.io/anon/pen/RpOYRw to create the simulation shown above. We welcome all to try their hand using this tool or others to create their favorite visualization of the system. Do post your effort in the comments below. To set you on the right path, here are our best estimates for the system so far:
The star is in the constellation of Aquarius (see if you can get the WWT), with ra, dec = 23:15:47.77, -10:50:58.91.
Host star (V=12): 0.8 Rsol, 0.9 Msol. Late G or early K.
We predict there may be more planets further out, in resonances similar to the inner planets’. The predicted periods for the outer planets are 20d, 30.7d, 47d, etc. (assuming Per_x = 3.56 * 1.538^x). Planet number 11 would be ~264d, planet 12 ~405d.
There are 73 other previously discovered exoplanet systems with 4 or more planets known.
In 2372 years, on July 9, 4388AD, all four planets will transit at the same time.
If you’re standing on planet e, the nearest planet would appear bigger than the full Moon in the sky; the apparent sizes of the other planets as seen from e would be 10, 16, and 32 arcmin.
If you’re on planet e, the star barely appears to rotate: you see the same side of it for many “years,” because the star rotates just as quickly as planet “e” goes around it.
This post wouldn’t be complete without a thank you to Edward Gomez for following up candidates with the Los Cumbres Observatory Robotic Telescope Network. Not only is LCO a great research tool, but it provides amazing access to telescopes and quality curricular materials for students around the world.
*And a special thanks to the following volunteers who correctly identified at least one of the planets in the newly discovered 4-planet system:
Alan Patricio Zetina Floresmarhx
Anastasios D. Papanastasiou
Below is a guest post from a researcher who has been studying the Zooniverse and who just published a paper called ‘Crowdsourced Science: Sociotechnical epistemology in the e-research paradigm’. That being a bit of a mouthful, I asked him to introduce himself and explain – Chris.
My name is David Watson and I’m a data scientist at Queen Mary University of London’s Centre for Translational Bioinformatics. As an MSc student at the Oxford Internet Institute back in 2015, I wrote my thesis on crowdsourcing in the natural sciences. I got in touch with several members of the Zooniverse team, who were kind enough to answer all my questions (I had quite a lot!) and even provide me with an invaluable dataset of aggregated transaction logs from 2014. Combining this information with publication data from a variety of sources, I examined the impact of crowdsourcing on knowledge production across the sciences.
Last week, the philosophy journal Synthese published a (significantly) revised version of my thesis, co-authored by my advisor Prof. Luciano Floridi. We found that Zooniverse projects not only processed far more observations than comparable studies conducted via more traditional methods—about an order of magnitude more data per study on average—but that the resultant papers vastly outperformed others by researchers using conventional means. Employing the formal tools of Bayesian confirmation theory along with statistical evidence from and about Zooniverse, we concluded that crowdsourced science is more reliable, scalable, and connective than alternative methods when certain common criteria are met.
We were surprised by several things in our research, however. First, the significance of the disparity between the performance of publications by Zooniverse and those by other labs was greater than expected. This plot represents the distribution of citation percentiles by year and data source for articles by both groups. Statistical tests confirm what your eyes already suspect—it ain’t even close.
We were also impressed by the networks that appear in Zooniverse projects, which allow users to confer with one another and direct expert attention toward particularly anomalous observations. In several instances this design has resulted in patterns of discovery, in which users flag rare data that go on to become the topic of new projects. This structural innovation indicates a difference not just of degree but of kind between so-called “big science” and crowdsourced e-research.
If you’re curious to learn more about our study of Zooniverse and the site’s implications for sociotechnical epistemology, check out our complete article.