Category Archives: Statistics

Zooniverse Data Aggregation

Hi all, I am Coleman Krawczyk and for the past year I have been working on tools to help Zooniverse research teams work with their data exports.  The current version of the code (v1.3.0) supports data aggregation for nearly all the project builder task types, and support will be added for the remaining task types in the coming months.

What does this code do?

This code provides tools to allow research teams to process and aggregate classifications made on their project, or in other words, this code calculates the consensus answer for a given subject based on the volunteer classifications.  

The code is written in python, but it can be run completely using three command line scripts (no python knowledge needed) and a project’s data exports.

Configuration

The first script uses a project’s workflow data export to auto-configure which extractors and reducers (see below) should be run for each task in the workflow.  This produces a series of `yaml` configuration files with reasonable default values selected.

Extraction

Next the extraction script takes the classification data export and flattens it into a series of `csv` files, one for each unique task type, that only contain the data needed for the reduction process.  Although the code tries its best to produce completely “flat” data tables, this is not always possible, so more complex tasks (e.g. drawing tasks) have structured data for some columns.
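To make the idea of “flattening” concrete, here is a minimal sketch in plain Python. It is not the package’s actual extractor code: the function name `flatten_classifications` and the field names (`classification_id`, `subject_id`, `annotations`, `task`, `value`) are assumptions chosen for illustration, and for simplicity the rows are grouped by task identifier rather than by task type.

```python
import json
from collections import defaultdict


def flatten_classifications(rows):
    """Split nested classification records into flat rows grouped by task.

    `rows` is an iterable of dicts with the hypothetical keys
    `classification_id`, `subject_id`, and `annotations` (a JSON string
    holding a list of {"task": ..., "value": ...} entries).
    """
    flat = defaultdict(list)
    for row in rows:
        for annotation in json.loads(row["annotations"]):
            value = annotation["value"]
            is_simple = isinstance(value, (str, int, float))
            flat[annotation["task"]].append({
                "classification_id": row["classification_id"],
                "subject_id": row["subject_id"],
                # Simple answers stay flat; structured answers (e.g. drawn
                # shapes) are kept as a JSON string in a single column.
                "value": value if is_simple else json.dumps(value),
            })
    return dict(flat)


# Made-up example records, not real project data.
example_rows = [
    {"classification_id": 1, "subject_id": 10,
     "annotations": '[{"task": "T0", "value": "Yes"}]'},
    {"classification_id": 2, "subject_id": 10,
     "annotations": '[{"task": "T0", "value": "No"}, '
                    '{"task": "T1", "value": [{"x": 5, "y": 7}]}]'},
]
extracted = flatten_classifications(example_rows)
```

Each list in `extracted` could then be written out as its own `csv` file. The drawing-style answer in task `T1` shows why some columns end up holding structured data.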

Reduction

The final script takes the results of the data extraction and combines them into a single consensus result for each subject and each task (e.g. vote counts, clustered shapes, etc…).  For more complex tasks (e.g. drawing tasks) the reducer’s configuration file accepts parameters to help tune the aggregation algorithms to best work with the data at hand.
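For the simplest case, a question task, the reduction step amounts to counting votes. The sketch below assumes extracted rows with hypothetical `subject_id`, `task`, and `value` fields; the function name `reduce_votes` and the `agreement` figure are illustrative additions, not part of the aggregation code itself.

```python
from collections import Counter, defaultdict


def reduce_votes(extracts):
    """Count votes per (subject, task) and pick the most common answer."""
    counts = defaultdict(Counter)
    for row in extracts:
        counts[(row["subject_id"], row["task"])][row["value"]] += 1

    reduced = {}
    for key, votes in counts.items():
        # On a tie, most_common keeps the answer that was seen first.
        answer, n_votes = votes.most_common(1)[0]
        reduced[key] = {
            "votes": dict(votes),
            "consensus": answer,
            "agreement": n_votes / sum(votes.values()),
        }
    return reduced


# Made-up extracted rows, not real project data.
extracts = [
    {"subject_id": 10, "task": "T0", "value": "Yes"},
    {"subject_id": 10, "task": "T0", "value": "Yes"},
    {"subject_id": 10, "task": "T0", "value": "No"},
    {"subject_id": 11, "task": "T0", "value": "No"},
]
consensus = reduce_votes(extracts)
```

Drawing tasks need something more involved than a vote count (for example, clustering the drawn shapes), which is where the tunable reducer parameters come in.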

A full example using these scripts can be found in the documentation.

Future for this code

At the moment this code is provided in its “offline” form, but we are testing ways for this aggregation to be run “live” on a Zooniverse project.  When that system is finished a research team will be able to enter their configuration parameters directly in the project builder, a server will run the aggregation code, and the extracted or reduced `csv` files will be made available for download.

What’s going on with the classify interface? Part three

Part three in a multi-part series exploring the visual and UX changes to the Zooniverse classify interface

Coming soon!

Today we’ll be going over a couple of visual changes to familiar elements of the classify interface and new additions we’re excited to premiere. These updates haven’t been implemented yet, so nothing is set in stone. Please use this survey to send me feedback about these or any of the other updates to the Zooniverse.

Keyboard shortcut modal

New modals

Many respondents to my 2017 design survey requested that they be able to use the keyboard to make classifications rather than having to click so many buttons. One volunteer actually called the classifier “a carpal-tunnel torturing device”. As a designer, that’s hard to hear – it’s never the goal to actively injure our volunteers.

We actually do support keyboard shortcuts! This survey helped us realize that we need to be better at sharing some of the tools our developers have built. The image above shows a newly designed Keyboard Shortcut information modal. This modal (or “popup”) is a great example of a few of the modals we’re building – you can leave it open and drag it around the interface while you work, so you’ll be able to quickly refer to it whenever you need.

This behavior will be mirrored in a few of the modals that are currently available to you:

  • Add to Favorites
  • Add to Collection / Create a New Collection
  • Subject Metadata
  • “Need Help?”

It will also be applied to a few new ones, including…

Field Guide

New field guide layout

Another major finding from the design survey was that users did not have a clear idea where to go when they needed help with a task (see chart below).

Survey results show a mix of responses

We know research teams often put a lot of effort into their help texts, and we wanted to be sure that work was reaching the largest possible audience. Hence, we moved the Field Guide from a small button on the right-hand side of the screen – a place that can become obscured by the browser’s scrollbar – and created a larger, more prominent button in the updated toolbar:

By placing the Field Guide button in a more prominent position and allowing the modal to stay open during classifications, we hope this tool will be taken advantage of more than it currently is.

The layout was the result of the audit of every live project I conducted in spring 2017:

Field Guide
Statistic | Item count | Label word count
Mode | 5 | 2
Min | 2 | 2
Max | 45 | 765

Using the mode gave me the basis on which to design; however, there’s quite a disparity between min and max amounts. Because of this disparity, we’ll be giving project owners with currently active projects a lot of warning before switching to the new layout, and they’ll have the option to continue to use the current Field Guide design if they’d prefer.

Tutorial

Another major resource Zooniverse offers its research teams and volunteers is the Tutorial. Often used to explain project goals, welcome new volunteers to the project, and point out what to look for in an image, the current tutorial is often a challenge because of its absolute positioning on top of the subject image.

No more!

In this iteration of the classify interface, the tutorial opens once as a modal, just as it does now, and then lives in a tab in the task area where it’s much more easily accessible. You’ll be able to switch to the Tutorial tab in order to compare the example images and information with the subject image you’re looking at, rather than opening and closing the tutorial box many times.

A brand-new statistics section

Another major comment from the survey was that volunteers wanted more ways to interact with the Zooniverse. Thus, you’ll be able to scroll down to find a brand-new section! Features we’re adding will include:

  • Your previous classifications with Add to Favorites or Add to Collection buttons
  • Interesting stats, like the number of classifications you’ve done and the number of classifications your community has done
  • Links to similar projects you might be interested in
  • Links to the project’s blog and social media to help you feel more connected to the research team
  • Links to the project’s Talk boards, for a similar purpose
  • Possibly: A way to indicate that you’re finished for the day, giving you the option to share your experience on social media or find another project you’re interested in.

The statistics we chose were directly related to the responses from the survey:

Survey results

Respondents were able to choose more than one response; when asked to rank them in order of importance, project-wide statistics were chosen hands-down:

Project-wide statistics are the most important

We also heard that volunteers sometimes felt disconnected from research teams and the project’s accomplishments:

“In general there is too less information about the achievement of completed projects. Even simple facts could cause a bit of a success-feeling… how many pictures in this project over all have been classified? How much time did it take? How many hours were invested by all participating citizens? Were there any surprising things for the scientists? Things like that could be reported long before the task of a project is completely fullfilled.”

Research teams often spend hours engaged in dialog with volunteers on Talk, but not everyone who volunteers on Zooniverse is aware of, or active on, Talk. Adding a module on the classify page showing recent Talk posts will bring more awareness to this amazing resource and hopefully encourage more engagement from volunteers.

Templates for different image sizes and dimensions

When the project builder was created, we couldn’t have predicted the variety of disparate topics that would become Zooniverse projects. Originally, the subject viewer was designed for one common image size, roughly 2×3, and other sizes have since been shoehorned in to fit as well as they can.

Now, we’d like to make it easier for subjects with extreme dimensions, multimedia subjects, and multi-image subjects to fit better within the project builder. By specifically designing templates and allowing project owners to choose the one that best fits their subjects, volunteers and project owners alike will have a better experience.

Very wide subjects will see their toolbar moved to the bottom of the image rather than on the right, to give the image as much horizontal space as possible. Tall subjects will be about the same width as they have been, but the task/tutorial box will stay fixed on the screen as the image scrolls, eliminating the need to scroll up and down as often when looking at the bottom of the subject.

Wide and tall subjects

Let’s get started!

I’m so excited for the opportunity to share a preview of these changes with you. Zooniverse is a collaborative project, so if there’s anything you’d like us to address as we implement this update, please use this survey to share your thoughts and suggestions. Since we’re rolling these out in pieces, it will be much easier for us to be able to iterate, test, and make changes.

We estimate that the updates will be mostly in place by early 2019, so there’s plenty of time to make sure we’re creating the best possible experience for everyone.

Thank you so much for your patience and understanding as we move forward. In the future, we’ll be as open and transparent as possible about this process.

What’s going on with the classify interface? Part One

Part one in a multi-part series exploring the visual and UX changes to the Zooniverse classify interface

First, an introduction.

Zooniverse began in 2007, with a galaxy-classifying project called Galaxy Zoo. The project was wildly successful, and one of the lead researchers, Chris Lintott, saw an opportunity to help other researchers accomplish similar goals. He assembled a team of developers and set to work building custom projects just like Galaxy Zoo for researchers around the world.

And things were good.

But the team started to wonder: How can we improve the process to empower researchers to build their own Zooniverse projects, rather than relying on the team’s limited resources to build their projects for them?

Thus, the project builder (zooniverse.org/lab) was born.

In the first year of its inception, the number of projects available to citizen scientist volunteers nearly doubled. Popularity spread, the team grew, and things seemed to be going well.

That’s where I come in. * Record scratch *

Three years after the project builder’s debut, I was hired as the Zooniverse designer. With eight years’ experience in a variety of design roles from newspaper page design to user experience for mobile apps to web design, I approached the new project builder-built projects with fresh eyes, taking a hard look at what was working and what areas could be improved.

Over the next week, I’ll be breaking down my findings and observations, and talking through the design changes we’re making, shedding more light on the aims and intentions behind these changes and how they will affect your experience on the Zooniverse platform.

If you take one thing away from this series it’s that this design update, in following with the ethos of Zooniverse, is an iterative, collaborative process. These posts represent where we are now, in June 2018, but the final product, after testing and hearing your input, may be different. We’re learning as we go, and your input is hugely beneficial as we move forward.

Here’s a link to an open survey in case you’d like to share thoughts, experiences, or opinions at any point.

Let’s dive in.

Part one: Research

My first few weeks on the job were spent exploring Zooniverse, learning about the amazing world of citizen science, and examining projects with similar task types from across the internet.

I did a large-scale analysis of the site in general, going through every page in each section and identifying areas with inconsistent visual styles or confusing user experiences.

Current site map, March 2017
Analysis of current template types

After my initial site analysis, I created a list of potential pages or sections that were good candidates for a redesign. The classify interface stood out as the best place to start, so I got to work.

Visual design research

First, I identified areas of the interface that could use visual updates. My main concerns were legibility, accessibility, and varying screen sizes. With an audience reaching to the tens of thousands per week, the demographic diversity makes for an interesting design challenge.

Next, I conducted a comprehensive audit of every project that existed on the Zooniverse in March 2017 (79 in total, including custom projects like Galaxy Zoo), counting question/task word count, the max number of answers, subject image dimensions, field guide content, and a host of other data points. That way, I could accurately design for the medians rather than choosing arbitrarily. When working on this scale, it’s important to use data like these to ensure that the largest possible group is well designed for.

Here are some selected data:

Task type: Drawing 20
Answers
Statistic | Number of possible answers | Max answer word count
Average | 2 | 4.5
Min | 1 | 2
Max | 7 | 10
Median | 1 | 1
Number with thumbnail images: 1

Task type: Question 9
Answers
Statistic | Number of possible answers | Max answer word count
Average | 6 | 6
Min | 2 | 1
Max | 9 | 18
Median | 3.5 | 4
Number with thumbnail images: 3

Task type: Survey 9
Answers
Statistic | Number of possible answers | Max answer word count
Average | 31 | 4
Min | 6 | 3
Max | 60 | 7
Median | 29 | 4
Number with thumbnail images: 9

Even More Research

Next, I focused on usability. To ensure that I understood issues from as many perspectives as possible, I sent a design survey to our beta testers mailing list, comprising about 100,000 volunteers (if you’re not already on the list, you can opt in via your Zooniverse email settings). Almost 1,200 people responded, and those responses informed the decisions I made and helped prioritize areas of improvement.

Here are the major findings from that survey:

  • No consensus on where to go when you’re not sure how to complete a task.
  • Many different destinations after finishing a task.
  • Too much scrolling and mouse movement.
  • Lack of keyboard shortcuts.
  • Would like the ability to view previous classifications.
  • Translations to more languages.
  • Need for feedback when doing classifications.
  • Finding new projects that might also be interesting.
  • Larger images.

In the next few blog posts, I’ll be breaking down specific features of the update and showing how these survey findings help inform the creation of many of the new features.

Without further ado

Basic classify template

Some of these updates will look familiar, as we’ve already started to implement style and layout adjustments. I’ll go into more detail in subsequent posts, but at a high level, these changes seek to improve your overall experience classifying on the site no matter where you are, what browser you’re using, or what type of project you’re working on.  

Visually, the site is cleaner and more professional, a reflection of Zooniverse’s standing in the citizen science community and of the real scientific research that’s being done. Studies have shown that good, thoughtful design influences a visitor’s perceptions of a website or product, sometimes obviously, sometimes at a subliminal level. By making thoughtful choices in the design of our site, we can seek to positively affect audience perceptions about Zooniverse, giving volunteers and researchers even more of a reason to feel proud of the projects they’re passionate about.

It’s important to note that this image is a reflection of our current thought, in June 2018, but as we continue to test and get feedback on the updates, the final design may change. One benefit to rolling updates out in pieces is the ability to quickly iterate ideas until the best solution is found.

The timeline

We estimate that the updates will be mostly in place by early 2019.

This is due in part to the size of our team. At most, there are about three people working on these updates while also maintaining our commitments to other grant-funded projects and additional internal projects. The simple truth is that we just don’t have the resources to be able to devote anyone full-time to this update.

The timeline is also influenced in a large part by the other half of this update: A complete overhaul of the infrastructure of the classifier. These changes aren’t as visible, but you’ll notice an improvement in speed and functionality that is just as important as the “facelift” portion of the update.

Stay tuned!

We’ve seen your feedback on Talk, via email, and on GitHub, and we’re happy to keep a dialog going about subsequent updates. To streamline everything and make sure your comments don’t get missed, please only use this survey link to post thoughts moving forward.

Measuring Success in Citizen Science Projects, Part 2: Results

In the previous post, I described the creation of the Zooniverse Project Success Matrix from Cox et al. (2015). In essence, we examined 17 (well, 18, but more on that below) Zooniverse projects, and for each of them combined 12 quantitative measures of performance into one plot of Public Engagement versus Contribution to Science:

Public engagement vs Contribution to science : the success matrix
Public Engagement vs Contribution to Science for 17 Zooniverse projects. The size (area) of each point is proportional to the total number of classifications received by the project. Each axis of this plot combines 6 different quantitative project measures.

The aim of this post is to answer the questions: What does it mean? And what doesn’t it mean?

Discussion of Results

The obvious implication of this plot and of the paper in general is that projects that do well in both public engagement and contribution to science should be considered “successful” citizen science projects. There’s still room to argue over which is more important, but I personally assert that you need both in order to justify having asked the public to help with your research. As a project team member (I’m on the Galaxy Zoo science team), I feel very strongly that I have a responsibility both to use the contributions of my project’s volunteers to advance scientific research and to participate in open, two-way communication with those volunteers. And as a volunteer (I’ve classified on all the projects in this study), those are the 2 key things that I personally appreciate.

It’s apparent just from looking at the success matrix that one can have some success at contributing to science even without doing much public engagement, but it’s also clear that every project that successfully engages the public also does very well at research outputs. So if you ignore your volunteers while you write up your classification-based results, you may still produce science, though that’s not guaranteed. On the other hand, engaging with your volunteers will probably result in more classifications and better/more science.

Surprises, A.K.A. Failing to Measure the Weather

Some of the projects on the matrix didn’t appear quite where we expected. I was particularly surprised by the placement of Old Weather. On this matrix it looks like it’s turning in an average or just-below-average performance, but that definitely seems wrong to me. And I’m not the only one: I think everyone on the Zooniverse team thinks of the project as a huge success. Old Weather has provided robust and highly useful data to climate modellers, in addition to uncovering unexpected data about important topics such as the outbreak and spread of disease. It has also provided publications for more “meta” topics, including the study of citizen science itself.

Additionally, Old Weather has a thriving community of dedicated volunteers who are highly invested in the project and highly skilled at their research tasks. Community members have made millions of annotations on log data spanning centuries, and the researchers keep in touch with both them and the wider public in multiple ways, including a well-written blog that gets plenty of viewers. I think it’s fair to say that Old Weather is an exceptional project that’s doing things right. So what gives?

There are multiple reasons the matrix in this study doesn’t accurately capture the success of Old Weather, and they’re worth delving into as examples of the limitations of this study. Many of them are related to the project being literally exceptional. Old Weather has crossed many disciplinary boundaries, and it’s very hard to put such a unique project into the same box as the others.

Firstly, because of the way we defined project publications, we didn’t really capture all of the outputs of Old Weather. The use of publications and citations to quantitatively measure success is a fairly controversial subject. Some people feel that refereed journal articles are the only useful measure (not all research fields use this system), while others argue that publications are an outdated and inaccurate way to measure success. For this study, we chose a fairly strict measure, trying to incorporate variations between fields of study but also requiring that publications should be refereed or in some other way “accepted”. This means that some projects with submitted (but not yet accepted) papers have lower “scores” than they otherwise might. It also ignores the direct value of the data to the team and to other researchers, which is pretty punishing for projects like Old Weather where the data itself is the main output. And much of the huge variety in other Old Weather outputs wasn’t captured by our metric. If it had been, the “Contribution to Science” score would have been higher.

Secondly, this matrix tends to favor projects that have a large and reasonably well-engaged user base. Projects with a higher number of volunteers have a higher score, and projects where the distribution of work is more evenly spread also have a higher score. This means that projects where a very large fraction of the work is done by a smaller group of loyal followers are at a bit of a disadvantage by these measurements. Choosing a sweet spot in the tradeoff between broad and deep engagement is a tricky task. Old Weather has focused on, and delivered, some of the deepest engagement of all our projects, which meant these measures didn’t do it justice.

To give a quantitative example: the distribution of work is measured by the Gini coefficient (on a scale of 0 to 1), and in our metric lower numbers, i.e. more even distributions, are better. The 3 highest Gini coefficients in the projects we examined were Old Weather (0.95), Planet Hunters (0.93), and Bat Detective (0.91); the average Gini coefficient across all projects was 0.82. It seems clear that a future version of the success matrix should incorporate a more complex use of this measure, as very successful projects can have high Gini coefficients (which is another way of saying that a loyal following is often a highly desirable component of a successful citizen science project).
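For readers unfamiliar with the Gini coefficient, here is a minimal sketch of one standard way to compute it from per-volunteer classification counts. The function name `gini` and the example counts are assumptions for illustration; the exact computation used in the paper may differ in detail.

```python
def gini(counts):
    """Gini coefficient of non-negative counts.

    0 means the work is spread perfectly evenly; values near 1 mean a few
    contributors account for almost all of it.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    # Standard rank-weighted formula on the ascending-sorted values:
    # G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Made-up classification counts for four volunteers.
even = gini([10, 10, 10, 10])   # everyone contributes equally
uneven = gini([1, 1, 1, 97])    # one volunteer does nearly all the work
```

With these made-up counts, the even case comes out at 0 and the uneven case at about 0.72, which shows how a small, very dedicated group pushes the coefficient towards 1.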

Thirdly, I mentioned in part 1 that these measures of the Old Weather classifications were from the version of the project that launched in 2012. That means that, unlike every other project studied, Old Weather’s measures don’t capture the surge of popularity it had in its initial stages. To understand why that might make a huge difference, it helps to compare it to the only eligible project that isn’t shown on the matrix above: The Andromeda Project.

In contrast to Old Weather, The Andromeda Project had a very short duration: it collected classifications for about 4 weeks total, divided over 2 project data releases. It was wildly popular, so much so that the project never had a chance to settle in for the long haul. A typical Zooniverse project has a burst of initial activity followed by a “long tail” of sustained classifications and public engagement at a much lower level than the initial phase.

The Andromeda Project is an exception to all the other projects because its measures are only from the initial surge. If we were to plot the success matrix including The Andromeda Project in the normalizations, the plot would look like this:

success matrix with the andromeda project making all the others look like public engagement failures
And this study was done before the project’s first paper was accepted, which it has now been. If we included that, The Andromeda Project’s position would be even further to the right as well.

Because we try to control for project duration, the very short duration of the Andromeda Project means it gets a big boost. Thus it’s a bit unfair to compare all the other projects to The Andromeda Project, because the data isn’t quite the same.

However, that’s also true of Old Weather — but instead of only capturing the initial surge, our measurements for Old Weather omit it. These measurements only capture the “slow and steady” part of the classification activity, where the most faithful members contribute enormously but where our metrics aren’t necessarily optimized. That unfairly makes Old Weather look like it’s not doing as well.

In fact, comparing these 2 projects has made us realize that projects probably move around significantly in this diagram as they evolve. Old Weather’s other successes aren’t fully captured by our metrics anyway, and we should keep those imperfections and caveats in mind when we apply this or any other success measure to citizen science projects in the future; but one of the other things I’d really like to see in the future is a study of how a successful project can expect to evolve across this matrix over its life span.

Why do astronomy projects do so well?

There are multiple explanations for why astronomy projects seem to preferentially occupy the upper-right quadrant of the matrix. First, the Zooniverse was founded by astronomers and still has a high percentage of astronomers or ex-astronomers on the payroll. For many team members, astronomy is in our wheelhouse, and it’s likely this has affected decisions at every level of the Zooniverse, from project selection to project design. That’s starting to change as we diversify into other fields and recruit much-needed expertise in, for example, ecology and the humanities. We’ve also launched the new project builder, which means we no longer filter the list of potential projects: anyone can build a project on the Zooniverse platform. So I think we can expect the types of projects appearing in the top-right of the matrix to broaden considerably in the next few years.

The second reason astronomy seems to do well is just time. Galaxy Zoo 1 is the first and oldest project (in fact, it pre-dates the Zooniverse itself), and all the other Galaxy Zoo versions were more like continuations, so they hit the ground running because the science team didn’t have a steep learning curve. In part because the early Zooniverse was astronomer-dominated, many of the earliest Zooniverse projects were astronomy related, and they’ve just had more time to do more with their big datasets. More publications, more citations, more blog posts, and so on. We try to control for project age and duration in our analysis, but it’s possible there are some residual advantages to having extra years to work with a project’s results.

Moreover, those early astronomy projects might have gotten an additional boost from each other: they were more likely to be popular with the established Zooniverse community, compared to similarly early non-astronomy projects which may not have had such a clear overlap with the established Zoo volunteers’ interests.

Summary

The citizen science project success matrix presented in Cox et al. (2015) is the first time such a diverse array of project measures have been combined into a single matrix for assessing the performance of citizen science projects. We learned during this study that public engagement is well worth the effort for research teams, as projects that do well at public engagement also make better contributions to science.

It’s also true that this matrix, like any system that tries to distill such a complex issue into a single measure, is imperfect. There are several ways we can improve the matrix in the future, but for now, used mindfully (and noting clear exceptions), this is generally a useful way to assess the health of a citizen science project like those we have in the Zooniverse.

Note: Part 1 of this article is here.

Measuring Success in Citizen Science Projects, Part 1: Methods

What makes one citizen science project flourish while another flounders? Is there a foolproof recipe for success when creating a citizen science project? As part of building and helping others build projects that ask the public to contribute to diverse research goals, we think and talk a lot about success and failure at the Zooniverse.

But while our individual definitions of success overlap quite a bit, we don’t all agree on which factors are the most important. Our opinions are informed by years of experience, yet before this year we hadn’t tried incorporating our data into a comprehensive set of measures — or “metrics”. So when our collaborators in the VOLCROWE project proposed that we try to quantify success in the Zooniverse using a wide variety of measures, we jumped at the chance. We knew it would be a challenge, and we also knew we probably wouldn’t be able to find a single set of metrics suitable for all projects, but we figured we should at least try to write down one possible approach and note its strengths and weaknesses so that others might be able to build on our ideas.

The results are in Cox et al. (2015):

Defining and Measuring Success in Online Citizen Science: A Case Study of Zooniverse Projects

In this study, we only considered projects that were at least 18 months old, so that all the projects considered had a minimum amount of time to analyze their data and publish their work. For a few of our earliest projects, we weren’t able to source the raw classification data and/or get the public-engagement data we needed, so those projects were excluded from the analysis. We ended up with a case study of 17 projects in all (plus the Andromeda Project, about which more in part 2).

The full paper is available here (or here if you don’t have academic institutional access), and the purpose of these blog posts is to summarize the method and discuss the implications and limitations of the results.

Who Are The Zooniverse Community? We Asked Them…

We are often asked who our community are by project scientists, sociologists, and by the community itself. A recent Oxford study tried to find out, and working with them we conducted a survey of volunteers. The results were interesting and when combined with various statistics that we have at Zooniverse (web logs, analytics, etc) we can start to see a pretty good picture of who volunteers at the Zooniverse.

Much of what follows comes from a survey that was conducted last summer as part of Master’s student Victoria Homsy’s thesis, though the results are broadly consistent with other surveys we have performed.  We asked a small subset of the Zooniverse community to answer an online questionnaire. We contacted about 3000 people regarding the survey and around 300 responded. They were not a random sample of users; rather, they were people who had logged in to the Zooniverse at least once in the three months before we emailed them.

The remaining aspects of this post involve data gathered by our own system (classification counts, log-in rates, etc) and data from our use of Google Analytics.

So with that preamble done: let’s see who you are…

https://vimeo.com/99664654

This visualisation is of Talk data from last Summer. It doesn’t cover every project (e.g. Planet Hunters is missing) but it gives you a good flavour for how our community is structured. Each node (circle) is one volunteer, sized proportionally according to how many posts they have made overall. You can see one power-mod who has commented more than 16,000 times on Talk near the centre. Volunteers are connected to others by talking in the same threads (a proxy for having conversations). They have been automatically coloured by network analysis, to reflect sub-networks within the Zooniverse as a whole. The result is that we see the different projects’ Talk sites.

talk-central

There are users that rise largely out of those sub-communities and talk across many sites, but mostly people stick to one group. You can also see how relatively few power users help glue the whole together, and how there are individuals talking to large numbers of others, who in turn may not participate much otherwise – these are likely examples of experienced users answering questions from others.
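To make the construction of this network concrete, here is a minimal sketch of how such a graph could be assembled from post records. It is not the code behind the visualisation: the `posts` list, the usernames, and the `build_talk_graph` helper are all hypothetical, and the automatic colouring step (community detection) is left out. Nodes are sized by post count, and two volunteers are linked when they have posted in the same thread.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical Talk posts as (thread_id, username) pairs; not real Zooniverse data.
posts = [
    ("t1", "alice"), ("t1", "bob"), ("t1", "alice"),
    ("t2", "bob"), ("t2", "carol"),
    ("t3", "dave"),
]


def build_talk_graph(posts):
    """Return (post_counts, edges) for a co-thread network.

    post_counts maps each user to their number of posts (the node size).
    edges maps a sorted (user_a, user_b) pair to the number of threads
    both users posted in (the edge weight).
    """
    post_counts = Counter(user for _, user in posts)

    # Collect the set of distinct users who posted in each thread.
    users_by_thread = defaultdict(set)
    for thread_id, user in posts:
        users_by_thread[thread_id].add(user)

    # Link every pair of users who share a thread.
    edges = Counter()
    for users in users_by_thread.values():
        for user_a, user_b in combinations(sorted(users), 2):
            edges[(user_a, user_b)] += 1

    return post_counts, edges


post_counts, edges = build_talk_graph(posts)
print(post_counts.most_common())
print(dict(edges))
```

In this toy data, "bob" plays the role of a user who bridges two conversations, while "dave" posts alone and ends up as an unconnected node.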

gender

One thing we can’t tell from our own metrics is a person’s gender, but we did ask in the survey. The Zooniverse community seems to be in a 60/40 split, which in some ways is not as bad as I would have thought. However, we can do better, and this provides a metric to measure ourselves against in the future.

ages

It is also interesting to note that there is very little skew in the ages of our volunteers. There is a slight tilt away from older people, but overall the community appears to be made up of people of all ages. This reflects the experience of chatting to people on Talk.

geo-pie

We know that the Zooniverse is English-language dominated, and specifically UK/US dominated. This is always where we have found the best press coverage, and where we have the most links ourselves. The breakdown between US/UK/the rest is basically a three-way split. This split is seen not just in this survey but also generally in our analytics overall.

geo-pie-dev

Only 2% of the users responding to our survey came from the developing world. As you can see in a recent blog post, we do get visitors from all over the world. It may be that the survey has the effect of filtering out these people (it was conducted via an online form), or maybe that there is a language barrier.

employment

employment_cloud

We also asked people about their employment status. We find that about half of our community is employed (either full- or part-time). Looking at the age distribution, we might expect up to a fifth or sixth of people to be retired (15% is fairly close). This leaves us with about 10% unemployed, nearly twice the UK or US unemployment rate, and about 4% unable to work due to disability (about the UK average, by comparison). This is interesting, especially in relation to the next question, on motivation for participating.

We also asked them to tell us what they do and the result is the above word cloud (thanks, Wordle!) which shows a wonderful array of occupations including professor, admin, guard, and dogsbody. You should note a high instance of technical jobs on this list, possibly indicating that people need to have, or be near, a computer to work on Zooniverse projects in their daily life.

motivation

When asked why they take part in Zooniverse projects we find that the most common response (91%) is a desire to contribute to progress. How very noble. Closely following that (84%) are the many people who are interested in the subject matter. It falls off rapidly then to ‘entertainment’, ‘distraction’ and ‘other’. We are forever telling people that the community is motivated mainly by science and contribution, and for whatever reason they usually don’t believe us. It’s nice to see this result reproducing an important part of the Raddick et al. 2009 study, which first demonstrated it.

when-to-classfy-routine

It is roughly what I would have expected to see that people tend to classify mostly in their spare time, and that most don’t have dedicated ‘Zooniverse’ time every day. It’s more interesting to see why, if they tend to stop and start, i.e. if they answered in the purple category above. Here is a word cloud showing the reason people stop participating in Zooniverse. TL;DR they have the rest of their life to get on with.

when-to-classfy-routine-cloud

We’ll obviously have to fix this by making Zooniverse their whole life!

This is my final blog post as a part of the Zooniverse team. It has been my pleasure to work at the Zooniverse for the last five years. Much of that time has been spent trying to motivate and engage the amazing community of volunteers who come to click, chat, and work on all our projects. You’re an incredible bunch, motivated by science and a desire to be part of something important and worthwhile online. I think you’re awesome. In the last five years I have seen the Zooniverse grow into a community of more than one million online volunteers, willing to tackle big questions, and trying to understand the world around us.

Thank you for your enthusiasm and your time. I’ll see you online…

Introducing VOLCROWE – Volunteer and Crowdsourcing Economics

volcrowe

Hi everyone, I’d like to let you know about a cool new project we are involved with. VOLCROWE is a three-year research project funded by the Engineering and Physical Sciences Research Council in the UK, bringing together a team of researchers (some of whom are already involved with the Zooniverse, like Karen Masters) from the Universities of Portsmouth, Oxford, Manchester and Leeds. The PI of the project, Joe Cox, says “Broadly speaking, the team wants to understand more about the economics of the Zooniverse, including how and why it works in the way that it does. Our goal is to demonstrate to the community of economics and management scholars the increasingly amazing things that groups of people can achieve when they work together with a little help from technology. We believe that Zooniverse projects represent a specialised form of volunteering, although the existing literature on the economics of altruism hasn’t yet taken into account these new ways in which people can give their time and energy towards not-for-profit endeavours. Working together with Zooniverse volunteers, we intend to demonstrate how the digital economy is making it possible for people from all over the world to come together in vast numbers and make a contribution towards tackling major scientific problems such as understanding the nature of the Universe, climate change and even cancer.

These new forms of volunteering exemplified by the Zooniverse fundamentally alter the voluntary process as it is currently understood. The most obvious change relates to the ways in which people are able to give their time more flexibly and conveniently; such as contributing during their daily commute using a smart phone! It also opens new possibilities for the social and community aspects of volunteering in terms of creating a digitally integrated worldwide network of contributors. It may also be the case that commonly held motivations and associations with volunteering don’t hold or work differently in this context. For example, religious affiliations and memberships may or may not be as prevalent as they are with more traditional or recognised forms of volunteering. With the help of Zooniverse volunteers, the VOLCROWE team are exploring all of these issues (and more) with the view to establishing new economic models of digital volunteering.

To achieve this aim, we are going to be interacting with the Zooniverse community in a number of ways. First, we’ll be conducting a large scale survey to find out more about its contributors (don’t worry – you do not have to take part in the survey or give any personal information if you do not want to!). The survey data will be used to test the extent to which assumptions made by existing models of volunteering apply and, if necessary, to formulate new ones. We’ll also be taking a detailed look at usage statistics from a variety of projects and will test for trends in the patterns of contributions across the million (and counting) registered Zooniverse volunteers. This larger-scale analysis will be supplemented with a number of smaller sessions with groups of volunteers to help develop a more nuanced understanding of people’s relationships with and within the Zooniverse. Finally, we’ll be using our expertise from the economic and management sciences to study the organisation of the Zooniverse team themselves and analyse the ways and channels they use to communicate and to make decisions. In short, with the help of its volunteers, we want to find out what makes the Zooniverse tick!

In the survey analysis, no information will be collected that could be used to identify you personally. The only thing we will ask for is a Zooniverse ID so that we can match up your responses to your actual participation data; this will help us address some of the project’s most important research questions. The smaller group and one-to-one sessions will be less anonymous by their very nature, but participation will be on an entirely voluntary basis and we will only ever use the information we gather in a way in which you’re comfortable. The team would really appreciate your support and cooperation in helping us to better understand the processes and relationships that drive the Zooniverse. If we can achieve our goals, we may even be able to help to make it even better!”

Keep an eye out for VOLCROWE over the coming weeks and months; they’d love you to visit their website and follow them on Twitter.

Grant and the Zooniverse Team

ZooTools: Going Deeper With Zooniverse Project Data

One of the best things about being an educator on the Zooniverse development team is the opportunity to interact with teachers who are using Zooniverse projects in their classroom and teachers who are interested in using Zooniverse projects in the classroom. Teachers cite several reasons why they use these projects – Authentic data?  Check. Contributing to cutting-edge research across a variety of scientific fields?  Check.  Free?  Check. Classifying a few galaxies in Galaxy Zoo or identifying and measuring some plankton in Plankton Portal can be an exciting introduction to participating in scientific investigations with “the professionals.”  This isn’t enough though; teachers and other educators are hungry for ways to facilitate deeper student engagement with scientific data. Zooniverse educators and developers are consistently asked “How can my students dig deeper into the data on Zooniverse?”

This is where ZooTools comes into play. The Zooniverse development team has recently created ZooTools as a place where volunteers can observe, collect, and analyze data from Zooniverse citizen science projects. These tools were initially conceived as a toolkit for adult volunteers to use to make discoveries within Zooniverse data but it is becoming apparent that these would also have useful applications in formal education settings. It’s worth pointing out that these tools are currently in beta. In the world of web development beta basically means “it ain’t perfect yet.”  ZooTools is not polished and perfect; in fact it’s possible you may encounter some bugs.

Projects like Galaxy Zoo and Planet Hunters have an impressive history of “extra credit” discoveries made by volunteers.  Galaxy Zoo volunteers have made major contributions to the astronomy literature through the discovery of the green peas galaxies and Hanny’s Voorwerp.  In Planet Hunters volunteers use Talk to share methods of exploring and results from the project’s light curves.  ZooTools lowers the barrier of entry by equipping volunteers with some simple tools to look for interesting relationships and results contained within the data.  No specialist knowledge required.

We’ve only begun thinking about how ZooTools could be used in the classroom.  I started my own investigation with a question that came from a Zooniverse classroom visit last spring.  While making observations as a class about some of the amazing animals in Snapshot Serengeti, one young man asked about civets. He wanted to know if they were nocturnal. We had an interesting discussion about how you could find out this information.  The general consensus was to Google it or look it up on Wikipedia.  I wondered if you could use the data contained within Snapshot Serengeti to come up with a reasonable answer.  I was excited to roll up my sleeves and figure out how to use these tools to find a likely answer.  Here are the steps I took…

Step 1: Log-in to Zooniverse and go to ZooTools.

Step 1

Step 2: Select a project. Currently only a few projects have data available to explore using ZooTools.

Step 2

Step 3: Create a dashboard.

Step 3

Step 4: Name your dashboard something awesome. I called mine Civets! for obvious reasons.

Step 4

Step 5: This is your blank dashboard.

Step 5

Step 6: It’s time to select a data source. I selected Snapshot Serengeti.

Step 6

Step 7: This is the data source.

Step 7

Step 8: I wanted to be able to filter my data so I selected Filter under search type. The name of this dataset is Snapshot Serengeti 1.

Step 8

Step 9: Since I wanted to look at civets, I selected that on the species dropdown menu and then clicked Load Data. My dataset will only contain images that Snapshot Serengeti volunteers identified as civets.

Step 9

Step 10: I had my data; next it was time to select a Tool.  I selected Tools at the top of the page.

Step 10

Step 11: I selected Subject Viewer because this tool allows me to flip through different images.

Step 11

Step 12: Next I had to connect my data source to my tool. From the Data Source drop down menu I selected Snapshot Serengeti 1.

Step 12

Step 13: In order to get a good look at the images in my dataset I clicked the icon shaped like a fork to close the pane.  I then used the arrows to advance through the images.

Step 13

I flipped through the images and kept track of the night versus day. Of the 37 images in my dataset, I observed that 34 were taken at night and 3 were taken during the day.  This led me to the conclusion that civets are likely nocturnal.  This was so much more satisfying than just going to Google or Wikipedia. A couple of other questions that I explored…
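For anyone who wants to take the tally one step further with students, here is a small sketch of the arithmetic. The counts are the ones reported above; the `binomial_tail` helper and the 50/50 comparison are my own illustrative assumptions, not part of ZooTools. In particular, it assumes that an animal with no day or night preference would be photographed equally often in each, which real camera traps may not satisfy.

```python
from math import comb

# Counts from flipping through the civet images in the Subject Viewer.
night, day = 34, 3
total = night + day

night_fraction = night / total


def binomial_tail(successes, trials, p=0.5):
    """Probability of at least `successes` hits in `trials` independent tries."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: with no preference, each image is equally likely to be day or night.
chance_if_no_preference = binomial_tail(night, total)

print(f"{night_fraction:.1%} of the {total} images were taken at night")
print(f"Chance of {night} or more night images with no preference: "
      f"{chance_if_no_preference:.2e}")
```

A very small probability here supports the "likely nocturnal" conclusion, though 37 images from one dataset is still a modest sample.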

What is the distribution of animals identified at one camera trap site?

14

 

How many honeybadgers have been observed by Snapshot Serengeti volunteers across different camera traps?

Screen Shot 2013-11-26 at 3.17.28 PM

Of course this is just the tip of the iceberg.  Currently you can use ZooTools to explore data from Galaxy Zoo, Space Warps, and Snapshot Serengeti.  The specific tools and datasets available vary from project to project.  In Galaxy Zoo for example you can look at data from Galaxy Zoo classifications or from SDSS Skyserver. Hopefully you’ll be inspired to have a play with these tools!  What questions would you or your students like to explore?

The Elise Andrew Effect – What a post on IFLS does to your numbers

AP-IFLS

Recently the Andromeda Project was featured in one of the posts on the ‘I Fucking Love Science’ Facebook page. The page, which was started by Elise Andrew in March 2012, currently has 8 million likes, so some form of noticeable impact was to be expected! Here are some of the interesting numbers the post is responsible for:

I’ll start with the Facebook post itself. As of writing (16 hours after original posting), it has been shared 1,842 times, liked by 6,494 people and has 218 comments. These numbers are actually relatively low for an IFLS post, some of which can reach over 70,000 shares!

AP-IFLS-2
The ‘IFLS spike’ in the Andromeda Project classifications and active users

Let’s now have a look at what it did for the Andromeda Project. The project, which was launched two days previously and was already pretty popular, had settled down to around 100 active users per hour. This number shot up to almost 600 immediately following the post. In the space of 5 minutes the number of visitors on the site went from 13 to 1,300! After a few hours it settled down again, but now the steady rate looks to be about 25% higher than before. The number of classifications per hour follows the same pattern. The amazing figure here is that almost 100,000 classifications were made in the 4 hours following the post. This number corresponds to around 1/6th of the total needed to complete the project!
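As a quick back-of-the-envelope check on those figures, the sketch below works through the ratios. All inputs are the rounded numbers quoted above ("around", "almost"), so the outputs are rough estimates only, and the variable names are just for illustration.

```python
# Approximate figures quoted in the post.
visitors_before, visitors_after = 13, 1_300    # site visitors, 5 minutes apart
active_before, active_peak = 100, 600          # active users per hour
spike_classifications = 100_000                # made in the 4 hours after the post
spike_share_of_total = 6                       # spike was about 1/6th of the total

visitor_factor = visitors_after / visitors_before
active_factor = active_peak / active_before
new_steady_rate = active_before * 1.25         # "about 25% higher than before"
implied_total = spike_classifications * spike_share_of_total
spike_rate_per_hour = spike_classifications / 4

print(f"Visitors jumped by a factor of about {visitor_factor:.0f}")
print(f"Active users per hour rose by a factor of about {active_factor:.0f}")
print(f"New steady rate: roughly {new_steady_rate:.0f} active users per hour")
print(f"Implied total classifications needed: about {implied_total:,}")
print(f"Average rate during the spike: about {spike_rate_per_hour:,.0f} per hour")
```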

PH-IFLS-spike
The number of visitors per day to the Planet Hunters site over the last two weeks. Visits increased by a factor of ten on the day of the IFLS post, and three days later the numbers are still greater than before.

Two days after her post about the Andromeda Project, Elise put up a post about the discovery of a seventh planet around the dwarf star KIC 11442793, which was found by citizen scientists on the Planet Hunters project. This post proved even more popular than the previous one, with more than 3,000 shares, and led to a spike of similar magnitude in the number of visitors to the site (as can be seen in the plot above).

Finally, what did it do for the Zooniverse as a whole? Well there have been over 4,000 new Zooniverse accounts registered within the last four days and the Facebook page, which was linked in the AP article, got a healthy boost of around 1,000 new likes. So all things considered, it seems that an IFLS post can be very useful for promoting your project indeed!

Thanks Elise, the Andromeda Project, Planet Hunters and Zooniverse teams love you!

Welcome to the Worm Watch Lab

Today we launch a new Zooniverse project in association with the Medical Research Council (MRC) and the Medical Research Foundation: Worm Watch Lab.

We need the public’s help in observing the behaviour of tiny nematode worms. When you classify on wormwatchlab.org you’re shown a video of a worm wriggling around. The aim of the game is to watch and wait for the worm to lay eggs, and to hit the ‘z’ key when they do. It’s very simple and strangely addictive. By watching these worms lay eggs, you’re helping to collect valuable data about genetics that will assist medical research.

Worm Watch Lab

The MRC have built tracking microscopes to record these videos of crawling worms. A USB microscope is mounted on a motorised stage connected to a computer. When the worm moves, the computer analyses the changing image and commands the stage to move to re-centre the worm in the field of view. Because the trackers work without supervision, they can run eight of them in parallel to collect a lot of video! It’s these movies that we need the public to help classify.

By watching movies of the nematode worms, we can understand how the brain works and how genes affect behaviour. The idea is that if a gene is involved in a visible behaviour, then mutations that break that gene might lead to detectable behavioural changes. The type of change gives us a hint about what the affected gene might be doing. Although it is small and has far fewer cells than we do, the worm used in these studies (called C. elegans) has almost as many genes as we do! We share a common ancestor with these worms, so many of their genes are closely related to human genes. This presents us with the opportunity to study the function of genes that are important for human brain function in an animal that is easier to handle, great for microscopy and genetics, and has a generation time of only a few days. It’s all quite amazing!

To get started visit www.wormwatchlab.org and follow the tutorial. You can also find Worm Watch Lab on Facebook and on Twitter.