I’m happy to announce that the Panoptes Client package for Python has finally reached version 1.0, after nearly a year and a half of development. With this package, you can automate the management of your projects, including uploading subjects, managing subject sets, and downloading data exports.
There’s still more work to do – I have lots of additional features and improvements planned for version 1.1 – but with the release of version 1.0, the Client has a stable set of core features which are useful for managing projects (both large and small).
I know a lot of people have already been using the 0.x versions while we’ve been working on them, so thanks to everyone who submitted feature requests, bug reports, and pull requests on GitHub. Please do upgrade to the latest version to make sure you have the latest bug fixes, and keep the requests and bug reports coming!
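As a rough illustration of the kind of automation described above, here is a minimal sketch that creates a subject set and uploads one subject per image. The calls follow the pattern shown in the client's documentation, but the function names `upload_images` and `build_metadata`, the `filename` metadata field, and all arguments are hypothetical, so treat it as a starting point rather than a recommended workflow.

```python
import os


def build_metadata(image_path):
    """Return simple per-subject metadata (the field name is hypothetical)."""
    return {"filename": os.path.basename(image_path)}


def upload_images(username, password, project_slug, set_name, image_paths):
    """Create a subject set in a project and upload one subject per image."""
    # Imported here so build_metadata can be used without the package installed.
    from panoptes_client import Panoptes, Project, Subject, SubjectSet

    Panoptes.connect(username=username, password=password)
    project = Project.find(slug=project_slug)

    # Create the subject set and attach it to the project.
    subject_set = SubjectSet()
    subject_set.links.project = project
    subject_set.display_name = set_name
    subject_set.save()

    # Create and save each subject before adding them to the set in one call.
    subjects = []
    for path in image_paths:
        subject = Subject()
        subject.links.project = project
        subject.add_location(path)
        subject.metadata.update(build_metadata(path))
        subject.save()
        subjects.append(subject)

    subject_set.add(subjects)
    return subject_set
```

Calling `upload_images` needs the `panoptes-client` package, a network connection, and an account that is allowed to edit the project.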
We recently had a very successful (and longer than usual) Stargazing Live. I wanted to talk a little about the work that our team did in the weeks leading up to this and also recap what actually happened behind the scenes during the two weeks of events.
If you’re not familiar with it, Stargazing Live is an annual astronomy TV show on BBC Two in the UK, which is broadcast live on three consecutive nights. Each year we launch a project in collaboration with the show, and this always proves to be the busiest time of our year. This year, for the first time, there was a second week of shows for ABC Australia, so this time we launched two projects instead of one: Planet 9 and Exoplanet Explorers.
A lot of work went into making sure that our site stayed up for this year’s shows. In previous years we’ve had issues that have resulted in either a brief outage or reduced performance for at least some of the time during the show. This year everything worked perfectly and we actually found ourselves reducing our capacity (scaling down) much sooner than we anticipated. The prep work fell into three areas:
Reducing the load on our databases. We reduced the number of requests that result in database queries through caching in the backend (with memcache), and we started using a new microservice (called Designator) to keep track of what each user has seen and serve them new subjects. We also separated some services onto a read replica rather than having them query the primary database.
Adding feature flags so that we could turn off anything non-essential, and so that we could shut down any features that were causing problems, using the Flipper Ruby gem.
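To make the two ideas above a little more concrete, here is a minimal Python sketch of a read-through cache sitting in front of a database query, combined with a feature flag that can switch a non-essential feature off entirely. This is not our backend code and it is not the Flipper API: `TTLCache`, `FeatureFlags`, `project_stats` and the `"project_stats"` flag name are all hypothetical, and the in-process dictionary only stands in for a real memcache server.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a memcache-style read-through cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry time, value)

    def fetch(self, key, loader):
        """Return the cached value for key, calling loader() only on a miss."""
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._store[key] = (now + self.ttl_seconds, value)
        return value


class FeatureFlags:
    """In-memory feature switches; unknown flags count as disabled."""

    def __init__(self):
        self._enabled = set()

    def enable(self, name):
        self._enabled.add(name)

    def disable(self, name):
        self._enabled.discard(name)

    def enabled(self, name):
        return name in self._enabled


def project_stats(project_id, flags, cache, query_database):
    """Serve a non-essential stats panel only while its flag is on."""
    if not flags.enabled("project_stats"):
        return None  # feature switched off: no cache lookup, no query
    return cache.fetch(("stats", project_id), lambda: query_database(project_id))
```

With the flag on, repeated calls for the same project reach the database once per TTL window. With the flag off, they never reach it at all, which is the property you want when a feature starts causing trouble mid-broadcast.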
On the first night of the BBC show it was all hands on deck. Our teams in the US and the UK were in our offices, despite it being evening in the UK, and in Oxford we gathered around the TV expectantly awaiting the moment when Chris would announce the project’s URL on air. That moment is usually a bit frantic, as several thousand people all turn up on the site at once and start clicking around, registering, logging in, and submitting classifications. We’re always closely watching our monitoring systems, keeping an eye on various performance metrics, watching for any early signs of problems that might affect the performance of the site. This year when that moment came the number of visitors on site shot up to over 5,000, and then… everything just kept running smoothly.
The first night of the BBC show we peaked at about 0.9 million requests per hour, with 1.1 million per hour the second night.
We scaled our API service to 50 of EC2’s m3.medium instances the first night and the average CPU utilisation of these instances reached about 30% at peak traffic. The next two nights we reduced the number of instances to 40. In hindsight we could have gone even lower, but from past experience the amount of traffic we receive on the second and third nights can be difficult to predict, so we decided to play it safe.
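For a sense of scale, the figures above can be turned into a back-of-the-envelope estimate. The sketch below assumes load spreads evenly across instances and that CPU use grows linearly with traffic, which real services only approximate. The 60% target utilisation is a made-up number for illustration, not a figure we actually used.

```python
import math


def requests_per_second(requests_per_hour):
    """Convert an hourly request count into an average per-second rate."""
    return requests_per_hour / 3600


def instances_for_target(current_instances, current_cpu_percent, target_cpu_percent):
    """Estimate the instances needed to carry the same load at a target
    average CPU utilisation, assuming load scales linearly with CPU."""
    total_load = current_instances * current_cpu_percent
    return math.ceil(total_load / target_cpu_percent)


first_night_rps = requests_per_second(900_000)      # first-night peak
second_night_rps = requests_per_second(1_100_000)   # second-night peak

# 50 instances at about 30% CPU, re-sized for a hypothetical 60% target.
estimate = instances_for_target(50, 30, 60)
```

Under these assumptions the first-night peak works out to 250 requests per second, and the same load at the hypothetical 60% target would need 25 instances. That is consistent with the remark above that we could have gone lower than 40.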
Traffic during the ABC show was lower than during the BBC show (Australia has a smaller population than the UK, so this was as expected). That week we scaled the API to 40 instances the first night, and 20 instances for the second and third nights.
In the past we’ve had problems with running out of available connections in PostgreSQL. The connection limit depends on available memory, and we find this to be more of a problem than CPU or network constraints. During the shows we scaled the PostgreSQL instance for our main API to RDS’s m4.10xlarge and our Talk/microservices database to m4.2xlarge, primarily to give us enough leeway to avoid the connection limit. In the future we’d like to implement connection pooling to avoid this.
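The idea behind connection pooling is to open a small, fixed number of database connections and share them between requests, so the number of open connections no longer grows with traffic. Here is a minimal sketch of that idea, not something we run in production: `ConnectionPool` and its `connect` argument are hypothetical, and any callable that returns a connection-like object will do.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: exactly `size` connections are opened, up front."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection, blocking until one is free, then return it."""
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty on timeout
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

A caller writes `with pool.connection() as conn:` and runs its queries inside the block. When every connection is busy, further callers wait instead of opening new connections, so the database sees at most `size` connections from each pool.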
This was all a big improvement on previous years. Whereas before we found ourselves extremely busy fighting fires and fixing bugs between shows, this time we had time to just relax and watch the show. We have more work to do on optimisations, because we did still have to scale up our capacity more than we’d like, but overall we’re very happy with how well things went this year.
We’re testing out a new feature of our interface, which means if you’re classifying images on Comet Hunters you may see occasional pop-up messages like the one pictured above.
The messages are designed to give you more information about the project. If you do not want to see them, you can opt out of seeing any future messages. Just click the link at the bottom of the pop-up.
You can have a look at this new feature by contributing some classifications today at www.comethunters.org.
We’re cleaning up our email list to make sure that we do not email anyone who does not want to hear from us. You will have got an email last week asking you if you want to stay subscribed. If you did not click the link in that email, then you will have received one today saying you have been unsubscribed from our main mailing list. Don’t worry! If you still want to receive notifications from us regarding things like new projects, please go to www.zooniverse.org/settings/email and make sure you’re subscribed to general Zooniverse email updates.
NOTE: This has not affected emails you get from individual Zooniverse projects.
We’re sorry to let you know that at 16:29 BST on Wednesday last week we made a change to the Panoptes code which had the unexpected result that it failed to record classifications on six of our newest projects: Season Spotter, Wildebeest Watch, Planet Four: Terrains, Whales as Individuals, Galaxy Zoo: Bar Lengths, and Fossil Finder. The change was checked by two members of the team – unfortunately, neither of them caught the fact that it failed to post classifications back. When we did eventually catch it, we fixed it within 10 minutes. Things were back to normal by 20:13 BST on Thursday, though by that time each project had lost a day’s worth of classifications.
To prevent something like this happening in the future we are implementing new code that will monitor the incoming classifications from all projects and send us an alert if any of them go unusually quiet. We will also be putting in even more code checks that will catch any issues like this right away.
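One simple way to detect a project going unusually quiet is to compare its recent classification count with its own typical count and raise an alert when the ratio drops too far. The sketch below shows that idea only. It is not the monitoring code we are deploying, and the function name, the 10% threshold, the minimum baseline and the project names are all hypothetical.

```python
def quiet_projects(recent_counts, baseline_counts, threshold=0.1, min_baseline=50):
    """Return the projects whose recent classification count fell below
    threshold * baseline. Projects with a small baseline are skipped,
    since a quiet hour there is not unusual."""
    alerts = []
    for project, baseline in sorted(baseline_counts.items()):
        if baseline < min_baseline:
            continue
        # A project missing from recent_counts recorded nothing at all.
        recent = recent_counts.get(project, 0)
        if recent < threshold * baseline:
            alerts.append(project)
    return alerts


# Invented numbers: typical hourly counts versus the last hour.
baseline = {"project-a": 400, "project-b": 300, "project-c": 5}
last_hour = {"project-a": 0, "project-b": 280}
alerts = quiet_projects(last_hour, baseline)
```

Here only `project-a` is flagged: `project-b` is close to its usual rate, and `project-c` is too small for silence to mean anything. Comparing each project against its own baseline matters because a busy project and a niche one have very different ideas of "quiet".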
It is so important to all of us at the Zooniverse that we never waste the time of any of our volunteers, and that all of your clicks contribute towards the research goals of the project. If you were one of the people whose contributions were lost we would like to say how very sorry we are, and hope that you can forgive us for making this terrible mistake. We promise to do everything we can to make sure that nothing like this happens again, and we thank you for your continued support of the Zooniverse.
In late July I led a week-long course about crowdsourcing and data visualization at the Digital Humanities Oxford Summer School. I taught the crowdsourcing part, while my friend and collaborator, Sarah, from Google, led the data visualization part. We had six participants from fields as diverse as history, archeology, botany, literature, and museum and library curation. Everyone brought a small batch of images, and used the new Zooniverse Project Builder (“Panoptes”) to create their own projects. We asked participants: what were their most pressing research questions? If the dataset were larger, why would crowdsourcing be an appropriate methodology, instead of doing the tasks themselves? What would interest the crowd most? What string of questions or tasks might render the best data to work with later in the week?
Within two days everyone had a project up and running. We experienced some teething problems along the way (Panoptes is still in active development) but we got there in the end! Everyone’s project looked swish, if you ask me.
Participants had to ‘sell’ their projects in person and on social media to attract a crowd. The rates of participation were pretty impressive for a 24-hour sprint. Several hundred classifications were contributed, which gave each project owner enough data to work with.
But of course, a good looking website and good participation rates do not equate to easy-to-use or even good data! Several of us found that overly complex marking tasks rendered very convoluted data and clearly lost people’s attention. After working at the Zooniverse for over a year I knew this by rote, but I’d never really had the experience of setting up a workflow and seeing what came out in such a tangible way.
Despite the variable data, everyone was able to do something interesting with their results. The archeologist working on pottery shards investigated whether there was a correlation between clay color and decoration. Clay is regional, but are decorative fashions regional or do they travel? He found, to his surprise, that they were widespread.
In the end, everyone agreed that they would create simpler projects next time around. Our urge to catalogue and describe everything about an object—a natural result of our training in the humanities and GLAM sectors—has to be reined in when designing a crowdsourcing project. On the other hand, our ability to tell stories, and this particular group’s willingness to get to grips with quantitative results, points to a future where humanities specialists use crowdsourcing and quantitative methods to open up their research in new and exciting ways.
Anyone heading over to the Zooniverse today will spot a few changes (there may also be some associated down-time, but in this event we will get the site up again as soon as possible). There’s a new layout for the homepage, a few new projects have appeared and there’s a new area and a new structure to Talk to enable you to discuss the Zooniverse and citizen science in general, something we hope will bring together conversations that until now have been stuck within individual projects.
What you won’t see immediately is that the site is running on a new version of the Zooniverse software, codenamed ‘Panoptes’. Panoptes has been designed so that it’s easier for us to update and maintain, and to allow more powerful tools for project builders. It’s also open source from the start, and if you find bugs or have suggestions about the new site you can note them on GitHub (or, if you’re so inclined, contribute to the codebase yourself). We certainly know we have a lot more to do; today is a milestone, but not the end of our development. We’re looking forward to continuing to work on the platform as we see how people are using it.
Panoptes allows the Zooniverse to be open in another way too. At its heart is a project building tool. Anyone can log in and start to build their own Zooniverse-style project; it takes only a moment to get started and I reckon not much more than half an hour to get to something really good. These projects can be made public and shared with friends, colleagues and communities – or by pressing a button can be submitted to the Zooniverse team for a review (to make sure our core guarantee of never wasting people’s time is preserved), beta test (to make sure it’s usable!), and then launch.
We’ve done this because we know that finding time and funding for web development is the bottleneck that prevents good projects being built. For the kind of simple interactions supported by the project builder, we’ve built enough examples that we know what a good and engaging project looks like. We’ll still build new and novel custom projects helping the Zooniverse to grow, but today’s launch should mean a much greater number of engaging and exciting projects that will lead to more research, achieved more quickly.
We hope you enjoy the new Zooniverse, and comments and feedback are very welcome. I’m looking forward to seeing what people do with our new toy.
PS You can read more about building a project here, about policies for which projects are promoted to the Zooniverse community here and get stuck into the new projects at www.zooniverse.org/#/projects.
PPS We’d be remiss if we didn’t thank our funders, principally our Google Global Impact award and the Alfred P. Sloan Foundation, and I want to thank the heroic team of developers who have got us to this point. I shall be buying them all beer. Or gin. Or champagne. Or all three.
Hi everyone, I’d like to let you know about a cool new project we are involved with. VOLCROWE is a three-year research project funded by the Engineering and Physical Sciences Research Council in the UK, bringing together a team of researchers (some of whom are already involved with the Zooniverse, like Karen Masters) from the Universities of Portsmouth, Oxford, Manchester and Leeds. The PI of the project, Joe Cox, says: “Broadly speaking, the team wants to understand more about the economics of the Zooniverse, including how and why it works in the way that it does. Our goal is to demonstrate to the community of economics and management scholars the increasingly amazing things that groups of people can achieve when they work together with a little help from technology. We believe that Zooniverse projects represent a specialised form of volunteering, although the existing literature on the economics of altruism hasn’t yet taken into account these new ways in which people can give their time and energy towards not-for-profit endeavours. Working together with Zooniverse volunteers, we intend to demonstrate how the digital economy is making it possible for people from all over the world to come together in vast numbers and make a contribution towards tackling major scientific problems such as understanding the nature of the Universe, climate change and even cancer.
These new forms of volunteering exemplified by the Zooniverse fundamentally alter the voluntary process as it is currently understood. The most obvious change relates to the ways in which people are able to give their time more flexibly and conveniently, such as contributing during their daily commute using a smartphone! It also opens new possibilities for the social and community aspects of volunteering in terms of creating a digitally integrated worldwide network of contributors. It may also be the case that commonly held motivations and associations with volunteering don’t hold or work differently in this context. For example, religious affiliations and memberships may or may not be as prevalent as they are with more traditional or recognised forms of volunteering. With the help of Zooniverse volunteers, the VOLCROWE team are exploring all of these issues (and more) with the view to establishing new economic models of digital volunteering.
To achieve this aim, we are going to be interacting with the Zooniverse community in a number of ways. First, we’ll be conducting a large-scale survey to find out more about its contributors (don’t worry – you do not have to take part in the survey or give any personal information if you do not want to!). The survey data will be used to test the extent to which assumptions made by existing models of volunteering apply and, if necessary, to formulate new ones. We’ll also be taking a detailed look at usage statistics from a variety of projects and will test for trends in the patterns of contributions across the million (and counting) registered Zooniverse volunteers. This larger-scale analysis will be supplemented with a number of smaller sessions with groups of volunteers to help develop a more nuanced understanding of people’s relationships with and within the Zooniverse. Finally, we’ll be using our expertise from the economic and management sciences to study the organisation of the Zooniverse team themselves and analyse the ways and channels they use to communicate and to make decisions. In short, with the help of its volunteers, we want to find out what makes the Zooniverse tick!
In the survey analysis, no information will be collected that could be used to identify you personally. The only thing we will ask for is a Zooniverse ID so that we can match up your responses to your actual participation data; this will help us address some of the project’s most important research questions. The smaller group and one-to-one sessions will be less anonymous by their very nature, but participation will be on an entirely voluntary basis and we will only ever use the information we gather in a way in which you’re comfortable. The team would really appreciate your support and cooperation in helping us to better understand the processes and relationships that drive the Zooniverse. If we can achieve our goals, we may even be able to help to make it even better!”
Keep an eye out for VOLCROWE over the coming weeks and months; they’d love you to visit their website and follow them on Twitter.
Anyone browsing the BBC News Technology section last night might have seen an unexpected appearance of a couple of our projects in this story about illegal streaming of Premier League football games. The story started on Saturday with an email from a volunteer pointing out that Virgin Media, a major Internet Service Provider in the UK, were blocking access to Notes From Nature. All is well now, but if you do experience problems please let us know. If you’d like the background, then read on.
In case you haven’t noticed I’ve had a pretty busy five years at the Zooniverse. With more than 25 projects launched in fields from astronomy to biodiversity and from climatology all the way to zoology, it’s been an incredible experience to work with so many new science teams hungry for answers to research questions that can only be answered by enlisting the help of a large number of volunteers. This model of citizen science, one where we boil down the often complex analysis task brought to us by a science team to the ‘simplest thing that will work’, build a rich user experience and then ask a bunch of people to help, seems to work pretty well.
For me, one of the best aspects of what I get to do is that I work in a domain that is an inherently open way of doing research. Having joined Zooniverse when we were still ‘just’ Galaxy Zoo, to see the range of projects we host broaden and to watch our community mature has been a remarkable experience. With our latest endeavour – the Galaxy Zoo Quench project – it’s clear that the line between the activities of the ‘science’ team and the ‘volunteers’ is becoming less defined by the day. Citizen-led science in the Zooniverse began with a group of people in the Galaxy Zoo Forum, ‘The Peas Corp’, when they discovered a new class of galaxy, and it continues today with volunteers discovering new types of worms, exotic exoplanets and even, through Quench, analysing and writing a new paper as a group. These of course are just examples I’ve taken from the Zooniverse and there are many more in other projects run by other people, but in each case the result is the same: by engaging the public in a meaningful way, Citizen Science is challenging the centuries-old practices of academia and that has to be a good thing.
The opportunity to change the way science is done, whether it’s building software to increase efficiency or developing new collaboration models, is what brought me to the Zooniverse and now it’s what is leading me away. At the end of September this year I’m going to be hanging up my hat as Technical Lead of the Zooniverse and joining GitHub as their ‘science guy’.
As with all big decisions in life this wasn’t an easy one. I feel very fortunate to have had the opportunity to give technical direction to an incredible team of scientists, developers, educators and designers here at the Adler and the wider Zooniverse. But over the past couple of years I’ve also got to know a number of the GitHub folks and I’ve been hugely impressed by their focus on building the very best platform possible for online collaboration. Starting with the very simple idea that ‘it should be easier to work together than alone’ they’ve clearly nailed what it looks like to work on a problem with others in code. But software isn’t the only thing people are sharing on GitHub – legislators are publishing drafts of state law, technicians are documenting scientific laboratory protocols and with tools like the IPython Notebook researchers have defined formats and means of sharing entire research workflows.
The mantra of ‘collaborative versioned science’ has been rattling around my head now for a couple of years. I believe there’s an opportunity for GitHub to be the platform for capturing the process of scientific discovery and I want to help make that happen.
So what does this mean for the Zooniverse? Well, I’m leaving at a pretty good time as the Zooniverse has never been healthier – there’s a first-class web and education team of twelve people I’m going to be leaving behind at the Adler Planetarium in Chicago and we’ve just secured several large grants to expand our sister team at The University of Oxford to ten people (watch this space for job ads).
PS If you’d like to know more about what work looks like as a Technical Lead of the Zooniverse then I’ve written recently about some of the problems we’ve addressed over the past few years here, here and here.