Category Archives: Technical

Fixed cross-site scripting vulnerability on project home pages

We recently fixed a security vulnerability in the way project titles are handled on project home pages. Prior to this it was possible to create a project which included Javascript in its name, and thus inject code into the page. After investigating this incident, we have determined that this vulnerability has not been exploited for any malicious purpose; no data was leaked and no users were exposed to injected code.

This vulnerability was reported to us on June 20, 2018, by Lacroute Serge. We began testing fixes around three hours later, which were deployed about 15 hours after the original report, on June 21, 2018.

The fixes for this vulnerability are contained in pull requests #4710 and #4711 for the Panoptes Front End project on GitHub. Anyone running their own hosted copy of this should pull these changes as soon as possible.

We have investigated the cause and assessed the impact of this vulnerability. A summary of what we found follows:

  • No data was leaked as a result of this vulnerability. The vulnerability was not exploited for any malicious purpose and there was no unauthorised access to any of our systems.
  • The vulnerability was introduced on September 12, 2017, in a change which was part of our work to allow projects to be translated into multiple languages.
  • We found three projects that contained exploits for this vulnerability (not including projects created by our own team for testing purposes): two were created before the vulnerability was introduced, so the exploit wouldn’t have worked at the time they were created (it might have worked if the projects were visited between September 12, 2017, and June 21, 2018, but no-one did so); the remaining project was created by the security researcher who reported the vulnerability.
  • Our audit included previous titles for projects (all changes to projects are versioned, so we were able to audit any project titles which have since been changed).
  • All three projects contained only benign code to display a JavaScript alert box. None of them attempted to perform any malicious actions.
  • No users other than the project owner and members of our development team visited any of these projects, so no other users activated any of the exploits.

We’d like to thank Lacroute Serge for reporting this vulnerability to us via the method detailed on our security page, following responsible disclosure by reporting it to us in private to give us the opportunity to fix it.

Panoptes CLI 1.0.1 and Panoptes Client for Python 1.0.1

We’ve recently released updates for the Panoptes command-line interface and the Panoptes Client module for Python containing a few bug fixes.

From the changelog for Panoptes Client:

  • Fix: Exports are not automatically decompressed on download
  • Fix: Unable to save a Workflow
  • Fix: Fix typo in documentation for Classification
  • Fix: Fix saving objects initialised from object links

And from the CLI:

  • Fix: Modifying projects makes them private

You can install the updates by running pip install -U panoptescli and pip install -U panoptes-client.

Panoptes CLI 1.0, a command-line interface for managing projects

Following on from the release of Panoptes Client 1.0 for Python, we’ve just released version 1.0 of the Panoptes CLI. This is a command-line client for managing your projects, because some things are just easier in a terminal! The CLI lets you do common project management tasks, such as activating workflows, linking subject sets, downloading data exports, and uploading subjects. Let’s jump in with a few examples.

First, downloading a classification export (obviously you’d insert your own project ID and a filename of your choice):

panoptes project download 764 Downloads/pulsar-hunters-classifications.csv

cli-classification-download.gif

This command will optionally generate a new export and wait for it to be ready before downloading. No more waiting for the notification email!

New subjects can be uploaded to a new subject set like so (again, inserting your own IDs):

panoptes subject-set create 7 "November 2017 subjects"
panoptes subject-set upload-subjects 16401 manifest.csv

cli-subject-upload.gif

You can also pipe the output from the CLI into other standard commands to do more powerful things, such as linking every subject set in your project to a workflow using the xargs command (where 1234 and 5678 are your project ID and workflow ID respectively):

panoptes subject-set ls -q -p 1234 | xargs panoptes workflow add-suject-sets 5678

Visit GitHub to get started with the CLI today!

Introducing Panoptes Client 1.0 for Python

I’m happy to announce that the Panoptes Client package for Python has finally reached version 1.0, after nearly a year and a half of development. With this package, you can automate the management of your projects, including uploading subjects, managing subject sets, and downloading data exports.

There’s still more work to do – I have lots of additional features and improvements planned for version 1.1 – but with the release of version 1.0, the Client has a stable set of core features which are useful for managing projects (both large and small).

I know a lot of people have already been using the 0.x versions while we’ve been working on them, so thanks to everyone who submitted feature requests, bug reports, and pull requests on GitHub. Please do upgrade to the latest version to make sure you have the latest bug fixes, and keep the requests and bug reports coming!

You can find installation and upgrade instructions on GitHub, and full documentation on Read the Docs.

Stargazing Live 2017 Recap

We recently had a very successful (and longer than usual) Stargazing Live. I wanted to talk a little about the work that our team did in the weeks leading up to this and also recap what actually happened behind the scenes during the two weeks of events.

If you’re not familiar with it, Stargazing Live is an annual astronomy TV show on BBC Two in the UK, which is broadcast live on three consecutive nights. Each year we launch a project in collaboration with the show, and this always proves to be the busiest time of our year. This year, for the first time there was a second week of shows for ABC Australia, so this time we launched two projects instead of one: Planet 9 and Exoplanet Explorers.

A lot of work went into making sure that our site stayed up for this year’s shows. In previous years we’ve had issues that have resulted in either a brief outage or reduced performance for at least some of the time during the show. This year everything worked perfectly and we actually found ourselves reducing our capacity (scaling down) much sooner than we anticipated. The prep work fell into three areas:

  • Optimisations to the frontend to reduce the number of API calls made by the site while people were using it. This involved a combination of refactoring, fixing bugs, and modifying the backend to return frequently requested data without it having to be requested separately (e.g. when checking if the user has favourited a subject).
  • Reducing the load on our databases. We reduced the number of requests that result in database queries through caching in the backend (with memcache), and we started using a new microservice (called Designator) to keep track of what each user has seen and serve them new subjects. We also separated some services onto a read replica rather than having them query the primary database.
  • Adding feature flags so that we could turn off anything non-essential, and so that we could shut down any features that were causing problems, using the Flipper Ruby gem.
The Oxford team gathers in the office to watch the show.

On the first night of the BBC show it was all hands on deck. Our teams in the US and the UK were in our offices, despite it being evening in the UK, and in Oxford we gathered around the TV expectantly awaiting the moment when Chris would announce the project’s URL on air. That moment is usually a bit frantic, as several thousand people all turn up on the site at once and start clicking around, registering, logging in, and submitting classifications. We’re always closely watching our monitoring systems, keeping an eye on various performance metrics, watching for any early signs of problems that might affect the performance of the site. This year when that moment came the number of visitors on site shot up to over 5,000, and then… everything just kept running smoothly.

The first night of the BBC show we peaked at about 0.9 million requests per hour, with 1.1 million per hour the second night.

Requests to Zooniverse.org during BBC Stargazing Live 2017.

We scaled our API service to 50 of EC2’s m3.medium instances the first night and the average CPU utilisation of these instances reached about 30% at peak traffic. The next two nights we reduced the number of instances to 40. In hindsight we could have gone even lower, but from past experience the amount of traffic we receive on the second and third nights can be difficult to predict, so we decided to play it safe.

API scaling and CPU utilisation during BBC Stargazing Live 2017.

Traffic during the ABC show was lower than during the BBC show (Australia has a smaller population than the UK, so this was as expected). That week we scaled the API to 40 instances the first night, and 20 instances for the second and third nights.

In the past we’ve had problems with running out of available connections in PostgreSQL. The connection limit depends on available memory, and we find this to be more of a problem than CPU or network constraints. During the shows we scaled the PostgreSQL instance for our main API to RDS’s m4.10xlarge and our Talk/microservices database to m4.2xlarge, primarily to give us enough leeway to avoid the connection limit. In the future we’d like to implement connection pooling to avoid this.

This was all a big improvement on previous years. While before we found ourselves extremely busy fighting fires and fixing bugs between shows, this time we had time to just relax and watch the show. We have more work to do on optimisations, because we did still have to scale up our capacity more than we’d like, but overall we’re very happy with how well things went this year.

 

Pop-ups on Comet Hunters

pasted-image-at-2016_10_20-11_05-am

 

We’re testing out a new feature of our interface, which means if you’re classifying images on Comet Hunters you may see occasional pop-up messages like the one pictured above.

The messages are designed to give you more information about the project. If you do not want to see them, you have the option to opt-out of seeing any future messages. Just click the link at the bottom of the pop-up.

You can have a look at this new feature by contributing some classifications today at www.comethunters.org.

Emails from the Zooniverse

pasted-image-at-2016_09_16-02_48-pm
Click this image to be taken to your Zooniverse email settings

We’re cleaning up our email list to make sure that we do not email anyone who does not want to hear from us. You will have got an email last week asking you if you want to stay subscribed. If you did not click the link in that email, then you will have received one today saying you have been unsubscribed from our main mailing list. Don’t worry! If you still want to receive notifications from us regarding things like new projects, please go to www.zooniverse.org/settings/email and make sure you’re subscribed to general Zooniverse email updates.
NOTE: This has not affected emails you get from individual Zooniverse projects.

Lost Classifications

We’re sorry to let you know that at 16:29 BST on Wednesday last week we made a change to the Panoptes code which had the unexpected result that it failed to record classifications on six of our newest projects; Season Spotter, Wildebeest Watch, Planet Four: Terrains, Whales as Individuals, Galaxy Zoo: Bar Lengths, and Fossil Finder. It was checked by two members of the team – unfortunately, neither of them caught the fact that it failed to post classifications back. When we did eventually catch it, we fixed it within 10 minutes. Things were back to normal by 20:13 BST on Thursday, though by that time each project had lost a day’s worth of classifications.

To prevent something like this happening in the future we are implementing new code that will monitor the incoming classifications from all projects and send us an alert if any of them go unusually quiet. We will also be putting in even more code checks that will catch any issues like this right away.

It is so important to all of us at the Zooniverse that we never waste the time of any of our volunteers, and that all of your clicks contribute towards the research goals of the project. If you were one of the people whose contributions were lost we would like to say how very sorry we are, and hope that you can forgive us for making this terrible mistake. We promise to do everything we can to make sure that nothing like this happens again, and we thank you for your continued support of the Zooniverse.

Sincerely,

The Zooniverse Team

Crowdsourcing and basic data visualization in the humanities

In late July I led a week-long course about crowdsourcing and data visualization at the Digital Humanities Oxford Summer School. I taught the crowdsourcing part, while my friend and collaborator, Sarah, from Google, lead the data visualization part. We had six participants from fields as diverse as history, archeology, botany and literature, to museum and library curation. Everyone brought a small batch of images, and used the new Zooniverse Project Builder (“Panoptes”) to create their own projects. We asked participants what were their most pressing research questions? If the dataset were larger, why would crowdsourcing be an appropriate methodology, instead of doing the tasks themselves? What would interest the crowd most? What string of questions or tasks might render the best data to work with later in the week?

Within two days everyone had a project up and running.  We experienced some teething problems along the way (Panoptes is still in active development) but we got there in the end! Everyone’s project looked swish, if you ask me.

Digging the Potomac

Participants had to ‘sell’ their projects in person and on social media to attract a crowd. The rates of participation were pretty impressive for a 24-hour sprint. Several hundred classifications were contributed, which gave each project owner enough data to work with.

But of course, a good looking website and good participation rates do not equate to easy-to-use or even good data! Several of us found that overly complex marking tasks rendered very convoluted data and clearly lost people’s attention. After working at the Zooniverse for over a year I knew this by rote, but I’d never really had the experience of setting up a workflow and seeing what came out in such a tangible way.

Despite the variable data, everyone was able to do something interesting with their results. The archeologist working on pottery shards investigated whether there was a correlation between clay color and decoration. Clay is regional, but are decorative fashions regional or do they travel? He found, to his surprise, that they were widespread.

In the end, everyone agreed that they would create simpler projects next time around. Our urge to catalogue and describe everything about an object—a natural result of our training in the humanities and GLAM sectors—has to be reined in when designing a crowdsourcing project. On the other hand, our ability to tell stories, and this particular group’s willingness to get to grips with quantitative results, points to a future where humanities specialists use crowdsourcing and quantitative methods to open up their research in new and exciting ways.

-Victoria, humanities project lead

A whole new Zooniverse

Anyone heading over to the Zooniverse today will spot a few changes (there may also be some associated down-time, but in this event we will get the site up again as soon as possible). There’s a new layout for the homepage, a few new projects have appeared and there’s a new area and a new structure to Talk to enable you to discuss the Zooniverse and citizen science in general, something we hope will bring together conversations that until now have been stuck within individual projects.

Our new platform, Panoptes, is name after Argus Panoptes, a many-eyed giant form Greek mythology.
Our new platform, Panoptes, is named after Argus Panoptes, a many-eyed giant form Greek mythology. Image credit: http://monsterspedia.wikia.com/wiki/File:Argus-Panoptes.jpg

What you won’t see immediately is that the site is running on a new version of the Zooniverse software, codenamed ‘Panoptes‘. Panoptes has been designed so that it’s easier for us to update and maintain, and to allow more powerful tools for project builders. It’s also open source from the start, and if you find bugs or have suggestions about the new site you can note them on Github (or, if you’re so inclined, contribute to the codebase yourself). We certainly know we have a lot more to do; today is a milestone, but not the end of our development. We’re looking forward to continuing to work on the platform as we see how people are using it.

Panoptes allows the Zooniverse to be open in another way too. At its heart is a project building tool. Anyone can log in and start to build their own Zooniverse-style project; it takes only a moment to get started and I reckon not much more than half an hour to get to something really good. These projects can be made public and shared with friends, colleagues and communities – or by pressing a button can be submitted to the Zooniverse team for a review (to make sure our core guarantee of never wasting people’s time is preserved), beta test (to make sure it’s usable!), and then launch.

We’ve done this because we know that finding time and funding for web development is the bottleneck that prevents good projects being built. For the kind of simple interactions supported by the project builder, we’ve built enough examples that we know what a good and engaging project looks like. We’ll still build new and novel custom projects helping the Zooniverse to grow, but today’s launch should mean a much greater number of engaging and exciting projects that will lead to more research, achieved more quickly.

We hope you enjoy the new Zooniverse, and comments and feedback are very welcome. I’m looking forward to seeing what people do with our new toy.

Chris

PS You can read more about building a project here, about policies for which projects are promoted to the Zooniverse community here and get stuck into the new projects at www.zooniverse.org/#/projects.

PPS We’d be remiss if we didn’t thank our funders, principally our Google Global Impact award and the Alfred P. Sloan Foundation, and I want to thank the heroic team of developers who have got us to this point. I shall be buying them all beer. Or gin. Or champagne. Or all three.