
Galaxy Zoo is Open Source

It’s always a good feeling to be making a codebase open, and today it’s time to push the latest version of Galaxy Zoo into the open. As I talked about in my blog post a couple of months ago, making open source code the default for Zooniverse is good for everyone involved with the project.

One significant benefit of making code open is that from here on out it’s going to be much easier to have Zooniverse projects translated into your favourite language. When we build a new project we typically extract the content into something called a localisation file (or localization if you prefer your en_US), which is basically just a plain text file that our application uses. You can view our (US) English translation file here and it looks a little like this:

[Image: excerpt from the English (En) translation file]

So how do I translate Galaxy Zoo?

I’m glad you asked… It turns out there’s a feature built into the code-hosting platform we’re using (called GitHub) which allows you to make your own copy of the Galaxy Zoo codebase. It’s called ‘forking’ and you can read much more about it here, but all you need to do to contribute is fork the Galaxy Zoo code repository, add in your new translation file (there’s a handy script that will generate a template file based on the English version), translate the English values into the new language and send the changes back up to GitHub.
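For a sense of what such a template script might do, here is a minimal sketch in Python. It assumes the localisation file can be treated as a nested dictionary of keys and English strings; the function names `make_template` and `missing_keys` and the sample keys are hypothetical and not taken from the Galaxy Zoo repository, whose own script and file format may well differ.

```python
def make_template(english, placeholder=""):
    """Return a copy of a nested translation dict with every string blanked.

    Keys are preserved so a translator can see exactly what needs filling in.
    """
    template = {}
    for key, value in english.items():
        if isinstance(value, dict):
            # Recurse into nested sections (e.g. per-page groups of strings).
            template[key] = make_template(value, placeholder)
        else:
            template[key] = placeholder
    return template


def missing_keys(english, translation, prefix=""):
    """List dotted key paths present in English but empty or absent in the translation."""
    missing = []
    for key, value in english.items():
        path = f"{prefix}{key}"
        other = translation.get(key) if isinstance(translation, dict) else None
        if isinstance(value, dict):
            missing.extend(missing_keys(value, other or {}, prefix=path + "."))
        elif not other:
            missing.append(path)
    return missing


# Hypothetical sample content, for illustration only.
english = {
    "home": {"title": "Welcome", "intro": "Help us classify galaxies."},
    "classify": {"next": "Next"},
}

template = make_template(english)
```

A check like `missing_keys(english, your_translation)` is one way a translator could confirm that nothing was left untranslated before sending the changes back.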

Once you’re happy with the new translation and you’d like us to try it out you can send us a ‘pull request’ (details here). We can then review the changes and, if everything looks good, pull the new translation into the main Galaxy Zoo codebase. You can see an example of a pull request from Robert Simpson that’s been merged in here.

So what next?

This method of translating projects is pretty new for us and so we’re still finding our way a little here. As a bunch of developers it feels great to be using the awesome collaborative toolset that the GitHub platform offers to open up code and translations to you all.

Cheers

Arfon

Optimizing for interest: Why people aren’t machines

One of the joys of working in the Zooniverse is the sheer variety of people who are interested in our work, and I spent a happy couple of days toward the end of last year at a symposium about Discovery Informatics – alongside a bunch of AI researchers and their friends who are trying to automate the process of doing science. I don’t think they’d mind me saying that we’re a long, long way from achieving that, but it was a good chance to muse on some of the connections between the work done by volunteers here and by our colleagues who think about machine learning.

In the past we’ve shown that machines can learn from us, but we’ve also talked about the need for a system that can combine the best of human and machine.

These two things are not the same
Robot and human (Thanks to Flickr user NineInchNachosXI)

I’m still convinced that that will especially be needed as the size of datasets produced by scientific surveys continues to increase at a frightening pace. The essential idea is that only the proportion of the data which really needs human attention need be passed to human classifiers; an idea that starts off as a no-brainer (wouldn’t it be nice if we could decide in advance which proportion of Galaxy Zoo systems are too faint or fuzzy for sensible decisions to be made?) and then becomes interestingly complex.

This is particularly true when you start thinking of volunteers not as a crowd, but as a set of individuals. We know from looking at the data from past projects that people’s talents are varied – the people who are good at identifying spiral arms, for example, may not be the same people who can spot the faintest signs of a merger. So if we want to be most efficient, what we should be aiming for is passing each and every person the image that they’d be best at classifying.

That in turn is easy to say, but difficult to deliver in practice. Since the days of the original Galaxy Zoo we’ve tended to shun anything that resembles a test before a volunteer is allowed to get going, and in any case a test which thoroughly examined someone’s ability in every aspect of the task (how do they do on bright galaxies? on faint ones? on distant spirals? on nearby ellipticals? on blue galaxies? what about mergers?) wouldn’t be much fun.

One solution is to use the information we already have; after all, every time someone provides a classification we learn something not only about the thing they’re classifying but also about them. This isn’t a new idea – in astronomy, I think it’s essentially the same as the personal equation used by stellar observers to combine results from different people – but things have got more sophisticated recently.

As I’ve mentioned before, a team from the robotics group in the department of engineering here in Oxford took a look at the classifications supplied by volunteers in the Galaxy Zoo: Supernova project and showed that by classifying the classifiers we could make better classifications. During the Discovery Informatics conference I had a quick conversation with Tamsyn Waterhouse, a researcher from Google interested in similar problems, and I was able to share results from Galaxy Zoo 2 with her*.

We didn’t get time for a long chat, but I was delighted to hear that work on Galaxy Zoo had made it into a paper Tamsyn presented at a different conference. (You can read her paper here, or in Google’s open access repository here.) Her work, which is much wider than our project, develops a method which considers the value of each classification based (roughly) on the amount of information it provides, and then tries to seek the shortest route to a decision. And it works – she’s able to show that by applying these principles we would have been done with Galaxy Zoo 2 faster than we were – in other words, we wasted some people’s time by not being as efficient as we could be.
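To make the idea a little more concrete, here is a rough sketch in Python of one simple way to weigh classifications by how informative they are and to stop once a decision has been reached. It is not the method from Tamsyn’s paper: it assumes a single yes/no question and that each volunteer has a known probability of answering correctly (a hypothetical `accuracy` value), and it applies Bayes’ rule until the posterior passes a chosen threshold.

```python
def update(prior, answer, accuracy):
    """Bayes update of P(galaxy is a spiral) after one yes/no answer.

    `accuracy` is the assumed probability that this volunteer answers correctly.
    """
    like_if_true = accuracy if answer else 1.0 - accuracy
    like_if_false = 1.0 - accuracy if answer else accuracy
    evidence = like_if_true * prior + like_if_false * (1.0 - prior)
    return like_if_true * prior / evidence


def classify_until_confident(answers, prior=0.5, threshold=0.95):
    """Consume (answer, accuracy) pairs until the posterior is decisive.

    Returns (posterior, number_of_classifications_used).
    """
    posterior = prior
    used = 0
    for answer, accuracy in answers:
        posterior = update(posterior, answer, accuracy)
        used += 1
        if posterior >= threshold or posterior <= 1.0 - threshold:
            break  # further classifications of this image add little
    return posterior, used


# Illustrative numbers only: two reliable volunteers settle the question,
# so the remaining answers are never needed.
answers = [(True, 0.9), (True, 0.9), (True, 0.6), (False, 0.55)]
posterior, used = classify_until_confident(answers)
```

In this toy case the image is retired after two classifications rather than four, which is the kind of saving the paper is pointing at; an answer from a volunteer with an accuracy of 0.5 would leave the posterior unchanged, because it carries no information.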

A reminder of what Galaxy Zoo 2 looked like!

That doesn’t sound good – not wasting people’s time is one of the fundamental promises we make here at the Zooniverse (it’s why we spend a lot of time selecting projects that genuinely need human classifications). Zoo 2 was a long time in the past, but knowing what we know now should we be implementing a suitable algorithm for all projects from here on in?

Probably not. There are some fun technical problems to solve before we could do that anyway, but even if we could, I don’t think we should. The current state of the art of such work misses, I think, a couple of important factors which distinguish citizen science projects from the other examples considered, in Tamsyn’s paper in particular. To state the obvious: volunteer classifiers are different from machines. They get bored. They get inspired. And they make a conscious or an unconscious decision to stay for another classification or to go back to the rest of the internet.

The interest a volunteer will have in a project will change as they move (or are moved by the software) from image to image and from task to task, and in a complicated way. Imagine getting a galaxy that’s difficult to classify; on a good day you might be inspired by the challenge and motivated to keep going, on a bad one you might just be annoyed and more likely to leave. We all learn as we go, too, and so our responses to particular images change over time. The challenge is to incorporate these factors into whatever algorithm we’re applying so that we can maximise not only efficiency, but interest. We might want to show the bright, beautiful galaxies to everyone, for example. Or start simple with easy examples and then expand the range of galaxies that are seen to make the task more difficult. Or allow people a choice about what they see next. Or a million different things.

Whatever we do, I’m convinced we will need to do something; datasets are getting larger and we’re already encountering projects where the idea of getting through all the data in our present form is a distant dream. Over the next few years, we’ll be developing the Zooniverse infrastructure to make this sort of experimentation easier, looking at theory with the help of researchers like Tamsyn to see what happens when you make the algorithms more complicated, and talking to our volunteers to find out what they want from these more complicated projects – all in our twin causes of doing as much science as possible, while providing a little inspiration along the way.

* – Just to be clear, in both cases all these researchers got was a table of classifications without any way of identifying individual volunteers except by a number.

Project Workshop Winners

We were delighted by the response to our call for volunteers to attend our project workshop and we’re pleased to announce that our two winners are Katy Maloney and Janet Bain. Katy is a Planet Hunter from Montreal (you can see her in this recent video about online communities). Janet is well known to those from Old Weather, where she serves as moderator of the very active forum.

As Jules explained in her post, these workshops are a chance for the strange mix of people behind the scenes of the Zooniverse – developers, educators and scientists – to get together to discuss what works and what doesn’t, and to plan the year ahead. We think it’s very important to have volunteers there – and we hope that Katy and Janet (along with Jules, who we’ve invited back) will keep you all informed and involved in the discussions.

There were a few comments in the discussion under that last post from people – particularly locals – who would clearly have dearly loved to come. Unfortunately, it wouldn’t be possible to run the workshop as a public event; both because of the format (which features spontaneously arranged small group discussions) and also to allow everyone to speak freely about often quite difficult issues. What I do find heartening is that we’ve grown a community who want to help us plan and develop for the future, and we need to take that seriously.

I’ll write more over the next couple of weeks and months about what we’re going to do to be more open, but for now for those who really wanted to come we’ll work hard to organise some truly public events. We have a meeting in Oxford on the 22nd June which I hope British Zooites will be able to attend, and we’ll arrange a similar event in Chicago as soon as possible. We’ll also try hard to webcast these events so all can attend.

Chris

ARCHIVE: Why SciStarter.com is Bad For Citizen Science

Since this post was written, SciStarter have changed their policies and now provide direct links to projects. This is a good thing, and I’m happy to acknowledge it here.

Chris – April 2017

Preface: I’d like to begin by saying that I’ve met Darlene Cavalier at conferences in the past and I’m a big supporter of her efforts. Darlene truly is a ‘cheerleader’ for citizen science; her enthusiasm is infectious and the citizen science domain is clearly a better place with her. I’m writing here about what I consider the bad practice of SciStarter.com and Science For Citizens LLC, their parent organisation. I have no idea whether the issues highlighted here are because of decisions that she has made.

There was a time not so long ago when you needed a new account for pretty much everything you tried out on the web. Want to upload photos to Flickr? Then sign up for a Yahoo! ID. Want a blog? Then give WordPress or Tumblr your details. Feeling social? Then Facebook, Twitter or MySpace would pretty much want the same information. These days there are a number of solutions that allow you to log in to web-based services using things like your Facebook, Twitter or Google account. Under the hood these solutions typically rely on a couple of protocols such as OAuth and OpenID and often still request your email address when you sign in, but the days of hundreds of accounts each with their own password to remember are coming to a close.

In many ways a request by an organisation for your email address when signing up for a new service is completely reasonable. In exchange for handing over your email address and a few personal details these tools were often available for free – both parties win. There is of course the discussion around who or what is the product when you use these free services but let’s not go into that here.

Since launching the original Galaxy Zoo back in 2007 we’ve encouraged our volunteer community to register for an account with us, although for the vast majority of our projects (and all of our recent ones) this login/signup is an optional step. For the Zooniverse there are two main reasons for asking you to create an account:

1) When we publish a paper as a result of your efforts we feel extremely strongly about crediting you for your work. Experience has taught us that attempting to publish a paper with 170,000 authors on it is somewhat frowned upon by the journals, but if you take a look at any of the Zooniverse publications you’ll find a link to an authors page such as here, here and here. We can only credit you if you share some personal information with us when you sign up.

2) For our research methods to work well, identifying an individual ‘classifier’ is pretty important. You can read more about this here (the original Galaxy Zoo paper) or here but in order to produce the best results possible we spend lots of time working out who is ‘best’ at a particular task and weighting their contributions accordingly. Being able to reliably identify an individual throughout the lifetime of a project (and even between projects) is most simple when someone has logged in.
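As a rough illustration of the second point, here is a minimal sketch in Python of a weighted vote in which each classification counts in proportion to a weight attached to the person who made it. The weighting schemes used in the Galaxy Zoo papers are more involved than this; the user ids, weights and function name below are assumptions made up for the example.

```python
from collections import defaultdict


def weighted_vote(classifications, weights, default_weight=1.0):
    """Combine answers for one image into weighted vote fractions.

    classifications: list of (user_id, answer) pairs
    weights: dict mapping user_id to a weight; unknown users get default_weight
    """
    totals = defaultdict(float)
    for user_id, answer in classifications:
        totals[answer] += weights.get(user_id, default_weight)
    grand_total = sum(totals.values())
    return {answer: total / grand_total for answer, total in totals.items()}


# Hypothetical user ids and weights, for illustration only.
weights = {101: 2.0, 102: 1.0, 103: 0.5}
classifications = [(101, "spiral"), (102, "elliptical"), (103, "elliptical"), (104, "spiral")]
fractions = weighted_vote(classifications, weights)
```

Because the weights are keyed by a user id, a scheme like this only works if the same person can be recognised from one classification to the next, which is why a login helps.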

Over the past year or so I’ve become increasingly concerned by the behaviour of SciStarter.com – a website that indexes citizen science projects from across the web. The site does a pretty good job of cataloguing citizen science projects you can contribute to – when you visit the site and search, for example, for ‘bats’, the Zooniverse project Bat Detective is listed in the results. Selecting the result takes you to a brief summary of Bat Detective and offers you a link to ‘get started now!’ and this is where it goes wrong: rather than being taken straight to the Bat Detective site, you have to be ‘logged in’. Sign up for what exactly? Am I signing up to take part in Bat Detective? No. You’re actually just signing up for an account with SciStarter.com so you can get a link to a project that SciStarter.com has nothing to do with.

Additionally, in a recent ‘top 10’ blog post of most successful citizen science projects of 2012, Bat Detective was highlighted. Did the link in this article send you straight to the Bat Detective website? Sadly not, it of course links to SciStarter’s catalogue page about Bat Detective which requires account registration before you can access the URL.

To me this doesn’t seem right and in many ways this is just exploiting people’s lack of experience and understanding of the web. There’s a reason that Facebook.com is consistently among the most Googled terms – many people just don’t quite understand how the web works and I think SciStarter.com are exploiting this. Conversely, for those who are a little more web savvy these tactics must seem very clumsy. Perhaps more importantly though, it’s widely recognised that signup forms are a barrier to entry for many people and so by having people jump through this hoop SciStarter.com are actually holding potential citizen scientists back.

I don’t believe it’s in anyone’s interest other than SciStarter’s to require you to sign up to follow a link through to a project. By mandating this step they are building an index of individuals interested in other people’s projects when they don’t have any of their own, and they’re risking confusing new community volunteers about what they have and haven’t signed up for. All of this is made worse by the fact that SciStarter.com is a division of Science for Citizens LLC – a commercial company.

So my challenge to SciStarter.com is this: If you’re so committed to citizen science then why put up this artificial barrier to contribution? Crawling the internet for people’s emails is one of the less tasteful aspects of the web and one I’d hoped we’d seen the end of. So how about it SciStarter?

Calling all Zooites! Your chance to attend the second Zooniverse Project Workshop in Chicago!

Meg Schwamb giving the Planethunters presentation in 2012
Photo © Julia Wilkinson

It’s almost a year since I attended the first ever Zooniverse Project Workshop in my role as an advisory board member. In April the second Zooniverse workshop will convene to discuss yet more exciting new projects. I’ll be there and hopefully so will Alice Sheppard (if her exam timetable permits!). This year, however, there is funding available for one more volunteer to attend. This is a responsible role for a dedicated and enthusiastic Zooite and that could be you!

This is a fantastic opportunity to meet the science teams behind projects old and new and to find out just what is involved in getting a project up and running. You will attend some great presentations and have the chance to contribute to some fascinating discussions and workshops. Last year we covered things such as design, how to get the best science out of a project and how to create the best user experience. You need to be prepared to take part in discussions and to talk about your experiences as a Zooniverse volunteer. The more you put in the more rewarding the conference will be and you’ll find that your contribution will be hugely respected and valued. Volunteers can make or break a project and I was certainly made to feel that my input was extremely important.

There is only one place available, however, so to help the team decide who gets to go please tell us in no more than 250 words a little about yourself, why you think you should go and what you can contribute to the discussions as a volunteer. Please add your full name and preferred e-mail address and send this to team@zooniverse.org with the subject line CHICAGO PLEASE. The closing date is 12 noon GMT on Thursday 7 March 2013. The Zooniverse team will choose the successful entry.

The Adler Planetarium
Photo © Julia Wilkinson

The conference will be held over two days at the Adler Planetarium, Chicago on 29 and 30 April 2013. Flight and hotel expenses will be reimbursed in full.

This really is a fantastic opportunity to contribute to citizen science and the future of the Zooniverse – don’t miss out!

For a detailed account of last year’s event have a look at the notes on my blog.

Making the Zooniverse Open Source

We’re pleased to announce that the time has come to start making the Zooniverse open source. From today, you’ll be able to see several of our current projects on GitHub (at https://github.com/zooniverse) and will be able to fork them and contribute to them.

Taking the Zooniverse open source is something we’ve been thinking about for a long time. As the field of citizen science expands into ever broader domains the number of tools available to people to start their own projects is still low. Since the launch of Galaxy Zoo 2 we’ve been building tools that allow for code reuse across a number of projects and while the majority(1) of our software has never been ‘officially’ open, behind the scenes we’ve been sharing with pretty much anyone who asked, often talking them through the thought process that led us to design our software in a particular way.

Because of our natural inclination to share with those who approached us, we’ve never really made publishing our code a priority. As with most closed source projects there are also a number of pretty boring (but sometimes important) reasons for not publishing. We worried about how usable the code we’d written was to people we didn’t work closely with; as a small team we favour clean code and conversation with other developers over heavy documentation. Some sensitive information around our production environment inevitably slipped into the codebases, which meant lots of work to clean up and security audit our tools. Some of these reasons still hold for legacy applications, but each project we start often comes with a new Git repo and an opportunity to develop in a different way.

What does this mean?

Well, from today you’ll start to see a number of applications appear on the Zooniverse GitHub site. We’re starting with a collection of our most recent projects: Snapshot Serengeti, Bat Detective, Cyclone Center and Seafloor Explorer.

It’s important to say here that we’re not expecting a community of developers to jump in and help us develop new projects (although that would be pretty cool), but if there’s a typo on our site or a really annoying bug that you know exactly how to fix then fork the repo and send us a pull request and we’ll see what we can do. Significantly for our localisation support (translating sites into multiple languages) we’re proposing that new translations should be submitted in exactly this way (2). There are a huge number of very talented people in the Zooniverse community who until today had no way of contributing to the project other than to help analyse data. That changes today.

We’re releasing our software under a very liberal license – Apache 2.0. In very simple terms this means that the tools we develop can be used for whatever you like provided you follow the rules of the Apache 2.0 license.

What aren’t we open sourcing?

In truth lots of legacy code for our older projects isn’t likely to make it into the open. A large number of our projects between 2009 (Galaxy Zoo 2) and 2011 (The Milky Way Project) were all built upon a shared codebase called The Juggernaut. While we’re not making each of the projects open, we are publishing the common application core, which has been kept up to date and runs on Rails 3.1.

We’re also not opening up our applications that hold sensitive user information and are mission-critical for the operation of the Zooniverse. That’s not to say we won’t ever do this, we’re just not comfortable publishing these applications at this point. This basically means that the application that powers Zooniverse Home (www.zooniverse.org) and an application called Ouroboros (api.zooniverse.org) that serves up images and collects back classifications aren’t part of our open source strategy.

Why now?

Aside from the reasons mentioned above, there are a number of reasons to make open source our default position. In part it’s about people – developers these days are often hired (or at least shortlisted) by their GitHub profiles that show which projects they’ve been working on. As our team grows and we hire talented young developers we’re doing them a disservice by not allowing them to show off the awesome work they do. It’s also about the way in which we as the Zooniverse do science. We believe citizen science is an inherently open way of doing research: we often work with open datasets (such as SDSS) and ask people to donate their time and efforts to a project that in the end produces open data products for the research community to enjoy (e.g. data.galaxyzoo.org, data.milkywayproject.org). Having a closed codebase for everything we do just feels incompatible with this way of doing research.

What’s next?

To be honest we’re not quite sure. Going forward, our projects will typically become open source as we launch them. If there’s a Zooniverse project that you think you’d like to rework for a different purpose then there’s now nothing stopping you from doing this. If you’re interested in helping us with a new translation for your favourite project then we’d love to talk. Perhaps you’re just interested to see how some of our applications work. Regardless, we invite you to take a look and give us feedback. The Zooniverse has always been about harnessing the crowd to make science happen. From today, there is a new way for people to contribute to that goal.

Cheers
Arfon

Footnotes:
1. Scribe, our open source text transcription framework grew out of Old Weather and has been used on a number of projects now.
2. A fuller article about language support is coming very soon on this blog.

We’re hiring – come help us build citizen history

I’ve been remiss in not posting our latest job advert on the blog – it’s a full-time developer position in Oxford for someone to lead our new collaboration with the Imperial War Museum’s project to commemorate the First World War. This is an exciting chance to expand what we’ve been doing with projects like Old Weather and we hope that talented front-end developers will apply.

We’re looking for someone who can build beautifully in HTML5/CSS/JavaScript, and who has an understanding of user interface design. If they’re good at working with large and diverse teams, that’d be a bonus too as they’ll be the main point of contact between Zooniverse and IWM. A background in developing highly-usable interfaces for web applications and experience of working with a modern web framework such as Ruby on Rails would be an advantage, as would a history in citizen science, history, science or any combination of the three.

Full details are here, but the upshot is that you’ve got until 5th March to apply.

Why the Zooniverse is easy to use.

A blog post from Adam Stevens today appeared in my Twitter stream, containing some discussion (and criticism) of the Zooniverse in general and Planet 4 in particular.

All debate is useful, so I wanted to respond to a few of the points made. Dispute about whether the main Planet 4 interface is any good scientifically should, I think, be settled by seeing if the team publish a paper with the results – our track record (in need of updating!) is here.

The meat of the post draws a distinction between ‘real science’ – by which I assume the author means analysis, paper writing and so on – and what the main Zooniverse interfaces do, which is described as ‘data analysis’. We’ve been here before, and part of the answer is the same one I gave then: data analysis and classification is as much a part of science as solving an equation, and while there may be scientists who do nothing but think grand analytic thoughts, I’ve never met any.

However, there’s another part to the answer. Zooniverse projects are explicitly designed so that even a brief interaction with the site produces meaningful results. This is partly pragmatic (as this post from our Old Weather project shows, as a rule of thumb half of contributions come from people who only do a few) but it is also because we truly believe in the transformational nature of having someone do something real. Those visiting Zooniverse for the first time are typically not scientists; often they are not yet even fans of science. We know from anecdote and from our own research that for many of these people doing something simple that makes a contribution to our understanding of the Universe is very fulfilling, often unexpectedly so.

More than that, these projects act as engines of motivation. Once people have found their feet in the main interface, once people have got used to the idea that science is now an activity they can participate in, once people are excited to further investigate interesting images and objects that are now theirs, wonderful things happen.

There are great examples from many projects, but on Twitter I pointed to our recent Planet Hunters paper which reported one new confirmed planet and 42 new planet candidates (with greater than 90% certainty of being real) which were discovered by the community active on our Talk discussion tool.

Many of these volunteers (including Kian Jek, who was just awarded the Chambliss prize for achievement by the AAS) are doing far, far more than just using the Planet Hunters interface. But they’re there because they were drawn in by the proposition of the initial site. For many, the motivation to learn about classes of variable stars and the minutiae of transits came only after they’d found something special, and for many the confidence to attack these more detailed questions comes from the initial, guided experience.

As technical supremo Arfon put it on Twitter, the Zooniverse is a set of analysis tasks where scientists need help, and where they will analyze results and report back, but if you’ll come with us there’s a whole world of conversation and discovery that can happen. Drawing a distinction between the two misses the point – without the former, participation in the latter (the ‘real science’, if you must) is limited to those who already have the confidence to participate.

Chris

PS Adam did suggest a specific change: that, as one of the main science goals of Planet 4 is to measure wind speed we should add an arrow allowing people to indicate the wind speed and direction. This seems to me misguided; we’re getting that information from the task that the volunteers are doing in marking the shape, size and direction of the fans. You could add further pedagogical material early on, but this would likely reduce the number of people who make it to the ‘ah ha! I’m doing science!’ moment because we know that it’s very easy to trigger an adverse reaction in the form of a loss of confidence when we ask slightly more abstract questions in the initial phase of engagement with a project. In any case, inference follows measurement – and we’re still at the measurement stage in this strange and fascinating region of Mars.

PPS In the main post, I’ve ignored comments about the relationship between the BBC’s Stargazing Live program and Planet 4. It’s important to realize that the driving force behind the Planet 4 project is Candy Hansen and her team of Martian scientists – ironically, we’d discussed a version of the idea while I was interviewing her for the Sky at Night about 18 months ago. That’s before Planet Hunters was on TV, so it’s dead wrong to say that Planet 4 was cooked up in response to a desire to have something else to do on telly. If there were inaccuracies on camera, I can only plead that live television is tricky and the real test is whether the project produces papers – which will, as any real scientist knows, take time! Stargazing’s commitment to real engagement instead of ‘educational experiments’ is, I think, a huge strength of the series: Here’s the latest news on the planet candidate identified in the 2012 series.

Planet Four and Stargazing Live

Tonight is the start of the 2013 round of the wonderful BBC Stargazing Live in the UK. Three nights of primetime astronomy programmes, hosted live from the iconic Jodrell Bank. Last year the Zooniverse asked the Stargazing Live viewers to find an exoplanet via Planet Hunters (and they did!). This year we want everyone to scour the surface of Mars on our brand new site: Planet Four.

Every Spring on Mars geysers of melting dry ice erupt through the planet’s ice cap and create ‘fans’ on the surface of the Red Planet. These fans can tell us a great deal about the climate and surface of Mars. Using amazing high-resolution imagery from the Mars Reconnaissance Orbiter (MRO) researchers have spent months manually marking and measuring the fans to try and create a wind map of the Martian surface, amongst other things. They’ve now teamed up with us to launch Planet Four, where everyone can help measure the fans and explore the surface of Mars.

Planet Four

The task on Planet Four is to find and mark ‘fans’, which usually appear as dark smudges on the Martian surface. These are temporary features and they tell you about the wind speed and direction on Mars as they were formed. They are created by CO2 geysers erupting through the surface as the temperature increases during Martian Spring. These geysers of rapidly sublimating material sweep along dust as they go, leaving behind a trail.

Classifying fans on Planet Four

The fans are just one feature that you’ll see. The image above shows some great ‘spiders’, with frost around their edges. There’s lots to see, and hopefully the audience of Stargazing Live will help us blast through the data really quickly.

Stargazing Live begins at 8pm on BBC2. If you can’t watch it live then why not hop onto Twitter and follow the #bbcstargazing hashtag? You’ll also find Planet Four and the Zooniverse on Twitter as well.

740,000 People – Part Two

Volunteers Poster

To end our 2012 advent calendar, we have the second of our 740,000 posters. We’d like to wish everyone a happy holiday – whatever you do at this time of year! We’ll be back in 2013 with more news, new projects and more science based on your work. The Universe is too big to explore without you.