This post, from Chris Lintott, is one of three marking the end of this phase of the Galaxy Zoo: Supernova project. You can hear from project lead Mark Sullivan here and machine learning expert and statistician Joey Richards here.
Today’s a bittersweet day for us, as the Galaxy Zoo: Supernova project moves off into (perhaps temporary) retirement. You can read about the reasons for this change over on the Galaxy Zoo blog, but the short answer is that the team have used the thousands of classifications submitted by volunteers to train a machine that can now outperform the humans. Time to wheel out this graphic again, last posted when we started looking at teaching machines with Galaxy Zoo data.
That’s all very well, but what of those of us who enjoyed the thrill of hunting for supernovae? I think there are two reasons to believe that the supernova project, or something very like it, will be back someday soon. Firstly, the machine learning solution is now very good at finding supernovae, but only in images from one search, the Palomar Transient Factory (PTF). I suspect other surveys, with their own quirks, may each require a training set as large as that used for PTF. If so, we’ll see a pattern developing in which the early months or years of a survey require volunteer classification, after which volunteers can relax until the next challenge comes along. We’re hoping to test this idea sometime soon.
The second way in which I think human classification will return is more subtle: we need to make friends with, and collaborate with, the robots themselves. At the minute, for mostly practical reasons, we treat this as a choice between humans and machines, but the Zooniverse team and more than a few friends have started building a more sophisticated system which combines the two approaches.
One piece of that system is already in place, and owes a lot to the supernova project. Edwin Simpson and colleagues from Oxford’s Robotics Research Group and the Zooniverse have built a mathematical model that’s capable of combining results from many different classifiers, measuring their performance and deciding who to listen to, and when. It was developed and tested using the supernova project data, and has also been running live on the project, keeping track of what’s happening. This should lead to an improvement in classification accuracy, but there’s more. The same sort of method could be used to combine human and machine classification, and we’re beginning to work on a system that can make decisions about when it’s worth asking humans for help. That gives us the best of both worlds: we’ll get to take advantage of machines for routine tasks, but allow them to call for our help when they get stuck. The result should be a more interesting project to participate in, a greater scientific return and the certainty that we’re not wasting your time. That all sounds pretty good to me.
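For the curious, here is a rough idea of what "measuring performance and deciding who to listen to" can look like in code. This is only a toy sketch and not the model Edwin and colleagues built: it assumes a simple yes/no question ("is this a supernova?"), a set of past candidates whose true answer is known, and classifiers that make independent mistakes. The function names, the `history` format and the 0.1 and 0.9 thresholds are all made up for illustration.

```python
import math


def estimate_rates(history, smoothing=1.0):
    """Estimate how often each classifier is right, separately for real
    supernovae (truth=True) and for non-supernovae (truth=False).

    history: iterable of (classifier_name, vote, truth), vote/truth are bools.
    smoothing must be > 0 so that no estimated rate is exactly 0 or 1.
    """
    counts = {}
    for name, vote, truth in history:
        per_class = counts.setdefault(name, {True: [0, 0], False: [0, 0]})
        per_class[truth][0] += int(vote == truth)  # correct votes
        per_class[truth][1] += 1                   # all votes
    return {
        name: {
            truth: (c[truth][0] + smoothing) / (c[truth][1] + 2 * smoothing)
            for truth in (True, False)
        }
        for name, c in counts.items()
    }


def combine(votes, rates, prior=0.5):
    """Combine votes on one candidate into P(real supernova).

    votes: dict mapping classifier_name -> bool.
    Each vote shifts the log-odds by an amount that depends on that
    classifier's track record, so reliable classifiers count for more.
    """
    log_odds = math.log(prior / (1 - prior))
    for name, vote in votes.items():
        hit_rate = rates[name][True]      # P(says yes | real)
        reject_rate = rates[name][False]  # P(says no | not real)
        if vote:
            log_odds += math.log(hit_rate / (1 - reject_rate))
        else:
            log_odds += math.log((1 - hit_rate) / reject_rate)
    return 1 / (1 + math.exp(-log_odds))


def needs_human(probability, low=0.1, high=0.9):
    """Flag candidates where the combined answer is still uncertain."""
    return low < probability < high
```

In this sketch a classifier that is right half the time in both classes adds nothing to the log-odds, so its votes are effectively ignored, while a classifier with a strong track record can settle a candidate almost on its own. The same `combine` function does not care whether a vote came from a volunteer or a machine, which is the sense in which the two can be mixed, and `needs_human` stands in for the decision about when it is worth asking people for help.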