What’s interesting? Or rather, what’s most interesting? This most fundamental of questions isn’t one we often directly address when thinking about scientific data, when we’re usually concerned with classification or deriving some global property of the data. But interestingness is important – in my own work with large surveys of the Universe, how interesting a new object is – an exploding star, or a strange galaxy – may determine whether we point telescopes at it, or whether it will languish, unobserved, in a catalogue for decades.

Hanny’s Voorwerp – a light echo lit up by activity in a now-faded quasar – was found early in the Galaxy Zoo project, providing a timely reminder of the importance of finding the unusual things in large datasets!

We’ve learnt how important serendipitous discoveries can be from previous astronomical Zooniverse projects, ranging from Galaxy Zoo’s Green Peas to Boyajian’s Star, ‘the most interesting star in the Milky Way’ (even if it turns out not to host an alien megastructure). With new projects such as the Vera Rubin Observatory’s LSST survey nearly ready to provide an unprecedented flood of information, astronomers around the world are honing their techniques for getting the most out of such large datasets – but the problem of preparing for surprise has been neglected.

That neglect is in part because it turns out to be hard to get funding for a search for the unusual, where by definition I can’t say in advance what it is that we’ll find. I’m therefore very pleased the team have received a new grant from the Alfred P. Sloan Foundation to build on the Zooniverse to provide tools designed for serendipity. My hunch is that, as we’ve learnt from so many Zooniverse projects before, a combination of human and machine intelligence is needed for the task; while modern machine learning is good at finding the unusual, working out which unusual things are actually interesting is best left to human intuition and intelligence.

If we think about being ‘unusual’ and being ‘interesting’ as different axes, an interesting space on which to plot our data appears. Modern machine learning is best suited to finding the unusual – but most unusual things are boring artefacts.

The project won’t stop at astronomy. Working with Prof Kate Jones’s team at UCL and elsewhere, we’ll look for surprises in audio recordings from ecological monitoring projects, testing whether identifying rare events – such as gunshots – might contribute to assessments of the health of an ecosystem. (You might remember Kate – she ran the Bat Detective project on the Zooniverse.) And with the Science Scribbler team (particularly Michele Darrow and Mark Basham) based at the Rosalind Franklin Institute, we’ll apply these techniques to the latest high-resolution imaging to spot structures in cells.

In doing all of this we can build on our galaxy-classifying Zoobot, the work on glitch identification from Gravity Spy (recent results!), friends and collaborators like Kate Storey-Fisher and Michelle Lochner, whose Astronomaly concept seems right up our street, and of course the insights and efforts of the two-million-strong Zooniverse army. Who knows what we might find together?


PS If you have a PhD in a relevant scientific discipline, or in computer science, then we’re advertising a postdoc – see here for details, or get in touch via Twitter or email to discuss.
