Below is the first in a series of guest blog posts from researchers working on one of our recently launched biomedical projects, Etch A Cell.
Read on to let Dr Martin Jones tell you about the work they’re doing to further understanding of the universe inside our cells!
Having trained as a physicist, with many friends working in astronomy, I’ve been aware of Galaxy Zoo and the Zooniverse from the very early days. My early research career was in quantum mechanics, unfortunately not an area where people’s intuitions are much use! However, since moving into biology labs, now at the Francis Crick Institute in London, I have worked on various aspects of microscopy – a much more visual enterprise and one where human analysis is still the gold standard. This is particularly true in electron microscopy, where the busy nature of the images means that many regions inside a cell look very similar. To make sense of the images, a person can draw on a whole range of extra context and previous knowledge in a way that computers, for the most part, simply cannot. Relying on people, however, makes analysis a slow and labour-intensive process. As if this wasn’t already a hard enough problem, in recent years it has been compounded by new technologies that mean the microscopes now capture images around 100 times faster than before.
Focused ion beam scanning electron microscope
Ten years ago it was more or less possible to manually analyse the images at the same rate as they were acquired, keeping the in-tray and out-tray nicely balanced. Now, however, that’s not the case. To illustrate that, here’s an example of a slice through a group of cancer cells, known as HeLa cells:
We capture an image like this and then remove a very thin layer – sometimes as thin as 5 nanometres (one nanometre is a billionth of a metre) – and then repeat… a lot! Building up enormous stacks of these images can help us understand the 3D nature of the cells and the structures inside them. For a sense of scale, this whole image is about the width of a human hair, around 80 millionths of a metre.
Zooming in to one of the cells, you can see many different structures, all of which are of interest to study in biomedical research. For this project, however, we’re just focusing on the nucleus for now. This is the large mostly empty region in the middle, where the DNA – the instruction set for building the whole body – is contained.
By manually drawing lines around the nucleus on each slice, we can build up a 3D model that allows us to make comparisons between cells, for example to understand whether a treatment for a disease can stop its progression by disrupting the cells’ ability to pass on their genetic information.
Animated gif of 3D model of a nucleus
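To give a flavour of how a stack of per-slice outlines turns into a 3D measurement, here is a minimal sketch in Python. It is not our actual pipeline: the function names, the rectangular outlines and the slice spacing are all made up for illustration. It assumes each outline is a closed polygon of (x, y) points and approximates the volume as the sum of the outlined areas multiplied by the slice thickness.

```python
def polygon_area(points):
    """Area enclosed by a closed outline of (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the outline
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def stack_volume(outlines, slice_thickness):
    """Approximate volume: sum of per-slice areas times the slice thickness."""
    return sum(polygon_area(outline) for outline in outlines) * slice_thickness


# Hypothetical data: the same 4 x 3 rectangle outlined on 10 consecutive slices,
# in arbitrary length units, with slices 0.5 units apart.
outlines = [[(0, 0), (4, 0), (4, 3), (0, 3)] for _ in range(10)]
volume = stack_volume(outlines, slice_thickness=0.5)
print(volume)  # 60.0
```

Real nuclei are of course not rectangles, but the same sum works for any outline, which is why every carefully drawn slice counts.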
However, images are now being generated so rapidly that the in-tray is filling too quickly for the standard “single expert” method – one sample can produce up to a terabyte of data, made up of more than a thousand 64-megapixel images captured overnight. We need new tricks!
Why citizen science?
With all of the advances in software that are becoming available, you might think that automating image analysis of this kind would be quite straightforward for a computer. After all, people can do it relatively easily. Even pigeons can be trained in certain image analysis tasks (http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141357)! However, there is a long history of underestimating just how hard it is to automate image analysis with a computer. Back in the very early days of artificial intelligence, in 1966 at MIT, Marvin Minsky (who also invented the confocal microscope) and his colleague Seymour Papert set the “summer vision project”, which they saw as a simple problem to keep their undergraduate students busy over the holidays. Many decades later we’ve discovered it’s not that easy!
Our project, Etch A Cell, is designed to allow citizen scientists to draw segmentations directly onto our images in the Zooniverse web interface. The first task we have set is to mark the nuclear envelope that separates the nucleus from the rest of the cell – a vital structure where defects can cause serious problems. These segmentations are extremely useful in their own right for helping us understand the structures, but citizen science offers something beyond the already lofty goal of matching the output of an expert. By allowing several people to annotate each image, we can see how the lines vary from user to user. This variability gives insight into the certainty that a given pixel or region belongs to a particular object, information that simply isn’t available from a single line drawn by one person. Disagreement between experts is not unheard of, unfortunately!
The images below show preliminary results with the expert analysis on the left and a combination of 5 citizen scientists’ segmentations on the right.
Example of expert vs. citizen scientist annotation
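One simple way to picture how several people’s annotations become a measure of certainty is sketched below in Python. This is an illustration rather than the method behind the images above: it assumes each person’s outline has already been filled in to give a grid of 1s (inside the nucleus) and 0s (outside), and the names `agreement_map` and `consensus_mask`, the tiny example grids and the 50% threshold are all hypothetical.

```python
def agreement_map(masks):
    """Fraction of annotators who marked each pixel as inside the object."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


def consensus_mask(masks, threshold=0.5):
    """Keep a pixel only if more than `threshold` of the annotators agree on it."""
    return [
        [1 if fraction > threshold else 0 for fraction in row]
        for row in agreement_map(masks)
    ]


# Hypothetical data: three annotators, each labelling the same 1 x 4 strip of pixels.
masks = [
    [[1, 1, 0, 0]],
    [[1, 1, 1, 0]],
    [[1, 0, 0, 0]],
]
agreement = agreement_map(masks)   # 1.0 = everyone agrees, values in between = uncertain
consensus = consensus_mask(masks)  # majority vote
print(consensus)  # [[1, 1, 0, 0]]
```

Pixels where the agreement is close to 0 or 1 are the easy ones; the values in between mark the places where the edge of the nucleus is genuinely hard to call.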
In fact, we can go even further to maximise the value of our citizen scientists’ work. The field of machine learning, in particular deep learning, has burst onto the scene in several sectors in recent years, revolutionising many computational tasks. This new generation of image analysis techniques is much more closely aligned with how animal vision works. The catch, however, is that the “learning” part of machine learning often requires enormous amounts of time and resources (remember you’ve had a lifetime to train your brain!). To train such a system, you need a huge supply of so-called “ground truth” data, i.e. something that an expert has pre-analysed and can provide the correct answer against which the computer’s attempts are compared. Picture it as the kind of supervised learning that you did at school: perhaps working through several old exam papers in preparation for your finals. If the computer is wrong, you tweak the setup a bit and try again. By presenting thousands or even millions of images and ensuring your computer makes the same decision as the expert, you can become increasingly confident that it will make the correct decision when it sees a new piece of data. Using the power of citizen science will allow us to collect the huge amounts of data that we need to train these deep learning systems, something that would be impossible by virtually any other means.
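The “if the computer is wrong, tweak and try again” loop can be shown in miniature. The Python sketch below trains a perceptron, a decades-old and vastly simpler relative of the deep learning systems described above, so treat it as a cartoon rather than what we actually run. The two input numbers per example and their “nucleus” / “not nucleus” labels are invented stand-ins for expert ground truth.

```python
def predict(weights, bias, features):
    """Return 1 ("nucleus") if the weighted score is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, epochs=20, learning_rate=0.1):
    """Perceptron rule: adjust the weights only when a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:  # wrong answer: tweak the setup a bit
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * error
    return weights, bias


# Hypothetical ground truth: two made-up measurements per image patch,
# labelled 1 ("nucleus") or 0 ("not nucleus") by an imaginary expert.
examples = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.1, 0.2), 0),
    ((0.2, 0.1), 0),
]
weights, bias = train(examples)

# After training, the model agrees with the "expert" on every example it was shown.
print(all(predict(weights, bias, f) == label for f, label in examples))  # True
```

Four examples are enough for a toy like this; a deep learning system looking at real electron microscopy images needs vastly more, which is exactly where citizen science comes in.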
We are now busily capturing images that we plan to upload to Etch A Cell to allow us to analyse data from a range of experiments. Differences in cell type, sub-cellular organelle, microscope, sample preparation and other factors mean the images can look different across experiments, so analysing cells from a range of different conditions will allow us to build an atlas of information about sub-cellular structure. The results from Etch A Cell will mean that whenever new data arrives, we can quickly extract information that will help us work towards treatments and cures for many different diseases.