
Who’s who in the Zoo – Patricia Smith

In this edition of Who’s who in the Zoo, meet Patricia, Community Manager of the Science Scribbler organisation.


Who: Patricia Smith, Community Manager

Location: The Rosalind Franklin Institute, Harwell Campus, Oxfordshire, UK

Zooniverse project: The Science Scribbler organisation

(photo credit: Ryan Cowan)

What is your research about?

As a community manager, I wear a lot of different hats! My formal background is materials science and biomaterials, but I’m now the ‘citizen science specialist’ in a lot of my day-to-day research. I work alongside imaging specialists, software engineers, and experts in a variety of biosciences to help them design interesting, effective, and worthwhile projects on the Zooniverse. Essentially, I make sure that the experts are asking the right questions, in the right way, for our volunteers to be able to understand and contribute most effectively to our research.

I also spend a lot of time supporting our Science Scribbler community and making sure our volunteers are the first to hear about any project updates or research outcomes. The rest of my time is spent working with teachers to support them in using citizen science in the classroom through our Virus Factory in Schools project, and dabbling in a little bit of my own research too.

How do Zooniverse volunteers contribute to your research?

Most of the Science Scribbler projects launched so far have focused on 3D biological imaging data. When we ask questions about a particular sub-cellular structure or disease, we usually have to go through a process called segmentation: essentially colouring in every pixel that we count as being part of a particular class or label. Automated segmentation methods are constantly improving, but most of the time they still require a lot of expert annotation to either train or finetune the segmentation model. Creating this annotation is a huge bottleneck in processing all the data we collect. As a consequence, we usually have to compromise in some way: looking at a smaller sample size or asking less complicated questions.

Where volunteers help us in our research is in providing the annotations we need to train or refine our segmentation models. Once we have segmentation models that are working well, we can start to ask the really interesting questions – like what differences can we see in the mitochondria of healthy or diseased placenta? And what does that mean for our understanding of that disease?

But using citizen science to train or finetune our models isn’t just about passing the workload from a researcher to the crowd – it’s so much more powerful than that. One thing I’m really interested in is how citizen science can impact the bias in our models. If one expert trains a model, it will ‘see’ what that one individual sees. But if a model is trained on thousands of eyes through citizen science, it has the potential to be less biased than the expert, and who knows what that will bring!

What’s a surprising or fun fact about your research field?

We collect a lot of data at the Rosalind Franklin Institute. Recently we celebrated reaching 1 petabyte of Franklin data with a petabyte party (yes, there was cake). A petabyte is one million gigabytes – a huge amount of data for anyone to analyse – which is why we know citizen science is so valuable in our research. But what astounds me is how biology is at a completely different level; you can store roughly 215 petabytes of data in just 1 gram of DNA. Mind: blown.

What first got you interested in research?

I’m very lucky that I was exposed to a lot of science and engineering from a very early age. I think I decided I’d be a biochemist when I was just 9 years old, but in the end materials science stole my heart! There’s something fundamentally rewarding about being able to look at my everyday environment and ask: “How does this work?”, “What is this made of?” and most importantly “Why????”

In my role I’ve learned a lot about the impact science capital can have on a child’s attitude towards science and STEM careers. It’s part of why I think science communication is so important, and why I chose to work in a position that allows me to share my love of science with so many people.

What’s something people might not expect about your job or daily routine?

We livestream citizen science on Twitch!

Outside of work, what do you enjoy doing?

I really enjoy hiking and skiing in the Alps, DnD, board games, and a good flat white. I also spent a decade dedicating half my time to rowing – when I started this role I was working part-time alongside training as a full-time athlete.

What are your favourite citizen science projects?

Too many to count! I’m always very nosey when a new project launches on the Zooniverse, so I try to submit at least a few classifications for each one. I really like using the Zooniverse app, so Gwitch Hunters comes to mind there. I also really enjoy the Etch A Cell projects, HMS NHS, and Monkey Health Explorer. The first project I contributed to was Civil War Bluejackets. Following the progress on the project over the last 3 years has been really easy thanks to their amazing blog and newsletters. They recently moved from full transcription (which I did a lot of) to correcting the automated transcriptions that were trained on our original work. It’s really cool to see the project progress in real time like that!

What guidance would you give to other researchers considering creating a citizen research project?

Getting a fresh pair of eyes on your data is really important in project design – sometimes you know the data too well and you’ll be blind to some really simple changes that will make your workflows much more straightforward. Remember to provide positive and negative examples – not just what you should do, but what you shouldn’t do as well. Finally, be ready to respond to your community in the early stages of the project. The first few weeks are really where you build out your FAQs and refine your field guide – especially if your volunteers find unusual examples in your dataset!

Is there anything else you would like to share with our readers?

I wanted to say a huge thank you to our Science Scribbler community! Since our first project launched in 2018, you have contributed over 4.4 million classifications to our projects. That’s the equivalent of 10 years of effort from a full-time employee!

Who’s who in the Zoo – Hillary Burgess

In this edition of Who’s who in the Zoo, meet Hillary, a member of our team who is involved in our work exploring the ethics of machine learning in public-engaged research.


Who: Hillary Burgess

Zooniverse project: Ethical Considerations for Machine Learning in Public-Engaged Research

What is your research about?

I am a longtime enthusiast of participatory science. This enthusiasm has led me to wear many different hats in this space – from project designer and lead, to volunteer, to researcher studying the theory and practice of public-engaged science. I’m currently supporting an effort to develop recommendations for running A.I.-engaged projects on the Zooniverse platform. As A.I., particularly machine learning, becomes more prevalent as a research tool and in other aspects of society, there is a mix of worry and excitement among the Zooniverse community. The recommendations will be responsive to the interests and concerns raised by Zooniverse stakeholders and will integrate best practices and learnings from the broader community. This involves engaging with experts in communications and ethical use of technology, Zooniverse leadership, and Zooniverse volunteers.

How do Zooniverse volunteers contribute to your research?

Zooniverse volunteers are the reason for Zooniverse. We want to hear from as many volunteers as possible, so we can move forward in a way that reflects the diverse experiences and perspectives of this community. In fact, this initiative was born out of concerns about the use of A.I. on the Zooniverse platform.

The funding Zooniverse received from the Kavli Foundation allows us to convene a series of four workshops to hear from a variety of stakeholders, including a few volunteers. But because the capacity for those workshops is small and not everyone wants to engage in a workshop format, we’re also sending out four short surveys for volunteers. Survey responses are feeding directly into our planning, and will be a key inspiration for the final recommendations for A.I.-engaged projects on the Zooniverse platform. We need input every step of the way. Volunteers are also invited to share their perspective on Talk.

We have had a phenomenal response to the first two surveys from over 1000 volunteers. Some of the questions are open-ended and I am fascinated and inspired by the diversity of opinion in these responses! Some people are really excited by the thought that they could contribute to machine learning, and a higher pace of progress toward research outcomes they care about. Others are deeply concerned about the potential for data quality issues and the environmental impacts associated with energy demand from running big models. Some express both, and all are valid and important to hear as we navigate this new frontier. As a relative newcomer to the Zooniverse community, reading the replies has given me many AHA! moments about what motivates people to participate in Zooniverse projects, and enormous appreciation for the passion and expertise among volunteers.

What’s a surprising or fun fact about your research field?

As a graduate student I worked with volunteers to study pollinator use of home gardens. After our training, one of the volunteers discovered a bumblebee in her garden that was thought to be extinct.

What first got you interested in research?

I have always been a curious person who enjoys discovering patterns and connections and diving deep into topics that interest me. Around the age of 10, my teachers nominated me to attend a regional “women in science” day. I was one of just two students who got to go from my school and hear from career scientists. I came home with so much excitement about what felt like the adventure of science.

What’s something people might not expect about your job or daily routine?

I work from home, and my two cats (Bubs and Little One) and dog (Mango) are constantly interrupting whatever I am doing with requests to play, eat, go to the bathroom, or sit on my lap.

Outside of work, what do you enjoy doing?

Outside of work I love spending time either at sea level on the coast (tidepooling, beach walking, etc.) or up high hiking in the alpine zone of the Cascade mountains. I love learning and trying new things, and dabble in a number of creative outlets from pottery and gardening to DIY house projects. Lately I have also gotten into weightlifting, and sometimes playing cooperative video games.

What are your favourite citizen science projects?

I first got hooked on Zooniverse through Snapshot Serengeti and AmazonCam Tambopata. Participating in the latter actually inspired a trip to Tambopata with my family in 2017. I also have strong ties to rigorous hands-on outdoor projects like the University of Washington’s Coastal Observation and Seabird Survey Team (COASST) and the U.S. National Oceanic and Atmospheric Administration’s Marine Debris Monitoring and Assessment Project (MDMAP).

What guidance would you give to other researchers considering creating a citizen research project?

Don’t assume that your best volunteer audience thinks like or is motivated by the same things as you. Design for your intended data use and commit to a return on volunteers’ investment. Get feedback early and often.

Who’s who in the Zoo – Mengyuan Li

In this edition of Who’s who in the Zoo, meet Mengyuan Li, who is part of the Node Code Breakers team.


Who: Mengyuan Li

Location: King’s College London, UK

Zooniverse project: Node Code Breakers

What is your research about?

My research involves integrative analyses of image and genomic profiling data to investigate metastatic development in lymph nodes.

How do Zooniverse volunteers contribute to your research?

Zooniverse volunteers generated an amazing number of high-quality segmentations, which we are now using to train models that will assist our pathologists in locating the immune features we’re interested in.

What’s a surprising or fun fact about your research field?

More and more researchers are taking notice of these immune features, especially germinal centres in the lymph nodes.

What first got you interested in research?

The good feeling I get when I solve or explain something through my own research.

What’s something people might not expect about your job or daily routine?

People never expect that there are researchers who don’t go to the lab, but instead sit in front of computers for the entire working day.

Outside of work, what do you enjoy doing?

Finding some interesting things to do and interesting places to go near to London at weekends. Buying beautiful dresses and doing research on makeup when I am on vacation. Video games, manga, cosplay and planning my next trip to Japan, when I have spare time!

What are your favourite citizen science projects?

HMS NHS: The Nautical Health Service

What guidance would you give to other researchers considering creating a citizen research project?

Try to think from the volunteers’ perspective: what will interest them? It is very helpful to discuss your project with non-experts to improve your project design and the wording for a more general audience.

Is there anything else you’d like to share with our readers?

Scientific research is fun: there are always some interesting shapes or patterns to be discovered!

Why you should use Docker in your research

Last month I gave a talk at the Wetton Workshop in Oxford. Unlike the other talks that week, mine wasn’t about astronomy. I was talking about Docker – a useful tool which has become popular among people who run web services. We use it for practically everything here, and I’m convinced many more researchers would find it useful if they gave it a try. That’s especially true in fields like astronomy, where a lot of people write their own code to process and analyse their data. If after reading this post you think you’d like to give Docker a try and you’d like some help getting started, just get in touch and I’ll be happy to help.

I’m going to give a brief outline of what Docker is and why it’s useful, but first let’s set the scene. You’re trying to run a script in Python that needs a particular version of NumPy. You install that version but it doesn’t seem to work. Or you already have a different version installed for another project and can’t change it. Or the version it needs is really old and isn’t available to download anymore. You spend hours installing different combinations of packages and eventually you get it working, but you’re not sure exactly what fixed it and you couldn’t repeat the same steps in the future if you wanted to exactly reproduce the environment you’re now working in. 
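To make the "couldn't repeat the same steps" problem concrete, here is a minimal Python sketch that records exactly which package versions are installed in the environment you finally got working. It only uses the standard library's importlib.metadata; the function name pinned_requirements is made up for this illustration, and saving the output to a file called requirements.txt is an assumption rather than anything this post prescribes.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with missing metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the pins so they can be saved, e.g. redirected into requirements.txt
    print("\n".join(pinned_requirements()))
```

A list like this tells you what you had installed, but it can't bring back a version that is no longer available to download, which is where containers come in.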

Many projects require an interconnected web of dependencies, so there are a lot of things that can go wrong when you’re trying to get everything set up. There are a few tools that can help with some of these problems. For Python you can use virtual environments or Anaconda. Some languages install dependencies in the project directory to avoid conflicts, which can cause its own problems. None of that helps when the right versions of packages are simply not available any more, though, and none of those options makes it easy to just download and run your code without a lot of tedious setup, especially if the person downloading it isn’t already familiar with Python, for example.

If people who download your code today can struggle to get it running, how will it be years from now when the version of NumPy you used isn’t around anymore and the current version is incompatible? That’s if there even is a current version after so many years. Maybe people won’t even be using Python then.

Luckily there is now a solution to all of this, and it’s called software containers. Software containers are a way of packaging applications into their own self-contained environment. Everything you need to run the application is bundled up with the application itself, and it is isolated from the rest of the operating system when it runs. You don’t need to install this and that, upgrade some other thing, check the phase of the moon, and hold your breath to get someone’s code running. You just run one command and whether the application was built with Python, Ruby, Java, or some other thing you’ve never heard of, it will run as expected. No setup required!

Docker is the most well-known way of running containers on your computer. There are other tools in the container world, such as Kubernetes (which manages containers across clusters of machines), but I’m only going to talk about Docker here.

Using containers could seriously improve the reproducibility of your research. If you bundle up your code and data in a Docker image, and publish that image alongside your papers, anyone in the world will be able to re-run your code and get the same results with almost no effort. That includes yourself a few years from now, when you don’t remember how your code works and half of its dependencies aren’t available to install any more.

There is a growing movement for researchers to publish not just their results, but also their raw data and the code they used to process it. Containers are the perfect mechanism for publishing both of those together. A search of arXiv shows there have only been 40 mentions of Docker in papers across all fields in the past year. For comparison there have been 474 papers which mention Python, many of which (possibly most, but I haven’t counted) are presenting scripts and modules created by the authors. That’s without even mentioning other programming languages. This is a missed opportunity, given how much easier it would be to run all this code if the authors provided Docker images. (Some of those authors might provide Docker images without mentioning it in the paper, but that number will be small.)

Docker itself is open source, and all the core file formats and designs are standardised by the Open Container Initiative. Besides Docker, OCI members include tech giants such as Amazon, Facebook, Microsoft, and Google, among many others. The technology is designed to be future-proof and it isn’t going away, and you won’t be locked into any one vendor’s products by using it. If you package your software in a Docker container you can be reasonably certain it will still run years, or decades, from now. You can install Docker for free by downloading the community edition.

So how might Docker fit into your workday? Your development cycle will probably look something like this: First you’ll probably outline an initial version of the code, and then write a Dockerfile containing the instructions for installing the dependencies and running the code. Then it’s basically the same as what you’d normally do. As you’re working on the code, you’d iterate by building an image and then running that image as a container to test it. (With more advanced usage you can often avoid building a new image every time you run it, by mounting the working directory into the container at runtime.) Once the code is ready you can make it available by publishing the Docker image.
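As a rough sketch of that cycle, the Python below writes out what a small Dockerfile for a Python project might contain, plus the docker build and docker run commands you would repeat while iterating. Everything specific here is an assumption for illustration: the script name analysis.py, the python:3.11-slim base image, the requirements.txt file, the /app directory, the image tag my-analysis:dev, and the helper function names. The Dockerfile instructions and Docker CLI flags themselves are standard ones.

```python
import shlex


def render_dockerfile(script="analysis.py", base_image="python:3.11-slim"):
    """Return Dockerfile text that installs pinned dependencies and runs one script."""
    lines = [
        f"FROM {base_image}",            # start from a ready-built Python image
        "WORKDIR /app",                  # hypothetical working directory
        "COPY requirements.txt .",       # assumed list of pinned dependencies
        "RUN pip install --no-cache-dir -r requirements.txt",
        "COPY . .",                      # copy the rest of the project
        f'CMD ["python", "{script}"]',   # what runs when the container starts
    ]
    return "\n".join(lines) + "\n"


def build_command(tag):
    """Command that builds an image from the Dockerfile in the current directory."""
    return ["docker", "build", "-t", tag, "."]


def run_command(tag, mount_dir=None):
    """Command that runs the image; optionally mount a host directory over /app."""
    cmd = ["docker", "run", "--rm"]
    if mount_dir is not None:
        cmd += ["-v", f"{mount_dir}:/app"]
    return cmd + [tag]


if __name__ == "__main__":
    print(render_dockerfile())
    print(shlex.join(build_command("my-analysis:dev")))
    # Mounting the working directory avoids rebuilding after every code change
    print(shlex.join(run_command("my-analysis:dev", mount_dir="/home/me/project")))
```

The sketch only prints the commands rather than running them, so you can check them before trying them against a real Docker installation.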

There are three approaches to publishing the image: push the image to the Docker Hub or another Docker registry, publish the Dockerfile along with your code, or export the image as a tar file and upload that somewhere. Obviously these aren’t mutually exclusive. You should do at least the first two, and it’s probably also wise to publish the tar file wherever you’d normally publish your data.
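The registry and tar file routes might look something like the following, again as a hedged Python sketch that only assembles the commands. The repository name myuser/my-analysis, the version 1.0, and the tar file name are all hypothetical; docker tag, docker push, and docker save -o are the standard Docker CLI commands for naming, uploading, and exporting an image.

```python
import shlex


def publish_commands(local_tag, hub_repo, version, tar_path):
    """Return the commands to name, push, and export an image, in order."""
    remote = f"{hub_repo}:{version}"
    return [
        ["docker", "tag", local_tag, remote],        # give the image its public name
        ["docker", "push", remote],                  # upload it to the registry
        ["docker", "save", "-o", tar_path, remote],  # export it as a tar file
    ]


if __name__ == "__main__":
    for cmd in publish_commands(
        "my-analysis:dev", "myuser/my-analysis", "1.0", "my-analysis-1.0.tar"
    ):
        print(shlex.join(cmd))
```

Someone who downloads the tar file can restore the image with docker load -i followed by the file name, without needing access to the registry.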

 

The Docker Hub is a free registry for images, so it’s a good place to upload your images so that other Docker users can find them. It’s also where you’ll find a wide selection of ready-built Docker images, both created by the Docker project themselves and created by other users. We at the Zooniverse publish all of the Docker images we use for our own work on the Docker Hub, and it’s an important part of how we manage our web services infrastructure. There are images for many major programming languages and operating system environments.

There are also a few packages which will allow you to run containers in high performance computing environments. Two popular ones are Singularity and Shifter. These will allow you to develop locally using Docker, and then convert your Docker image to run on your HPC cluster. That means the environment it runs in on the cluster will be identical to your development environment, so you won’t run into any surprises when it’s time to run it. Talk to your institution’s IT/HPC people to find out what options are available to you.
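For Singularity, the conversion step is typically a single build command that points at a Docker image in a registry. The sketch below assembles that command in Python; the image name and output file name are hypothetical, and your cluster may use a different tool or command name (newer installations ship the same tool as Apptainer), so treat this as an assumption to check with your HPC support team.

```python
import shlex


def singularity_build_command(docker_ref, sif_path):
    """Command that converts a registry-hosted Docker image into a Singularity image file."""
    return ["singularity", "build", sif_path, f"docker://{docker_ref}"]


if __name__ == "__main__":
    cmd = singularity_build_command("myuser/my-analysis:1.0", "my-analysis.sif")
    print(shlex.join(cmd))
```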

Hopefully I’ve made the case for using Docker (or containers in general) for your research. Check out the Docker getting started guide to find out more, and as I said at the beginning, if you’re thinking of using Docker in your research and you want a hand getting started, feel free to get in touch with me and I’ll be happy to help you. 

ZooCon Portsmouth this weekend – remote participation invited!

We’re getting excited in Portsmouth to be welcoming some Zooites to the first ever “ZooCon Portsmouth”, which is happening this Saturday 13th September 2014 (An updated schedule is available on the Eventbrite page for the event).

The theme of this event is a Wiki-a-thon for Citizen Science – we have scheduled a working afternoon to improve the coverage of citizen science on Wikipedia. Mike Peel, Expert Wikimedian and astronomer from the University of Manchester, will be joining us to lead this part of the event and get us all up to speed with how editing works.

We invite remote participation in the wiki-a-thon via this discussion thread on Galaxy Zoo Talk, or on Twitter with the hashtag #ZooConPort, and we also plan to livestream the morning talks via Google+.


In-person attendees will have a treat in the afternoon – we’re all excited to have Chris Lintott narrate planetarium shows in the Portsmouth Inflatable Astrodome. And we plan to end the day with fish and chips at a pub by the sea. Keep your fingers crossed for nice weather.

Welcome to the Worm Watch Lab

Today we launch a new Zooniverse project in association with the Medical Research Council (MRC) and the Medical Research Foundation: Worm Watch Lab.

We need the public’s help in observing the behaviour of tiny nematode worms. When you classify on wormwatchlab.org you’re shown a video of a worm wriggling around. The aim of the game is to watch and wait for the worm to lay eggs, and to hit the ‘z’ key when they do. It’s very simple and strangely addictive. By watching these worms lay eggs, you’re helping to collect valuable data about genetics that will assist medical research.


The MRC have built tracking microscopes to record these videos of crawling worms. A USB microscope is mounted on a motorised stage connected to a computer. When the worm moves, the computer analyses the changing image and commands the stage to move to re-centre the worm in the field of view. Because the trackers work without supervision, they can run eight of them in parallel to collect a lot of video! It’s these movies that we need the public to help classify.
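The re-centring idea can be illustrated with a small Python sketch. This is not the MRC's tracking software: it assumes the worm shows up as pixels darker than a brightness threshold in a greyscale frame, finds the centroid of those pixels, and reports how far that centroid is from the middle of the frame. The function name, the threshold value, and the list-of-lists frame format are all made up for this example.

```python
def recentre_offset(frame, threshold=128):
    """Return (dx, dy) in pixels from the frame centre to the worm centroid.

    frame is a list of rows of greyscale values; pixels below the threshold
    are assumed to belong to the worm. Returns None if no such pixels exist.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value < threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    centre_x = (len(frame[0]) - 1) / 2
    centre_y = (len(frame) - 1) / 2
    return (sum_x / count - centre_x, sum_y / count - centre_y)


if __name__ == "__main__":
    # 5x5 bright frame with one dark "worm" pixel at the right-hand edge
    frame = [[255] * 5 for _ in range(5)]
    frame[2][4] = 0
    print(recentre_offset(frame))  # (2.0, 0.0)
```

In a tracker along these lines, the stage would be moved by the equivalent of that offset so the worm ends up back in the middle of the field of view.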

By watching movies of the nematode worms, we can understand how the brain works and how genes affect behaviour. The idea is that if a gene is involved in a visible behaviour, then mutations that break that gene might lead to detectable behavioural changes. The type of change gives us a hint about what the affected gene might be doing. Although it is small and has far fewer cells than we do, the worm used in these studies (called C. elegans) has almost as many genes as we do! We share a common ancestor with these worms, so many of their genes are closely related to human genes. This presents us with the opportunity to study the function of genes that are important for human brain function in an animal that is easier to handle, great for microscopy and genetics, and has a generation time of only a few days. It’s all quite amazing!

To get started visit www.wormwatchlab.org and follow the tutorial. You can also find Worm Watch Lab on Facebook and on Twitter.

Got An Idea for a Zooniverse Project? Propose One

For more than a year, we’ve been openly accepting proposals for new Zooniverse projects and this has brought to life projects such as Seafloor Explorer, Snapshot Serengeti, Notes from Nature and Space Warps.

Yesterday, five Zooniverse projects were featured in The Biologist’s 10 Great Citizen Science Projects – several of them were ideas proposed by researchers we had never met before they came to us and said ‘hey, I have a cool idea for a project’. We’ve also recently seen articles about how the Zooniverse might be able to help in a crisis and how we provide an excellent avenue for proactive procrastination. Citizen science projects are wide and varied and lots of researchers have great ideas.

So this is a good time to remind everyone that we want to hear from researchers with ideas for Zooniverse projects. If that’s you: propose a project! We have funding from the Alfred P. Sloan Foundation to build your great ideas and work with you to further science. We also have an incredibly talented team of designers, developers, educators and researchers who want to make your idea into an awesome new Zooniverse project.

If you want to know more about this, you can get in touch with any of the team, via our general email address, or on Twitter @the_zooniverse. We’re currently working on projects that were proposed earlier this year and we’ll be announcing them soon. Maybe yours will be next?