Tag Archives: science

Bursts from Space

This is a guest post by summer intern Anastasia Unitt.

The study of celestial objects creates a huge amount of data – so much that astronomers struggle to make use of it all. The solution? Citizen scientists, who lend their brainpower to analyse and catalogue vast swathes of information. Alex Andersson, a DPhil student at the University of Oxford, has been applying this approach to his field, radio astronomy, through the Zooniverse. I met with him via Zoom to learn about his project detecting rare, potentially explosive events happening far out in space.

Alex’s research uses data collected by a radio telescope located thousands of miles away in South Africa, named MeerKAT. The enormous dishes of the telescope detect radio waves, captured from patches of sky about twice the size of the full Moon. This data is then converted into images, which show the source of the waves, and into light curves, a kind of scatter plot which depicts how the brightness of these objects has changed over time. This information was initially collected for a different project, so Alex is exploiting the remaining information in the background – or, as he calls it, “squeezing science out of the rest of the picture.” The goal: to identify transient sources in the images – things that are changing, disappearing and appearing.
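To make the idea concrete, here is a minimal, hypothetical sketch of how a transient might be flagged in a light curve – a source whose brightness deviates sharply from its typical level. This is purely illustrative (the function name, threshold, and data are invented, not the project’s actual pipeline):

```python
import statistics

def flag_transient(fluxes, threshold=5.0):
    """Flag a light curve as a possible transient if any point deviates
    from the median by more than `threshold` times the median absolute
    deviation (a robust estimate of the typical scatter)."""
    median = statistics.median(fluxes)
    mad = statistics.median(abs(f - median) for f in fluxes)
    if mad == 0:
        return False  # perfectly flat curve: nothing to flag
    return any(abs(f - median) / mad > threshold for f in fluxes)

# A steady source versus one with a single bright flare.
steady = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
flaring = [1.0, 1.1, 0.9, 9.0, 1.05, 0.95]
```

Real radio light curves are far noisier than this, which is exactly why human eyes (and, later, machine learning) are so valuable for the task.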

Historically, relatively few of these transients have been identified, but the many extra pairs of eyes contributed by citizen scientists have changed the game. The volume of data analysed can be much larger, the process far faster. Alex is clearly both proud of and extremely grateful to his flock of amateur astronomers. “My scientists are able to find things that using traditional methods we just wouldn’t have been able to find, [things] we would have missed.” The project is ongoing, but his favourite finding so far took the form of a “blip” his citizen scientists noticed in just two of the images (out of thousands). Alex explains: “We followed it up and it turns out it’s this star that’s 10 times further away than our nearest stellar neighbor, and it’s flaring. No one’s ever seen it with a radio telescope before.” His excitement is obvious, and justified. This is just one of many findings that may be previously unidentified stars, or even other kinds of celestial objects such as black holes. There’s still so much to find out; the possibilities are almost endless.

A range of light curve shapes spotted by Zooniverse citizen scientists performing classifications for Bursts from Space: MeerKAT

Unfortunately, research comes with its fair share of frustrating moments along with the successes. For Alex, it’s the process of preparing the data for analysis which has proved the most irksome. “Sometimes there’s bits in the process that take a long time, particularly messing with code. There can be so much effort that went into this one little bit, that even if you did put it in a paper is only one sentence.” These behind-the-scenes struggles are essential to make the data presentable to the citizen scientists in the first place, as well as to deal with the thousands of responses which come out the other side. He assures me it’s all worth it in the end.

As to where this research is headed next, Alex says the prospects are very exciting. Now they have a large bank of images that have been analysed by the citizen scientists, he can apply this information to train machine learning algorithms to perform similar detection of interesting transient sources. This next step will allow him to see “how we can harness these new techniques to apply them to radio astronomy – which again, is a completely novel thing.”

Alex is clearly looking forward to these further leaps into the unknown. “The PhD has been a real journey into lots of things that I don’t know, which is exciting. That’s really fun in and of itself.” However, when I ask him what his favourite part of this research has been so far, it isn’t the science. It’s the citizen scientists. He interacts with them directly through chat boards on the Zooniverse site, discussing findings and answering questions. Alex describes their enthusiasm as infectious – “We’re all excited about this unknown frontier together, and that has been really, really lovely.” He’s already busy preparing more data for the volunteers to examine, and who knows what they might find; they still have plenty of sky to explore.

28 New Planet Candidates Discovered on Exoplanet Explorers

The team behind the Exoplanet Explorers project has just published a Research Note of the American Astronomical Society announcing the discovery of 28 new exoplanet candidates uncovered by Zooniverse volunteers taking part in the project.

Nine of these candidates are most likely rocky planets, with the rest being gaseous. The sizes of these potential exoplanets range from two thirds the size of Earth to twice the size of Neptune!

This figure shows the transit dips for all 28 exoplanet candidates. Zink et al., 2019

You can find out more about these exoplanet candidates in the actual research note at https://iopscience.iop.org/article/10.3847/2515-5172/ab0a02, and in this blog post by the Exoplanet Explorers research team http://www.jonzink.com/blogEE.html.

Finally, both the Exoplanet Explorers and Zooniverse teams would like to extend their deep gratitude to all the volunteers who took part in the project and made these amazing discoveries possible.

Exoplanet Explorers Discoveries – A Small Planet in the Habitable Zone

This post is by Adina Feinstein. Adina is a graduate student at the University of Chicago. Her work focuses on detecting and characterizing exoplanets. Adina became involved with the Exoplanet Explorers project through her mentor, Joshua Schlieder, at NASA Goddard through their summer research program.

Let me tell you about the newly discovered system – K2-288 – uncovered by volunteers on Exoplanet Explorers.

K2-288 has two low-mass M dwarf stars: a primary (K2-288A) which is roughly half the size of the Sun and a secondary (K2-288B) which is roughly one-third the size of the Sun. The capital lettering denotes a star in the planet-naming world. Already this system is shaping up to be pretty cool. The one planet in this system, K2-288Bb, orbits the smaller, secondary star. K2-288Bb orbits on a 31.3 day period – short compared to Earth’s year, but this period places the planet in the habitable zone of its host star. The habitable zone is defined as the region where liquid water could exist on the planet’s surface. K2-288Bb has an equilibrium temperature of -47°C, colder than the equilibrium temperature of Earth. It is approximately 1.9 times the radius of Earth, which places it in a region of planet radius space where we believe planets transition to volatile-rich sub-Neptunes, rather than being potentially habitable super-Earths. Planets of this size are rare, with only about a handful known to date.

Artist’s rendering of the K2-288 system.

The story of the discovery of this system is an interesting one. When two of the reaction wheels on the Kepler spacecraft failed, the mission team re-oriented the spacecraft to allow observations to continue. The re-orientation caused slight variations in the shape of the telescope and the temperature of the instruments on board. As a consequence, the beginning of each observing campaign suffered extreme systematic errors, so initially, when searching for exoplanet transits, we “threw out” or ignored the first days of observing. Then, when we were searching the data by eye for new planet candidates, we came across this system and saw only two transits. In order for follow-up observations to proceed, we need a minimum of three transits, so we put this system on the back-burner. The light curve (the amount of light we see from a star over time) with the transits is shown below.

Later, we learned how to model and correct for the systematic errors at the beginning of each observing run and re-processed all of the data. Instead of searching it all by eye again, as we had done initially, we outsourced it to Exoplanet Explorers and citizen scientists, who identified this system with three transit signals. The volunteers started a discussion thread about this planet because, given the initial stellar parameters, the planet appeared to be around the same size and temperature as Earth. This caught our attention. As it turns out, there was an additional transit at the beginning of the observing run that we missed when we threw out this data! Makennah Bristow, a fellow intern of mine at NASA Goddard, identified the system again independently. With three transits and a relatively long orbital period of 31.3 days, we pushed to begin the observational follow-up needed to confirm this planet was real.

First, we obtained spectra, or a unique chemical fingerprint of the star. This allowed us to place better constraints on the parameters of the star, such as mass, radius, temperature, and brightness. While obtaining spectra from the Keck Observatory, we noticed a potential companion star. We conducted adaptive optics observations to see if the companion was bound to the star or a background source. Most stars in the Milky Way are born in pairs, so it was not too surprising that this system was no different. After identifying a fainter companion, we made extra sure the signal was due to a real planet and not the companion; we convinced ourselves this was the case.

Finally, we had to determine which star the planet was orbiting. We obtained an additional transit using the Spitzer spacecraft. Using both the Kepler and Spitzer transits, we derived planet parameters for two cases: the planet orbiting the primary, and the planet orbiting the secondary. The planet radius derived from both light curves was most consistent when the host star was the secondary. Additionally, we derived the stellar density from the observed planet transit, and this better matched the smaller secondary star. To round it all off, we calculated the probability of the signal being a false positive (i.e. not a planet signal) if the planet orbits the secondary, and it resulted in a false positive probability of roughly 10^-9, which indicates it most likely is a real signal.

The role of citizen scientists in this discovery was critical, which is why some of the key Zooniverse volunteers are included as co-authors on this publication. K2-288 was observed in K2 Campaign 4, which ran from April to September back in 2015. We scientists initially missed this system and it’s likely that even though we learned how to better model and remove spacecraft systematics, it would have taken years for us to go back into older data and find this system. Citizen scientists have shown us that even though there is so much new data coming out, especially with the launch of the Transiting Exoplanet Survey Satellite, the older data is still a treasure trove of new discoveries. Thank you to all of the Exoplanet volunteers who made this discovery possible and continue your great work!

The paper written by the team is available here. It should be open to all very shortly.

Zooniverse Workflow Bug

We recently uncovered a couple of bugs in the Zooniverse code which meant that the wrong question text may have been shown to some volunteers on Zooniverse projects while they were classifying. They were caught and a fix was released the same day on 29th November 2018.

The bugs only affected some projects with multiple live workflows from 6th-12th and 20th-29th November.

One of the bugs was difficult to recreate and relied on a complex timing of events. We therefore think it was rare and probably did not affect a significant fraction of classifications, so it hopefully will not have caused major issues with the general consensus on the data. However, it is not possible for us to say exactly which classifications were affected in the timeframe the bug was active.

We have apologised to the relevant science teams for the issues this may cause with their data analysis, but we would also like to extend our apologies to all volunteers who took part in these projects during the time the bugs were in effect. It is of the utmost importance to us that no effort is wasted on our projects, and when something like this happens it is taken very seriously by the Zooniverse team. Since discovering these bugs we have worked tirelessly to fix them, and we have taken steps to ensure nothing like this happens in the future.

We hope that you accept our most sincere apologies and continue the amazing work you do on the Zooniverse. If you have any questions please don’t hesitate to contact us at contact@zooniverse.org.

Sincerely,

The Zooniverse Team

Zooniverse Data Aggregation

Hi all, I am Coleman Krawczyk and for the past year I have been working on tools to help Zooniverse research teams work with their data exports.  The current version of the code (v1.3.0) supports data aggregation for nearly all the project builder task types, and support will be added for the remaining task types in the coming months.

What does this code do?

This code provides tools to allow research teams to process and aggregate classifications made on their project, or in other words, this code calculates the consensus answer for a given subject based on the volunteer classifications.  
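As an illustration of what “consensus” means here, a minimal vote-count reduction for a question task might look like the following. This is a simplified stand-in for what the aggregation code does, with hypothetical data – the real tools handle many more task types and edge cases:

```python
from collections import Counter

def reduce_votes(classifications):
    """Tally the answers volunteers gave for one subject and return
    the consensus (most common) answer plus the full vote counts."""
    counts = Counter(classifications)
    consensus, _ = counts.most_common(1)[0]
    return consensus, dict(counts)

# Hypothetical answers from six volunteers for a single subject.
answers = ["galaxy", "galaxy", "star", "galaxy", "artifact", "galaxy"]
consensus, counts = reduce_votes(answers)
# consensus is "galaxy"; counts records how the votes were split.
```

Drawing and transcription tasks need more sophisticated reductions (e.g. clustering volunteers’ marks), but the principle is the same: many independent classifications are combined into one answer per subject.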

The code is written in Python, but it can be run entirely via three command-line scripts (no Python knowledge needed) and a project’s data exports.

Configuration

The first script uses a project’s workflow data export to auto-configure which extractors and reducers (see below) should be run for each task in the workflow.  This produces a series of `yaml` configuration files with reasonable default values selected.

Extraction

Next the extraction script takes the classification data export and flattens it into a series of `csv` files, one for each unique task type, that only contain the data needed for the reduction process.  Although the code tries its best to produce completely “flat” data tables, this is not always possible, so more complex tasks (e.g. drawing tasks) have structured data for some columns.

Reduction

The final script takes the results of the data extraction and combines them into a single consensus result for each subject and each task (e.g. vote counts, clustered shapes, etc.).  For more complex tasks (e.g. drawing tasks) the reducer’s configuration file accepts parameters to help tune the aggregation algorithms to best work with the data at hand.

A full example using these scripts can be found in the documentation.

Future for this code

At the moment this code is provided in its “offline” form, but we are testing ways for this aggregation to be run “live” on a Zooniverse project.  When that system is finished, a research team will be able to enter their configuration parameters directly in the project builder, a server will run the aggregation code, and the extracted or reduced `csv` files will be made available for download.

Why you should use Docker in your research

Last month I gave a talk at the Wetton Workshop in Oxford. Unlike the other talks that week, mine wasn’t about astronomy. I was talking about Docker – a useful tool which has become popular among people who run web services. We use it for practically everything here, and it’s pretty clear that researchers would find it useful if only more of them used it. That’s especially true in fields like astronomy, where a lot of people write their own code to process and analyse their data. If after reading this post you think you’d like to give Docker a try and you’d like some help getting started, just get in touch and I’ll be happy to help.

I’m going to give a brief outline of what Docker is and why it’s useful, but first let’s set the scene. You’re trying to run a script in Python that needs a particular version of NumPy. You install that version but it doesn’t seem to work. Or you already have a different version installed for another project and can’t change it. Or the version it needs is really old and isn’t available to download anymore. You spend hours installing different combinations of packages and eventually you get it working, but you’re not sure exactly what fixed it and you couldn’t repeat the same steps in the future if you wanted to exactly reproduce the environment you’re now working in. 

Many projects require an interconnected web of dependencies, so there are a lot of things that can go wrong when you’re trying to get everything set up. There are a few tools that can help with some of these problems. For Python you can use virtual environments or Anaconda. Some languages install dependencies in the project directory to avoid conflicts, which can cause its own problems. None of that helps when the right versions of packages are simply not available any more, though, and none of those options makes it easy to just download and run your code without a lot of tedious setup – especially if the person downloading it isn’t already familiar with Python, for example.

If people who download your code today can struggle to get it running, how will it be years from now when the version of NumPy you used isn’t around anymore and the current version is incompatible? That’s if there even is a current version after so many years. Maybe people won’t even be using Python then.

Luckily there is now a solution to all of this, and it’s called software containers. Software containers are a way of packaging applications into their own self-contained environment. Everything you need to run the application is bundled up with the application itself, and it is isolated from the rest of the operating system when it runs. You don’t need to install this and that, upgrade some other thing, check the phase of the moon, and hold your breath to get someone’s code running. You just run one command and whether the application was built with Python, Ruby, Java, or some other thing you’ve never heard of, it will run as expected. No setup required!

Docker is the most well-known way of running containers on your computer. There are other container tools, and orchestration systems such as Kubernetes for running containers at scale, but I’m only going to talk about Docker here.

Using containers could seriously improve the reproducibility of your research. If you bundle up your code and data in a Docker image, and publish that image alongside your papers, anyone in the world will be able to re-run your code and get the same results with almost no effort. That includes yourself a few years from now, when you don’t remember how your code works and half of its dependencies aren’t available to install any more.

There is a growing movement for researchers to publish not just their results, but also their raw data and the code they used to process it. Containers are the perfect mechanism for publishing both of those together. A search of arXiv shows there have only been 40 mentions of Docker in papers across all fields in the past year. For comparison there have been 474 papers which mention Python, many of which (possibly most, but I haven’t counted) are presenting scripts and modules created by the authors. That’s without even mentioning other programming languages. This is a missed opportunity, given how much easier it would be to run all this code if the authors provided Docker images. (Some of those authors might provide Docker images without mentioning it in the paper, but that number will be small.)

Docker itself is open source, and all the core file formats and designs are standardised by the Open Container Initiative. Besides Docker, other OCI members include tech giants such as Amazon, Facebook, Microsoft, Google, and lots of others. The technology is designed to be future proof and it isn’t going away, and you won’t be locked into any one vendor’s products by using it. If you package your software in a Docker container you can be reasonably certain it will still run years, or decades, from now. You can install Docker for free by downloading the community edition.

So how might Docker fit into your workday? Your development cycle will probably look something like this: First you’ll probably outline an initial version of the code, and then write a Dockerfile containing the instructions for installing the dependencies and running the code. Then it’s basically the same as what you’d normally do. As you’re working on the code, you’d iterate by building an image and then running that image as a container to test it. (With more advanced usage you can often avoid building a new image every time you run it, by mounting the working directory into the container at runtime.) Once the code is ready you can make it available by publishing the Docker image.
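As a rough illustration of that first step (the file names, script, and base-image version here are hypothetical), a Dockerfile for a simple Python analysis script might look like:

```dockerfile
# Start from an official Python base image.
FROM python:3.6

# Install pinned dependencies first, so this layer is cached and only
# rebuilt when requirements.txt changes.
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt

# Copy in the analysis code itself.
COPY analyse.py /app/analyse.py
WORKDIR /app

# The command run when the container starts.
CMD ["python", "analyse.py"]
```

You would then build it with `docker build -t my-analysis .` and run it with `docker run my-analysis`; Docker’s own documentation covers these commands in detail.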

There are three approaches to publishing the image: push the image to the Docker Hub or another Docker registry, publish the Dockerfile along with your code, or export the image as a tar file and upload that somewhere. Obviously these aren’t mutually exclusive. You should do at least the first two, and it’s probably also wise to publish the tar file wherever you’d normally publish your data.


The Docker Hub is a free registry for images, so it’s a good place to upload your images so that other Docker users can find them. It’s also where you’ll find a wide selection of ready-built Docker images, created both by the Docker project itself and by other users. We at the Zooniverse publish all of the Docker images we use for our own work on the Docker Hub, and it’s an important part of how we manage our web services infrastructure. There are images for many major programming languages and operating system environments.

There are also a few packages which will allow you to run containers in high performance computing environments. Two popular ones are Singularity and Shifter. These will allow you to develop locally using Docker, and then convert your Docker image to run on your HPC cluster. That means the environment it runs in on the cluster will be identical to your development environment, so you won’t run into any surprises when it’s time to run it. Talk to your institution’s IT/HPC people to find out what options are available to you.

Hopefully I’ve made the case for using Docker (or containers in general) for your research. Check out the Docker getting started guide to find out more, and as I said at the beginning, if you’re thinking of using Docker in your research and you want a hand getting started, feel free to get in touch with me and I’ll be happy to help you. 

Asteroid Zoo Paused

The AsteroidZoo community has exhausted the data that are available at this time. With all the data examined, we are going to pause the experiment; before users spend more time, we want to make sure that we can process your finds through the Minor Planet Center and get highly reliable results.

We understand that it’s frustrating when you’ve put in a lot of work and there isn’t a way to confirm how well you’ve done. But please keep in mind that this was an experiment: how well can humans find asteroids that machines cannot?

Often in science an experiment can run into dead ends or speed bumps; this is just the nature of science. There is no question that the AsteroidZoo community has found several potential asteroid candidates that machines and algorithms simply missed. However, the conversion of these tantalizing candidates into valid results has encountered a speed bump.

What’s been difficult is that all the processing to make an asteroid find “real” has been based on the precision of a machine – for example, the arc of an asteroid must be the correct shape to a tiny fraction of a pixel to be accepted as a good measurement. The usual process of achieving such great precision is hands-on, and might take several humans weeks to get right. On AsteroidZoo, given the large scale of the data, automating the process of going from clicks to precise trajectories has been the challenge.

While we are paused, there will be updates to both the analysis process, and the process of confirming results with the Minor Planet Center. Updates will be posted as they become available.

https://talk.asteroidzoo.org/
http://reporting.asteroidzoo.org/

Thank you for your time.

ZooCon Portsmouth this weekend – remote participation invited!

We’re getting excited in Portsmouth to be welcoming some Zooites to the first ever “ZooCon Portsmouth”, which is happening this Saturday 13th September 2014 (An updated schedule is available on the Eventbrite page for the event).

The theme of this event is a wiki-a-thon for citizen science – we have scheduled a working afternoon to improve the coverage of citizen science on Wikipedia. Mike Peel, expert Wikimedian and astronomer from the University of Manchester, will be joining us to lead this part of the event and get us all up to speed with how editing works.

We invite remote participation of the wiki-a-thon via this discussion thread on Galaxy Zoo Talk, or on Twitter with the hashtag #ZooConPort, and we also plan to livestream the morning talks via Google+.

In-person attendees will have a treat in the afternoon – we’re all excited to have Chris Lintott narrate planetarium shows in the Portsmouth Inflatable Astrodome. And we plan to end the day with fish and chips at a pub by the sea. Keep your fingers crossed for nice weather.

Call for Proposals


The Constructing Scientific Communities project (ConSciCom), part of the AHRC’s ‘Science in Culture’ theme, is inviting proposals for citizen science or citizen humanities projects to be developed as part of the Zooniverse platform.

ConSciCom examines citizen science in the 19th and 21st centuries, contrasting and reflecting on engagement with distributed communities of amateur researchers in both the historical record and in contemporary practice.

Between one and four successful projects will be selected from responses to this call, and will be developed and hosted by the Zooniverse in association with the applicants. We hope to include both scientific and historical projects; those writing proposals should review the existing range of Zooniverse projects, which includes not only classification but also transcription projects. Please note, however, that ConSciCom cannot distribute funds nor support imaging or other digitization in support of the project.

Projects will be selected according to the following criteria:

  1. Merit and usefulness of the data expected to result from the project.
  2. Novelty of the problem; projects which require extending the capability of the Zooniverse platform or serve as case studies for crowdsourcing in new areas or in new ways are welcome.
  3. Alignment with the goals and interests of the Constructing Scientific Communities project. In particular, we wish to encourage projects that:
    1. Have a significant historical dimension, especially in relation to the history of science.
    2. Involve the transcription of text, either in its entirety or for rich metadata.

Note: it is anticipated that some, but not necessarily all, selected projects will meet this third criterion; please do submit proposals on other topics.

The deadline for submissions is July 25th 2014. You can submit a proposal by following this link http://conscicom.org/proposals/form/


New Project: Plankton Portal

It’s always great to launch a new project! Plankton Portal allows you to explore the open ocean from the comfort of your own home. You can dive hundreds of feet deep and observe the unperturbed ocean and the myriad animals that inhabit the Earth’s last frontier.

Plankton Portal Screenshot

The goal of the site is to classify underwater images in order to study plankton. We’ve teamed up with researchers at the University of Miami and Oregon State University who want to understand the distribution and behaviour of plankton in the open ocean.

The site shows you one of millions of plankton images taken by the In Situ Ichthyoplankton Imaging System (ISIIS), a unique underwater robot engineered at the University of Miami. ISIIS operates as an ocean scanner that casts the shadows of tiny, transparent oceanic creatures onto a very high resolution digital sensor at very high frequency. So far, ISIIS has been used in several oceans around the world to detect the presence of larval fish, small crustaceans and jellyfish in ways never before possible. This new technology can help answer important questions, ranging from how plankton disperse, interact and survive in the marine environment, to which physical and biological factors could influence the plankton community.

The dataset used for Plankton Portal comes from a period of just three days in Fall 2010. In those three days, the researchers collected so much data that it would take them more than three years to analyze it all themselves. That’s why they need your help! A computer will probably be able to tell the difference between major classes of organisms, such as a shrimp versus a jellyfish, but distinguishing different species within an order or family is still best done by the human eye.

If you want to help, you can visit http://www.planktonportal.org. A field guide is provided, and there is a simple tutorial. The science team will be on Plankton Portal Talk to answer any questions, and the project is also on Twitter, Facebook and Google+.