Panoptes CLI 1.1 now available

I recently released version 1.1 of the Panoptes CLI – the command-line interface for managing Zooniverse projects. This update includes some exciting new features. Here are the highlights.

You can install the update by running pip install -U panoptescli. Any bugs or issues should be raised via GitHub. See the changelog for the full list of changes.

Resuming failed subject uploads

This one adds what is probably the CLI’s most requested feature: the ability to resume a failed upload from where it left off, without duplicating subjects or requiring manual changes to the manifest. I hope this will be a huge help to researchers, especially when uploading large manifests.

If the upload fails for any reason – whether that’s an issue with our systems, a problem with your internet connection, a bug in the CLI itself, or you simply stopping the upload by pressing Ctrl-C – the CLI will detect that there was a problem and ask whether you want to be able to resume the upload later. If you say yes, it will save a new manifest in YAML format containing the remaining upload queue along with all of the upload’s command-line options. To resume, you just start a new upload with the YAML manifest instead of the original CSV.
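
To make the idea concrete, here’s a rough sketch of how a resumable upload queue can work. It is not the CLI’s actual implementation: the function names are made up for illustration, the file layout is an assumption, and JSON stands in for YAML so the example only needs the standard library.

```python
import json
import os
import tempfile


def upload_with_resume(rows, upload_one, options, resume_path):
    """Upload rows in order; on failure, save what is left for later.

    Returns the number of rows uploaded successfully in this run.
    """
    done = 0
    try:
        for row in rows:
            upload_one(row)
            done += 1
    except (Exception, KeyboardInterrupt):
        # Save the unfinished queue together with the original options.
        # The row that failed is included, since it never completed.
        state = {"options": options, "remaining": rows[done:]}
        with open(resume_path, "w") as f:
            json.dump(state, f)
    return done


def load_resume_file(resume_path):
    """Read back the saved options and the remaining upload queue."""
    with open(resume_path) as f:
        state = json.load(f)
    return state["options"], state["remaining"]


uploaded = []


def flaky_upload(row):
    # Hypothetical stand-in for a real upload; fails once on the third row.
    if row["id"] == 3 and not flaky_upload.failed_once:
        flaky_upload.failed_once = True
        raise ConnectionError("simulated network failure")
    uploaded.append(row["id"])


flaky_upload.failed_once = False

rows = [{"id": i} for i in range(1, 6)]
resume_path = os.path.join(tempfile.mkdtemp(), "resume.json")

# First run stops at the simulated failure and writes the resume file.
first_run = upload_with_resume(rows, flaky_upload, {"subject_set_id": 1357}, resume_path)

# Second run picks up from the saved queue, so nothing is uploaded twice.
options, remaining = load_resume_file(resume_path)
second_run = upload_with_resume(remaining, flaky_upload, options, resume_path)
```

The important detail is that the saved queue starts at the first row that did not finish, which is what prevents duplicate subjects on the second run.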

Multithreaded subject uploads

Uploading new subjects can often take a long time. The total upload time depends not only on your internet connection speed, but also on the time it takes for the CLI to talk to the Panoptes API. Creating a new subject typically requires the CLI to make two HTTP requests: one to create the subject and one to upload the subject’s media (the image, or video, or whatever). If the subject has multiple images then that only increases the number of requests. Plus subjects need to be linked to the subject set; this happens in batches, but it can still add up to a lot of requests for large uploads. If you’re uploading 10,000 subjects for example, that means the CLI has to make a minimum of 20,000 requests (probably more), and each of those requests includes some overhead where the CLI is waiting for the server to respond, which is all basically wasted time.

Luckily, Panoptes CLI 1.1 gets around that by taking advantage of the multithreading features of the Panoptes Client for Python, which were released earlier this year. Now those 20,000 requests will happen five at a time, so for example three of them can be sending data while two are waiting for the server, which keeps your internet connection busy and cuts down on idle time. In my testing, this substantially sped up subject uploads, potentially saving hours of your time.
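
If you want to see why overlapping requests helps, here’s a small self-contained sketch using Python’s standard thread pool. It doesn’t talk to Panoptes at all: fake_request is a hypothetical stand-in that just sleeps to imitate waiting for the server, and the 0.05-second latency is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

REQUEST_LATENCY = 0.05  # seconds; simulated time spent waiting for the server


def fake_request(n):
    # Hypothetical stand-in for one HTTP request to the API.
    time.sleep(REQUEST_LATENCY)
    return n


def run_sequential(count):
    # One request at a time: total time is roughly count * latency.
    return [fake_request(n) for n in range(count)]


def run_threaded(count, workers=5):
    # Up to `workers` requests in flight at once.
    # map() preserves input order even though the requests overlap in time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_request, range(count)))


def timed(func, *args):
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


sequential_result, sequential_time = timed(run_sequential, 20)
threaded_result, threaded_time = timed(run_threaded, 20)
print(f"sequential: {sequential_time:.2f}s, five threads: {threaded_time:.2f}s")
```

Real uploads won’t scale this neatly, because bandwidth and server load also matter, but the waiting time is the part that threads can hide.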

Adding and removing lists of subjects to and from subject sets

Often project owners need to add large numbers of existing subjects to a new set, or remove subjects from their current set. It was possible to do this with the previous version of the CLI by passing subject IDs on the command-line, but it was often difficult to modify large numbers of subjects this way (it was possible with xargs on Linux and macOS, but this isn’t the most intuitive way to do it).

Now, there’s a new option to pass a list of IDs in a text file rather than having to specify IDs on the command-line. (The old way is still there too if you prefer to do it that way!) Just produce a text file containing the relevant subject IDs, one per line. If you already have the subject information in a spreadsheet, exporting a CSV file with just the subject ID column will produce the right file (just make sure it only contains the one column).

For example, if you have a file called subject_ids.csv containing the following:

1234
5678
9012

you can run:

panoptes subject-set add-subjects -f subject_ids.csv 1357

to add subjects 1234, 5678, and 9012 to subject set 1357.
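
If you’d rather do the same thing from Python, here’s a minimal sketch. The helper names are made up for illustration. It assumes the ID file has no header row, that you’ve already authenticated with Panoptes.connect, and that SubjectSet.add accepts a list of plain IDs, which I believe it does.

```python
import os
import tempfile


def read_subject_ids(path):
    """Return the subject IDs listed in a text file, one ID per line."""
    ids = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                # int() raises ValueError on anything non-numeric,
                # such as a header row left in by a spreadsheet export.
                ids.append(int(line))
    return ids


def add_subjects_from_file(subject_set_id, path):
    # Imported here so the helper above works without the client installed.
    from panoptes_client import SubjectSet

    # Assumes Panoptes.connect(...) has already been called.
    subject_set = SubjectSet.find(subject_set_id)
    subject_set.add(read_subject_ids(path))


# Recreate the example file from above and read it back.
example_path = os.path.join(tempfile.mkdtemp(), "subject_ids.csv")
with open(example_path, "w") as f:
    f.write("1234\n5678\n9012\n")

subject_ids = read_subject_ids(example_path)
```

Calling add_subjects_from_file(1357, "subject_ids.csv") would then be the rough equivalent of the command above.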

 

Edited 29 November 2019: Fixed typo in pip command for upgrade.

Panoptes Client for Python 1.1

I’ve just released version 1.1 of the Panoptes Client for Python. The changelog has a full list of what’s new, but there are a few things I wanted to highlight, the first two of which will make it substantially faster to create new subjects:

  • Multithreaded media uploads – the client will automatically use several threads to upload media when you first save a new subject. So, for example, if you create a subject which has three images they will all upload simultaneously (up to five simultaneous uploads, then it will queue them).
  • Multithreaded subject creation – you can also simultaneously create the subjects themselves. That means if you’re creating, say, a thousand subjects, the client can queue them all and create up to five of them simultaneously. This works in conjunction with the media uploads, using one combined queue for the subject creation and the media uploads, to avoid overloading the network and to make sure the subject creation doesn’t get too far ahead of the uploads. This one isn’t automatic – you’ll need to create your subjects with the new Subject.async_saves() context manager to take advantage of it.
  • Retries for all GET requests – we’re quite proud of how reliable the Zooniverse platform is, but sometimes server-side errors do happen. The client will now automatically retry all GET requests (i.e. the ones that don’t modify any data) if an error occurs, improving reliability.
  • Retries for batch linking operations – similar to above, the client will retry any add/remove operations via the new LinkCollection class, which handles linking groups of objects (e.g. subjects to a subject set, subjects to a collection, etc.). This means you should see far fewer failures when linking thousands of subjects to a subject set, for example.
  • Context manager for multiple connections – the Panoptes class can now act as a context manager, providing a safe way to perform operations as multiple users (for example, in a web app).
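
Here’s a sketch of how the asynchronous saves and the connection context manager from the list above might be used. The two function names and their arguments are my own invention. As far as I know, the context manager for asynchronous saves lives on the Subject class, and the sketch assumes you’ve already called Panoptes.connect before using the first function.

```python
def create_subjects(project_id, subject_set_id, files_with_metadata):
    """Create one subject per file, saving them in parallel, then link them.

    files_with_metadata maps a local filename to a metadata dict.
    """
    # Imported inside the function so this file loads without the client.
    from panoptes_client import Project, Subject, SubjectSet

    project = Project.find(project_id)
    subject_set = SubjectSet.find(subject_set_id)

    new_subjects = []
    # Inside this block, save() queues the work instead of blocking;
    # leaving the block waits for the queued saves and uploads to finish.
    with Subject.async_saves():
        for filename, metadata in files_with_metadata.items():
            subject = Subject()
            subject.links.project = project
            subject.add_location(filename)
            subject.metadata.update(metadata)
            subject.save()
            new_subjects.append(subject)

    # Link the finished subjects to the set in one batch operation.
    subject_set.add(new_subjects)
    return new_subjects


def rename_project_as(username, password, project_id, new_name):
    """Perform one operation as a specific user via the context manager."""
    from panoptes_client import Panoptes, Project

    # Requests made inside the with block use this client's credentials.
    with Panoptes(username=username, password=password):
        project = Project.find(project_id)
        project.display_name = new_name
        project.save()
```

Keeping the linking step outside the async_saves block matters, since the subjects don’t have IDs until their saves have completed.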

You can install the update by running pip install -U panoptes-client. Any bugs or issues should be raised via GitHub.
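
For anyone curious about the retry behaviour mentioned in the list above, here’s a generic sketch of retrying a read-only request when the server returns an error. This is not the client’s own code: ServerError, get_with_retries, the retry count, and the backoff timings are all assumptions chosen for illustration.

```python
import time


class ServerError(Exception):
    """Hypothetical error type standing in for an HTTP 5xx response."""


def get_with_retries(fetch, retries=5, backoff=0.01):
    """Call fetch() until it succeeds, retrying only on ServerError."""
    for attempt in range(retries):
        try:
            return fetch()
        except ServerError:
            if attempt == retries - 1:
                raise  # out of attempts; let the caller see the error
            # Wait a little longer after each failed attempt.
            time.sleep(backoff * (2 ** attempt))


calls = []


def flaky_fetch():
    # Fails twice, then succeeds, to imitate a transient server-side error.
    calls.append(1)
    if len(calls) < 3:
        raise ServerError("503 Service Unavailable")
    return {"subjects": []}


response = get_with_retries(flaky_fetch)
```

Retrying is only safe here because a GET request doesn’t change anything on the server, so repeating it can’t create duplicates.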

Panoptes Client for Python 1.0.3

Hot on the heels of last week’s update, I’ve just released version 1.0.3 of the Python Panoptes Client, which fixes a bug introduced in the previous release. If you encounter a TypeError when you try to create subjects, please update to this new version and that should fix it.

This release also updates the default client ID that is used to identify the client to the Panoptes API. This is to ensure that each of our API clients is using a unique ID.

As before, you can install the update by running pip install -U panoptes-client.

Panoptes CLI 1.0.1 and Panoptes Client for Python 1.0.1

We’ve recently released updates for the Panoptes command-line interface and the Panoptes Client module for Python containing a few bug fixes.

From the changelog for Panoptes Client:

  • Fix: Exports are not automatically decompressed on download
  • Fix: Unable to save a Workflow
  • Fix: Fix typo in documentation for Classification
  • Fix: Fix saving objects initialised from object links

And from the CLI:

  • Fix: Modifying projects makes them private

You can install the updates by running pip install -U panoptescli and pip install -U panoptes-client.

Panoptes CLI 1.0, a command-line interface for managing projects

Following on from the release of Panoptes Client 1.0 for Python, we’ve just released version 1.0 of the Panoptes CLI. This is a command-line client for managing your projects, because some things are just easier in a terminal! The CLI lets you do common project management tasks, such as activating workflows, linking subject sets, downloading data exports, and uploading subjects. Let’s jump in with a few examples.

First, downloading a classification export (obviously you’d insert your own project ID and a filename of your choice):

panoptes project download 764 Downloads/pulsar-hunters-classifications.csv

This command will optionally generate a new export and wait for it to be ready before downloading. No more waiting for the notification email!

New subjects can be uploaded to a new subject set like so (again, inserting your own IDs):

panoptes subject-set create 7 "November 2017 subjects"
panoptes subject-set upload-subjects 16401 manifest.csv

You can also pipe the output from the CLI into other standard commands to do more powerful things, such as linking every subject set in your project to a workflow using the xargs command (where 1234 and 5678 are your project ID and workflow ID respectively):

panoptes subject-set ls -q -p 1234 | xargs panoptes workflow add-subject-sets 5678
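
If xargs isn’t available on your system, something similar can be done with the Panoptes Client for Python. This is only a sketch: link_all_subject_sets is a name I’ve made up, and it assumes you’ve already authenticated with Panoptes.connect using an account that can edit the project.

```python
def link_all_subject_sets(project_id, workflow_id):
    """Link every subject set in a project to one workflow.

    Returns the number of subject sets that were linked.
    """
    # Imported inside the function so this file loads without the client.
    from panoptes_client import SubjectSet, Workflow

    # Assumes Panoptes.connect(...) has already been called.
    subject_sets = list(SubjectSet.where(project_id=project_id))
    Workflow.find(workflow_id).add_subject_sets(subject_sets)
    return len(subject_sets)
```

Calling link_all_subject_sets(1234, 5678) would then do roughly what the pipeline above does.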

Visit GitHub to get started with the CLI today!

Introducing Panoptes Client 1.0 for Python

I’m happy to announce that the Panoptes Client package for Python has finally reached version 1.0, after nearly a year and a half of development. With this package, you can automate the management of your projects, including uploading subjects, managing subject sets, and downloading data exports.

There’s still more work to do – I have lots of additional features and improvements planned for version 1.1 – but with the release of version 1.0, the Client has a stable set of core features which are useful for managing projects (both large and small).

I know a lot of people have already been using the 0.x versions while we’ve been working on them, so thanks to everyone who submitted feature requests, bug reports, and pull requests on GitHub. Please do upgrade to the latest version to make sure you have the latest bug fixes, and keep the requests and bug reports coming!

You can find installation and upgrade instructions on GitHub, and full documentation on Read the Docs.