Runbook: Preparing for Workshops

There are several preparations to do before a workshop:

  1. Create user accounts
  2. Upload workshop experiments
  3. Run and check workshop data

1. Creating user accounts

Create user accounts using the `biomage account create-user-list` utility. To do that:

  1. Prepare the CSV containing the list of workshop users. This usually comes from a Google Docs registration.
  2. Create a new CSV file containing the full name of each user in the first column and their email in the second column (a minimal conversion sketch follows this list):

Arthur Dent,arthur.dent@example.com

  3. Run `biomage account create-user-list -i production --user_list <path/to/input.csv>`. The command will output another file, in the same directory as `input.csv`, named `input.csv.out`. This file contains the password for each user account in its last column.
  4. Copy the passwords back into the source document (e.g. Google Docs), taking care that the users and passwords match up. Users with existing accounts will show "Already have an account" in the password column.
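
The registration export is rarely in the right shape already, so the conversion in step 2 is often scripted. Below is a minimal sketch, assuming the registration CSV has "First name", "Last name" and "Email" columns; adjust the column names to match the actual registration form.

```python
import csv

# Convert a registration export into the two-column name,email CSV
# expected by `biomage account create-user-list`.
# The input column names are assumptions - match them to the real form.
with open("registration.csv", newline="") as src, \
     open("input.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    for row in reader:
        full_name = f"{row['First name']} {row['Last name']}".strip()
        writer.writerow([full_name, row["Email"]])
```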

2. Uploading workshop experiments

With the user accounts created, the workshop experiment can be uploaded to each user account. This is done using a notebook inside the private internal repo.

  1. Clone and set up the repo.
  2. Open `upload_workshop_experiment.ipynb`.
  3. Make sure you can run the notebook, then run it up to (but not including) the last block.
  4. Open a tunnel to the production database by running `biomage rds tunnel -i production`.
  5. Modify the variable `infilename` to point to the output file created in step 3 of "1. Creating user accounts" (see the sketch after this list).
  6. Run the last block. This will create a new output CSV file containing the paths to the uploaded experiments.
  7. Copy the experiment paths back to the source document (e.g. Google Docs).
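
The notebook lives in the private internal repo, so it cannot be reproduced here, but the `infilename` wiring from step 5 might look like the sketch below. The path is a placeholder, and the column layout of the `.out` file is an assumption based on the description in section 1.

```python
import pandas as pd

# Placeholder path: point this at the actual output of
# `biomage account create-user-list` from section 1, step 3.
infilename = "path/to/input.csv.out"

# Assumed layout: name, email, ..., password in the last column.
users = pd.read_csv(infilename, header=None)
print(f"{len(users)} users loaded; password column preview:")
print(users.iloc[:, -1].head())
```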

3. Running and checking workshop data

Workshop data is run a few days before the workshop. This is done to cache the pipeline and worker results, reducing the load the workshop puts on the platform. Right now there is no automatic way to do this, so the workshop data is processed manually. To process and check an experiment:

  1. Open the source document (e.g. Google Docs) containing the workshop user data and experiment link.
  2. Open scp.biomage.net and log in as admin.
  3. Search for the experiment using the search box and run the experiment. Do not forget to toggle off the email notification, so that users are not notified when the pipeline is done.
  4. Once the pipeline has finished running, go to the last step in Data Processing (i.e. Configure Embedding) to run the worker.
  5. Once the embedding has loaded, check that there are 14 clusters, from cluster 0 to cluster 13 (as of 24 Oct 2022), and that the embeddings between experiments are the same. If the number of clusters is not 14, report this in Slack. A quick sanity-check sketch follows this list.
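
There is no scripted check in the runbook itself, but if you export the experiment's cell sets as JSON, the cluster count can be verified quickly. This is a hypothetical sketch: the file layout (a top-level "cellSets" list containing a "louvain" entry whose children are the clusters) is an assumption, not documented behaviour.

```python
import json

# Hypothetical check: count Louvain clusters in an exported cell sets file.
# The "cellSets" / "louvain" / "children" layout is an assumption.
with open("cell_sets.json") as f:
    cell_sets = json.load(f)["cellSets"]

louvain = next(cs for cs in cell_sets if cs["key"] == "louvain")
n_clusters = len(louvain["children"])
print(f"{n_clusters} clusters found")
if n_clusters != 14:  # expected count as of 24 Oct 2022
    print("Unexpected number of clusters - report this in Slack")
```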

To save time, the steps above can be done in parallel (i.e. running multiple experiments at once). To do that, you can scale up the pipeline and worker pods in production; a sketch of the scaling commands is below. However, do not forget to scale the pods back down afterwards. The default number of pods is 3 for the pipeline and 2 for the workers (as of 24 Oct 2022).
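
Scaling is typically done with `kubectl`. A rough sketch follows; the deployment names and namespaces are assumptions (check `kubectl get deployments --all-namespaces` for the real ones), and only the replica defaults come from this runbook.

```python
import subprocess

def scale(namespace: str, deployment: str, replicas: int) -> None:
    # Shells out to kubectl; assumes your kubeconfig points at production.
    subprocess.run(
        ["kubectl", "scale", f"deployment/{deployment}",
         f"--replicas={replicas}", "--namespace", namespace],
        check=True,
    )

# Hypothetical deployment and namespace names - verify before running.
scale("pipeline-default", "pipeline", 6)  # scale up before parallel runs
scale("worker-default", "worker", 4)

# ...and once the runs are done, restore the defaults (as of 24 Oct 2022):
scale("pipeline-default", "pipeline", 3)
scale("worker-default", "worker", 2)
```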