Adding rnabio part 1 lecture slides and readme #56

Open
wants to merge 2 commits into
base: master
108 changes: 108 additions & 0 deletions lectures/week_07/README.md
@@ -0,0 +1,108 @@
# Week 7
## RNAseq Part 1

- [Lecture Recording]()

- [Slides](https://github.com/griffithlab/rnabio.org/blob/master/assets/lectures/cshl/2023/full/RNASeq_Module1_IntrotoRNA.pdf)

- [RNA-seq Bioinformatics](https://rnabio.org/course)

## RNAseq Course Setup
For the BFX Workshop, we will not be using AWS Cloud. Instead, we will use a Docker image created from the AWS AMI used in rnabio.org.

### Docker Setup

A Docker image is available from the DockerHub repository:
[https://hub.docker.com/layers/griffithlab/rnabio/0.0.1/images/sha256-b13f5e9048941c8be3e83555295c0f4ed21645d5fd9bae4226e6bc4f30f54b52?context=explore](https://hub.docker.com/layers/griffithlab/rnabio/0.0.1/images/sha256-b13f5e9048941c8be3e83555295c0f4ed21645d5fd9bae4226e6bc4f30f54b52?context=explore)

1. Ensure that Docker Desktop is running.

2. This command will pull the image `rnabio` to your local Docker client with the tag `0.0.1` from the `griffithlab` DockerHub repository:

```bash
docker pull griffithlab/rnabio:0.0.1
```

3. Set up a local workspace directory for the RNAseq course. If you change the path or command used in this step, please update the path to the workspace directory accordingly in Step 4. Also, create a file named `test_my_docker_mount.txt` that we will look for later.

```bash
mkdir -p bfx-workshop/rnabio-workspace
echo 'this file helps me test my docker mount' >> bfx-workshop/rnabio-workspace/test_my_docker_mount.txt
```

4. Enter the `rnabio-workspace` directory you just created, and start a Docker container from the image we pulled above. `-v` tells Docker to mount our workspace directory within the Docker container as `/workspace` with read-write privileges. You'll see in the RNAseq course that `/workspace` is the base directory for nearly all commands and steps.

```bash
cd bfx-workshop/rnabio-workspace
docker run -v "$PWD":/workspace:rw -it griffithlab/rnabio:0.0.1 /bin/bash
```

5. Use `ls` to see what's in your current directory, then enter the `workspace` folder and use `ls` again to see what it contains. You should see the `test_my_docker_mount.txt` file created in Step 3, which confirms that the mount worked.

```bash
ls
cd workspace
ls
```
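
The manual check in Step 5 can also be scripted. The following is a minimal sketch, not part of the course materials: `check_mount` is a hypothetical helper name, and it assumes the marker file from Step 3 and the `/workspace` mount point from Step 4.

```bash
# Hypothetical helper (not part of rnabio.org): confirm that the marker file
# created in Step 3 is visible at the mount point inside the container.
check_mount() {
    # Default to the /workspace mount point used in Step 4.
    local mount_dir="${1:-/workspace}"
    local marker="$mount_dir/test_my_docker_mount.txt"
    if [ -f "$marker" ]; then
        echo "OK: found $marker"
    else
        echo "MISSING: $marker (check the -v path used in Step 4)" >&2
        return 1
    fi
}
```

After pasting the function into the container shell, run `check_mount` with no arguments to test `/workspace`, or pass a different directory if you changed the mount point.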

### User Setup

Docker logs you in to the container as the "root" user by default. We need to run as the `ubuntu` user to match the RNAseq course tutorials.

1. Switch user (`su`) to the `ubuntu` user:

```bash
su ubuntu
```

2. Source the pre-installed `.bashrc` file to configure your environment to match the RNAseq course:

```bash
source ~/.bashrc
```

NOTE: Using Docker and the persistent "workspace" volume we attached will allow you to start/stop as you wish. EVERY TIME YOU LOG IN TO THE DOCKER CONTAINER, YOU MUST SWITCH TO THE `ubuntu` USER *AND* RUN `source ~/.bashrc`.
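
Since it is easy to forget the user switch after restarting the container, a quick check may help. This is a minimal sketch under the assumption that the expected user is `ubuntu`; `check_user` is a hypothetical helper name and is not provided by the image.

```bash
# Hypothetical helper (not provided by the image): report whether the current
# shell is running as the expected user.
check_user() {
    local expected="${1:-ubuntu}"
    local current
    current="$(id -un)"    # name of the user running this shell
    if [ "$current" = "$expected" ]; then
        echo "OK: running as $expected"
    else
        echo "Running as $current, expected $expected: run 'su $expected', then 'source ~/.bashrc'" >&2
        return 1
    fi
}
```

Run `check_user` with no arguments inside the container; a non-zero exit status means you are still logged in as another user, most likely root.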

### Environment Setup

Create a working directory and set the `RNA_HOME` environment variable:
```
mkdir -p ~/workspace/rnaseq/
export RNA_HOME=~/workspace/rnaseq
```

Make sure that the working directory variable is set and points to a valid location:
```
echo $RNA_HOME
```
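
`echo` only prints the value, so an empty line here means the variable is unset. A slightly stricter check could look like the sketch below; `check_rna_home` is a hypothetical helper name, not something defined by the course environment.

```bash
# Hypothetical helper: verify that RNA_HOME is set and points to an existing
# directory, instead of only printing its value.
check_rna_home() {
    if [ -z "${RNA_HOME:-}" ]; then
        echo "RNA_HOME is not set; export it as shown above" >&2
        return 1
    fi
    if [ ! -d "$RNA_HOME" ]; then
        echo "RNA_HOME=$RNA_HOME is not an existing directory" >&2
        return 1
    fi
    echo "OK: RNA_HOME=$RNA_HOME"
}
```

Calling `check_rna_home` after the `export` above should print the path; any other message indicates which of the two conditions failed.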

Since all the environment variables we set up for the RNA-seq workshop start with `RNA`, we can easily view them all by combining the `env` and `grep` commands as shown below. The `env` command shows all environment variables currently defined, and the `grep` command identifies string matches.
```
env | grep RNA
```

To view the contents of the `~/.bashrc` file that configures this environment, type:
```
less ~/.bashrc
```
To exit `less`, type `q`.

### Known Issues / Discrepancies from the RNAbio Website
Collaborator


At some point, we should consider forking the RNAbio content and just incorporating these few changes. It could be a "local version" of RNAbio that would be useful for lots of folks that can't spin up an AMI. If you don't have time this year, that's fine, but something to think about for the future.

Contributor Author


I didn't get around to seeing these comments until now, but happy to work on this!

For the local version part, would that be similar to how we use the griffithlab/rnabiodev:0.0.3 docker when we're testing updates to the website?

1. The check strandedness tool in the Module 1 RNAseq Data section is launched with its own `docker run` command, which cannot be run from within your `griffithlab/rnabio:0.0.1` Docker session. Instead, open a new terminal window, `cd` into the `rnaseq` directory you created at the beginning of this assignment, and use the following command:
```
docker run -v "$PWD":/docker_workspace mgibio/checkstrandedness:latest check_strandedness \
    --gtf /docker_workspace/refs/chr22_with_ERCC92_tidy.gtf \
    --transcripts /docker_workspace/refs/chr22_ERCC92_transcripts.clean.fa \
    --reads_1 /docker_workspace/data/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz \
    --reads_2 /docker_workspace/data/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
```
This is the same command given on the course webpage, except for the mount (`-v` flag): instead of mounting `/home/ubuntu/workspace/rnaseq`, which is where the data is stored for students running the course on an AMI, you mount your current directory. Note that this is not an interactive session in which we enter the container and run commands inside it; the whole command is executed directly in a single `docker run` invocation.

2. In various parts of RNAbio, the tutorial suggests viewing HTML files, plots, etc. by going to a public IPv4 address in your browser. That is only needed for the AMI. Since you'll be running everything locally, you can either find the files in Finder or File Explorer and open them directly or, even better, use `open [your_file.html]` on Mac or `explorer.exe [your_file.html]` on Windows/WSL2 to open the file in your default browser!

3. Pre-alignment QC includes an optional QC analysis with fastp. This software is not available in your Docker image, so please skip it (the fastqc and multiqc analyses should still work and can be used for analysis).
Similarly, you can skip the adapter trimming step, as the data provided here does not actually need to be adapter trimmed (however, the code is available if you need to do it for your own data).

4. `geneBody_coverage.py` in the optional RSeQC section is not in the `PATH`. Use the full path to the Python script: `/home/ubuntu/.local/bin/geneBody_coverage.py`.
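
Because the check strandedness command in item 1 mounts whatever directory you are in, a wrong starting directory only shows up as missing-file errors inside the container. The sketch below checks for the four input files beforehand. `check_strandedness_inputs` is a hypothetical helper name, and the relative paths are copied from the command in item 1.

```bash
# Hypothetical pre-flight check: confirm that the four input files named in
# the check_strandedness command exist under the directory to be mounted.
check_strandedness_inputs() {
    local base_dir="${1:-$PWD}"   # directory you intend to mount with -v
    local missing=0
    local rel
    for rel in \
        refs/chr22_with_ERCC92_tidy.gtf \
        refs/chr22_ERCC92_transcripts.clean.fa \
        data/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read1.fastq.gz \
        data/HBR_Rep1_ERCC-Mix2_Build37-ErccTranscripts-chr22.read2.fastq.gz
    do
        if [ ! -f "$base_dir/$rel" ]; then
            echo "missing: $base_dir/$rel" >&2
            missing=1
        fi
    done
    # Exit status 0 only if every file was found.
    return "$missing"
}
```

Run `check_strandedness_inputs` from the `rnaseq` directory in your new terminal window. If it lists missing files, you are probably in the wrong directory or have not yet downloaded the reference and data files.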


## Homework Assignments
1. Finish Module 1 - [Inputs](https://rnabio.org/module-01-inputs/0001/01/01/Intro_to_Inputs/)
2. Complete Module 2 - [Alignment](https://rnabio.org/module-02-alignment/0002/01/01/Intro_to_Alignment/)
Collaborator


We have students taking the course for credit and while we're not grading a bunch of questions, we've been asking for some kind of screenshot or answer that proves they made it to the end of the homework. Any suggestions for that?

For credit students:

Turn in XXXXXX to Jenny as proof of completion

Contributor Author

@ksinghal28 ksinghal28 Oct 21, 2024


Maybe an IGV screenshot with the merged UHR.bam and HBR.bam?
There's this question towards the end of the alignment visualization section:

> What are the options for visualizing splicing or alternative splicing patterns in IGV? Navigate to this location on chromosome 22: ‘chr22:40,363,200-40,367,500’. What splicing event do you see?

An IGV screenshot of that could be good (it should look something like the attached screenshot).
[Image: attached IGV screenshot]
(the text answer to the question is already in rnabio, but the screenshot isn't)

Or we could have them count the number of reads in the merged HBR.bam and UHR.bam files and confirm that they are the sum of the files that were merged.
HBR.bam = 793963
UHR.bam = 1174955

Binary file added lectures/week_07/RNASeq_Module1_IntrotoRNA.pdf
Binary file not shown.