RNASeq - exercise 1

Set working directory

# Set working directory
setwd("/Volumes/bioinfomatics$/jurtasun/Courses/CBW2022/LMS_RNASeq/course/exercises")
getwd()

## [1] "/Volumes/bioinfomatics$/jurtasun/Courses/CBW2022/LMS_RNASeq/course/exercises"

Generate fake data


# Normalize read counts to remove biases related to:
# i) Sequencing depth - sequencing runs with higher depth will have more reads mapping to each gene ("per million")
# ii) Gene length - longer genes (measured in kilobases of exons) will have more reads mapping to them

# Generate some fake data - 3 replicates and 4 genes, following the example:
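
The chunk above leaves the data to be filled in. A minimal sketch of one way to build it follows. The counts, the gene names `A` to `D`, the replicate names and the gene lengths are all assumed values chosen for illustration, not taken from the course material.

```r
# Illustrative read counts (assumed values): rows are genes, columns are replicates
counts <- matrix(
  c(10, 12, 30,
    20, 25, 60,
     5,  8, 15,
     0,  0,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3"))
)

# Assumed gene lengths in kilobases, one value per gene (row)
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)

counts
colSums(counts)  # total reads per replicate (sequencing depth)
```

Keeping the counts in a matrix, with gene lengths in a separate named vector, makes the row-wise and column-wise scaling in the next steps straightforward.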

Perform RPKM normalization


# Normalization method 1: RPKM (Reads Per Kilobase of exon per Million mapped reads)

# i) Normalize by read depth (total number of reads per replicate (column))

# ii) Normalize by gene length (length in kilobases of each gene (row))
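
The two RPKM steps could be sketched as below, using base R `sweep()` to divide along columns and then rows. The `counts` matrix and `gene_length_kb` vector are assumed illustrative values, redefined here so the chunk runs on its own.

```r
# Assumed illustrative data: 4 genes (rows) x 3 replicates (columns)
counts <- matrix(c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3")))
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)  # assumed lengths in kb

# i) Depth: divide each column by its total read count, in millions
depth_millions <- colSums(counts) / 1e6
rpm <- sweep(counts, 2, depth_millions, "/")

# ii) Length: divide each row by the gene length in kilobases
rpkm <- sweep(rpm, 1, gene_length_kb, "/")
rpkm
```

With real data the column totals are in the millions. With toy counts this small the "per million" factor simply inflates the values, which does not affect the comparison between methods.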

Perform TPM normalization


# Normalization method 2: TPM (Transcripts Per Million)

# i) Normalize by gene length (length in kilobases of each gene (row))

# ii) Normalize by read depth (sum of length-normalized counts per replicate (column))
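
TPM applies the same two corrections in the opposite order, so the depth factor is computed from the length-normalized values rather than from the raw counts. A minimal sketch, again on assumed illustrative data:

```r
# Assumed illustrative data: 4 genes (rows) x 3 replicates (columns)
counts <- matrix(c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3")))
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)  # assumed lengths in kb

# i) Length: reads per kilobase (RPK) for each gene (row)
rpk <- sweep(counts, 1, gene_length_kb, "/")

# ii) Depth: divide each column by its RPK total, in millions
scaling_millions <- colSums(rpk) / 1e6
tpm <- sweep(rpk, 2, scaling_millions, "/")
tpm
```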

Compare both methods - get sum of normalized reads for each column in the RPKM matrix

# Compare both methods - get total sums per column in the RPKM matrix
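
One possible way to get these sums is `colSums()` on the RPKM matrix. The sketch below rebuilds the matrix from assumed illustrative counts so that it runs independently.

```r
# Assumed illustrative data: 4 genes (rows) x 3 replicates (columns)
counts <- matrix(c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3")))
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)  # assumed lengths in kb

# RPKM: depth first (columns), then gene length (rows)
rpkm <- sweep(sweep(counts, 2, colSums(counts) / 1e6, "/"), 1, gene_length_kb, "/")

# Column totals differ from replicate to replicate
rpkm_totals <- colSums(rpkm)
rpkm_totals
```

Because each replicate ends up with a different total, an RPKM value in one column is not directly comparable to the same value in another column as a share of the sample.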

Compare both methods - get sum of normalized reads for each column in the TPM matrix

# Compare both methods - get total sums per column in the TPM matrix
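
The same check on the TPM matrix could look like the sketch below, once more on assumed illustrative counts. By construction every column should add up to one million, which is what makes TPM values comparable across replicates as proportions.

```r
# Assumed illustrative data: 4 genes (rows) x 3 replicates (columns)
counts <- matrix(c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3")))
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)  # assumed lengths in kb

# TPM: gene length first (rows), then depth of the length-normalized values (columns)
rpk <- sweep(counts, 1, gene_length_kb, "/")
tpm <- sweep(rpk, 2, colSums(rpk) / 1e6, "/")

# Column totals are the same for every replicate
tpm_totals <- colSums(tpm)
tpm_totals
```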