Set working directory

# Set working directory
setwd("/Volumes/bioinfomatics$/jurtasun/Courses/CBW2022/LMS_RNASeq/course/exercises")
getwd()
## [1] "/Volumes/bioinfomatics$/jurtasun/Courses/CBW2022/LMS_RNASeq/course/exercises"
  1. Generate fake data

# Normalize read counts to remove biases related to:
# i) Sequencing depth - samples sequenced more deeply have more reads mapping to each gene (hence "per million")
# ii) Gene length - longer genes (measured in kilobases of exon) have more reads mapping to them

# Generate some fake data - 3 replicates and 4 genes, following the example:
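The data for this step can be sketched as below. The count values, the gene names (A to D), the replicate names, and the gene lengths are all illustrative assumptions, not values taken from the course material; `counts` and `gene_length_kb` are hypothetical variable names.

```r
# Hypothetical read counts: rows are genes, columns are replicates.
counts <- matrix(
  c(10, 12, 30,
    20, 25, 60,
     5,  8, 15,
     0,  0,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3"))
)

# Hypothetical gene lengths in kilobases, one per gene (row).
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)

counts
```

Keeping genes in rows and replicates in columns matches the comments in the following steps, where depth is a per-column quantity and gene length a per-row quantity.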
  1. Perform RPKM normalization

# Normalization method 1: RPKM (Reads Per Kilobase of exon per Million mapped reads)

# i) Normalize by read depth (total reads per replicate, i.e. per column)

# ii) Normalize by gene length (one length per gene, i.e. per row)
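A minimal sketch of the two RPKM steps, assuming the same illustrative counts and gene lengths (hypothetical values and names, repeated here so the chunk runs on its own). It uses base R `sweep()` to divide columns and rows by a vector.

```r
# Hypothetical example data (assumed values, not from the course material).
counts <- matrix(
  c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3"))
)
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)

# i) Divide each column by its total read count in millions (reads per million).
depth_millions <- colSums(counts) / 1e6
rpm <- sweep(counts, 2, depth_millions, "/")

# ii) Divide each row by the gene length in kilobases (RPKM).
rpkm <- sweep(rpm, 1, gene_length_kb, "/")

rpkm
```

With such tiny toy counts the "per million" factor produces large numbers; with real libraries of millions of reads the values are on a more familiar scale.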
  1. Perform TPM normalization

# Normalization method 2: TPM (Transcripts Per Million)

# i) Normalize by gene length (one length per gene, i.e. per row)

# ii) Normalize by read depth (total length-normalized reads per replicate, i.e. per column)
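TPM applies the same two divisions in the opposite order, so the depth factor is computed from the length-normalized values rather than from the raw counts. The sketch below assumes the same illustrative data (hypothetical values and names).

```r
# Hypothetical example data (assumed values, not from the course material).
counts <- matrix(
  c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3"))
)
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)

# i) Divide each row by the gene length in kilobases (reads per kilobase).
rpk <- sweep(counts, 1, gene_length_kb, "/")

# ii) Divide each column by its total reads-per-kilobase in millions (TPM).
scale_millions <- colSums(rpk) / 1e6
tpm <- sweep(rpk, 2, scale_millions, "/")

tpm
```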
  1. Compare both methods - get sum of normalized reads for each column in the RPKM matrix
# Compare both methods - get total sums per column in the RPKM matrix
  1. Compare both methods - get sum of normalized reads for each column in the TPM matrix
# Compare both methods - get total sums per column in the TPM matrix
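The comparison can be sketched with `colSums()` on both matrices. The chunk below recomputes RPKM and TPM from the same illustrative data (hypothetical values and names) so that it runs on its own.

```r
# Hypothetical example data (assumed values, not from the course material).
counts <- matrix(
  c(10, 12, 30, 20, 25, 60, 5, 8, 15, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("rep1", "rep2", "rep3"))
)
gene_length_kb <- c(A = 2, B = 4, C = 1, D = 10)

# RPKM: depth first, then gene length.
rpm  <- sweep(counts, 2, colSums(counts) / 1e6, "/")
rpkm <- sweep(rpm, 1, gene_length_kb, "/")

# TPM: gene length first, then depth of the length-normalized values.
rpk <- sweep(counts, 1, gene_length_kb, "/")
tpm <- sweep(rpk, 2, colSums(rpk) / 1e6, "/")

# Column totals: RPKM totals differ between replicates, TPM totals do not.
colSums(rpkm)
colSums(tpm)
```

Every TPM column sums to one million by construction, so a TPM value can be read as a share of the sample and compared across replicates. RPKM column totals depend on the sample, which makes the same comparison less direct.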