Image credit: 'Abacus' ca. 1946: textile, Paul Rand

Lecture slides on iCloud

Homework #2: Introduction to the Tidyverse (~4hrs) - due by start of class, Friday, May 1st.


Now that we’ve aligned our reads, it’s time to talk about the units used to measure gene expression. We’ll compare RPKM and TPM, and see how both units depend on basic properties of your reference file and data, such as transcript length and sequencing depth. We’ll also cover normalization within and between samples. To conclude this class, we’ll fire up RStudio and take a look at our first script.
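To make the RPKM versus TPM comparison concrete, here is a minimal sketch in plain Python (the course scripts themselves are in R). The transcript lengths and read counts are made-up illustrative numbers, and the function names `rpkm` and `tpm` are hypothetical helpers, not part of Kallisto or any package. The sketch uses raw transcript length; Kallisto reports an effective length instead, which this toy example ignores.

```python
# Toy comparison of RPKM and TPM (illustrative numbers, not real data).

def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [
        (c / (l / 1e3)) / (total_reads / 1e6)
        for c, l in zip(counts, lengths_bp)
    ]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]

# Three hypothetical transcripts measured in two hypothetical samples.
lengths_bp = [1000, 2000, 4000]
sample_a = [100, 200, 400]
sample_b = [100, 200, 1600]

rpkm_a, rpkm_b = rpkm(sample_a, lengths_bp), rpkm(sample_b, lengths_bp)
tpm_a, tpm_b = tpm(sample_a, lengths_bp), tpm(sample_b, lengths_bp)

# TPM values sum to one million in every sample; RPKM totals differ by sample.
print("RPKM totals:", round(sum(rpkm_a)), round(sum(rpkm_b)))
print("TPM totals: ", round(sum(tpm_a)), round(sum(tpm_b)))
```

Because the order of the two normalization steps differs, the TPM values in every sample add up to the same total, while the RPKM totals change from sample to sample. That is the sense in which TPM values can be read as proportions within a sample and RPKM values cannot.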

Learning objectives

  • Review steps from last class (using Kallisto).
  • Discuss output from Kallisto, units of measurement for RNAseq, and ‘normalization’.
  • Start an RStudio Project directory that we’ll use for the rest of the course.
  • Open and discuss our first script, including installation of packages.


Step 1 script

Lecture video

Part 1 - Measuring digital gene expression

Part 2 - Starting our R project and step 1 script


The RNA-seq abundance zoo - lab post by Rob Patro (developer of Sailfish and Salmon software) that describes units for RNAseq and has a nice description of ‘effective length’ for transcripts.

What the FPKM? - lab post by Harold Pimentel discussing within-sample normalization and the meaning of RNAseq expression units.

Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory in Biosciences, Dec 2012

Between sample normalization in RNAseq - another great lab post from Harold Pimentel on between-sample normalization.