Overview
In this class, we’ll finally get down to the business of using Kallisto for memory-efficient mapping of raw reads to a reference transcriptome. You’ll carry out this mapping in class, right on your laptop, while we discuss what’s happening ‘under the hood’ with Kallisto and how this compares to more traditional alignment methods. You’ll be introduced to using command line software and will learn about automation and reproducibility through shell scripts.
Learning objectives
- Discuss the course dataset.
- Download and examine a reference transcriptome from Ensembl.
- Use Kallisto to construct an index from this reference file.
- Use Kallisto to map our raw reads to this index.
- Discuss how an index is built and how it facilitates read alignment.
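To make the last objective a little more concrete, here is a toy sketch of how a k-mer index can speed up read mapping. It is not Kallisto's implementation: Kallisto builds a de Bruijn graph over the transcriptome and works with equivalence classes, whereas this sketch uses a plain dictionary, ignores reverse complements, and uses made-up transcript names and sequences (`tx1`, `tx2`). The function names `build_kmer_index` and `pseudoalign` are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts that contain every k-mer of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        # Intersect the hit sets: a transcript must match all k-mers seen so far.
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            break  # no transcript is compatible, so stop early
    return compatible or set()


# Hypothetical toy transcriptome: two transcripts differing at one base.
transcripts = {
    "tx1": "ATGGCGTACGTTAGC",
    "tx2": "ATGGCGTACGATAGC",
}
index = build_kmer_index(transcripts, k=5)

print(sorted(pseudoalign("GGCGTAC", index, k=5)))  # read from the shared region
print(sorted(pseudoalign("ACGTTAG", index, k=5)))  # read spanning the differing base
```

The first read comes from a region shared by both transcripts, so it is compatible with both. The second read covers the base that differs, so only `tx1` remains. Note that no base-by-base alignment is computed: the index only answers "which transcripts could this read have come from?"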
If you’re new to R
Please take time to work through this Learn R! module.
Lecture videos
Part 1 - Step-by-step walkthrough of using FastQC and FastP on your raw sequence data
Part 2 - Read mapping with Kallisto, and summarizing outputs with MultiQC
Part 3 - A discussion of traditional and alignment-free (pseudoalignment) methods for quantifying gene expression
Reading
Papers and lab posts on Kallisto
2016 Nature Biotech paper from Lior Pachter’s lab describing Kallisto
2017 Nature Methods paper from Lior Pachter’s lab describing Sleuth
Lior Pachter’s lab post on Kallisto
Lab post on pseudoalignments - helps explain how Kallisto maps reads to transcripts
Did you notice that Kallisto is using ‘Expectation Maximization (EM)’ during the alignment? You can read more about what this is here
Kallisto discussions/questions and Kallisto announcements are available on Google groups
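For a rough feel of what the Expectation Maximization step mentioned in the reading above is doing, here is a minimal sketch. It assumes all transcripts have the same length (Kallisto's model accounts for effective transcript length, which is omitted here), and the function name `em_abundances` and the read counts are invented for illustration. Each equivalence class is a set of transcripts that a group of reads is compatible with.

```python
def em_abundances(eq_classes, n_transcripts, n_iter=100):
    """Estimate relative transcript abundances from equivalence-class counts.

    eq_classes: list of (transcript_indices, read_count) pairs, where
    transcript_indices is a tuple of transcripts the reads are compatible with.
    Assumes positive read counts and equal transcript lengths.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # start from uniform abundances
    total = sum(count for _, count in eq_classes)
    for _ in range(n_iter):
        assigned = [0.0] * n_transcripts
        # E-step: split each class's reads among its transcripts
        # in proportion to the current abundance estimates.
        for members, count in eq_classes:
            denom = sum(theta[t] for t in members)
            for t in members:
                assigned[t] += count * theta[t] / denom
        # M-step: new abundances are the fractions of reads assigned.
        theta = [a / total for a in assigned]
    return theta


# Hypothetical counts: 30 reads unique to transcript 0, 10 unique to
# transcript 1, and 60 ambiguous reads compatible with both.
eq_classes = [((0,), 30), ((0, 1), 60), ((1,), 10)]
theta = em_abundances(eq_classes, n_transcripts=2)
print([round(x, 3) for x in theta])  # [0.75, 0.25]
```

The 60 ambiguous reads end up split 3:1, matching the ratio of uniquely mapped reads, which is the intuition behind using EM to resolve multi-mapping reads.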
General info about ultra lightweight methods for transcript quantification
2014 Nature Biotech paper - describes Sailfish, which implemented the first lightweight method for quantifying transcript expression.
Not quite alignments - Rob Patro, the first author of the Sailfish paper, wrote a nice lab post comparing and contrasting alignment-free methods used by Sailfish, Salmon and Kallisto.
2017 Nature Methods paper describing Salmon - A lightweight alignment tool from Rob Patro and Carl Kingsford. Check out the website too.
2011 Nature Biotechnology - Great primer to better understand what a de Bruijn graph is.
Greg Grant’s recent paper comparing different aligners. This should be a helpful guide in choosing alignment software outside of what we used in class.
Other videos
Harold Pimentel’s talk on alignment (20 min)
Lior Pachter’s talk at CSHL (45 min)