Lists of differentially expressed genes (DEGs) often contain distinct patterns, or modules, of genes that are coordinately regulated across treatments or conditions, and these patterns can provide powerful insight into the underlying biology. In this class you’ll use correlation-based clustering methods and heatmap visualization to interrogate your DEGs and reveal modules of co-regulated genes.
- Be able to interpret and construct heatmaps
- Understand color choice in R
- Understand how clustering methods are used to identify coordinately expressed genes (a.k.a. modules)
- Learn to use the command-line Clust program
Intro to clustering and starting the Step 6 script
Making and interpreting heatmaps
Alternative clustering methods
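To make "correlation-based clustering" concrete, here is a minimal sketch of the idea. It is written in plain Python only so that it runs without any packages; it is not the Step 6 script from class. The gene names, expression values, and the choice of average linkage are all made up for illustration. Genes are compared by 1 minus their Pearson correlation, so two genes that rise and fall together across samples are "close" even if their absolute expression levels differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_distance(x, y):
    """0 = perfectly co-regulated, 2 = perfectly anti-correlated."""
    return 1.0 - pearson(x, y)


def cluster_genes(expr, n_modules):
    """Agglomerative clustering with average linkage (an assumption
    made for this sketch) on correlation distance."""
    clusters = [[gene] for gene in expr]

    def linkage(c1, c2):
        dists = [correlation_distance(expr[a], expr[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_modules:
        # Find the pair of clusters with the smallest average distance.
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: 4 genes x 4 samples (2 control, 2 treated).
expression = {
    "geneA": [1.0, 1.2, 5.0, 5.5],  # up in treated
    "geneB": [2.0, 2.1, 8.0, 8.3],  # up in treated, higher baseline
    "geneC": [6.0, 6.2, 1.0, 1.1],  # down in treated
    "geneD": [9.0, 8.5, 3.0, 2.8],  # down in treated
}

modules = cluster_genes(expression, n_modules=2)
print(modules)
```

With this toy input, the two "up" genes end up in one module and the two "down" genes in the other, even though geneA and geneB sit at quite different expression levels. That is the main reason to cluster on correlation rather than on raw distance between expression values.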
If you want to try Clust on your own, you’ll need to install the program first (see the GitHub page for instructions). You can test it out using the Schisto hackdash dataset to reproduce what I showed you in class. There are two files you’ll need for this: 1) this text file of DEGs in female LE-strain worms; and 2) a reps file that maps the columns (samples) to groups (conditions). You’ll need to read these two files into your R environment before running Clust. You can view an example of the output here.
Colors in R - Color palettes are an often underappreciated aspect of making beautiful and informative plots in R. You can access a suite of color palettes using the RColorBrewer package. These palettes can be viewed in this cheatsheet. Unfortunately, these standard palettes often don’t cut it, and you’ll need custom palettes. For this, I love using Sip to pick, organize and access color palettes.
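A custom heatmap palette is usually just a handful of anchor colors stretched into a smooth ramp. The sketch below shows that interpolation in plain Python, as an illustration of the idea rather than a replacement for R's own tools. The function names and the blue-white-red anchors are hypothetical, and the interpolation is simple linear blending in RGB.

```python
def hex_to_rgb(hex_color):
    """'#FF8000' -> (255, 128, 0)"""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """(255, 128, 0) -> '#FF8000'"""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def color_ramp(anchors, n):
    """Return n hex colors linearly interpolated (in RGB) through the
    anchor colors, keeping the first and last anchors as endpoints."""
    if len(anchors) < 2 or n < 2:
        raise ValueError("need at least two anchor colors and n >= 2")
    rgbs = [hex_to_rgb(a) for a in anchors]
    ramp = []
    for k in range(n):
        # Position of color k along the anchor sequence, from 0 to len - 1.
        pos = k * (len(rgbs) - 1) / (n - 1)
        i = min(int(pos), len(rgbs) - 2)  # index of the left-hand anchor
        frac = pos - i
        left, right = rgbs[i], rgbs[i + 1]
        ramp.append(rgb_to_hex(tuple(
            round(lo + (hi - lo) * frac) for lo, hi in zip(left, right)
        )))
    return ramp


# Hypothetical diverging palette: blue (low), white (middle), red (high).
palette = color_ramp(["#0000FF", "#FFFFFF", "#FF0000"], 5)
print(palette)
```

For a diverging palette like this one, use an odd number of colors so the middle anchor (white here) lands exactly on the center of the scale. That keeps "no change" visually neutral in a heatmap of scaled expression values.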