Image credit: Daniel Horowitz for NPR

Lecture slides - Although the entire class was spent working on the Step 2 script, I briefly referenced a few slides, which I’ve included here.

Homework: Introduction to the Tidyverse (~4hrs) - due before the start of class on March 6th.

Overview

We’ll begin this class by filtering and normalizing our data, all while using the ggplot2 graphing package to visualize the impact these changes have on our data. You’ll also be introduced to Hadley Wickham’s philosophy of ‘tidy data’ by using the dplyr package, along with other tools from his tidyverse of R packages, to take control of our gene expression dataframes. These tools let us change, sort, filter, arrange and summarize large data sets quickly and easily using simple commands in R.
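
To give a feel for the dplyr verbs mentioned above, here is a minimal sketch on a toy table. The gene names, sample names and expression values are all made up for illustration, and this is not the Step 2 script itself.

```r
library(dplyr)

# Toy expression table (made-up values); one row per gene, one column per sample
expr <- tibble(
  geneID   = c("geneA", "geneB", "geneC", "geneD"),
  healthy1 = c(5.1, 0.2, 8.3, 2.0),
  healthy2 = c(4.9, 0.1, 8.1, 2.4),
  disease1 = c(9.8, 0.3, 8.0, 0.5),
  disease2 = c(10.2, 0.2, 8.4, 0.7)
)

expr_summary <- expr %>%
  # change: add group averages and their difference as new columns
  mutate(
    healthy_avg = (healthy1 + healthy2) / 2,
    disease_avg = (disease1 + disease2) / 2,
    diff        = disease_avg - healthy_avg
  ) %>%
  # filter: drop genes that are low in both groups (cutoff of 1 is arbitrary)
  filter(healthy_avg > 1 | disease_avg > 1) %>%
  # sort: largest increase in disease first
  arrange(desc(diff)) %>%
  # keep only the columns we want to look at
  select(geneID, healthy_avg, disease_avg, diff)

print(expr_summary)
```

Each verb takes a dataframe and returns a dataframe, which is why the steps chain together so cleanly with the pipe.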

Goals

  • Start/finish step 2 script
  • Begin to explore our data in R using ggplot2
  • Discuss basics of Tidy data and the reshape2 package
  • Talk briefly about color palettes in R
  • Filter data to remove lowly expressed genes
  • Normalize data
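
The last two goals can be sketched roughly as follows. This assumes the Bioconductor package edgeR is installed and uses a simulated count matrix. The cutoffs (more than 1 count per million in at least 3 samples) are illustrative choices, not necessarily the ones used in the Step 2 script.

```r
library(edgeR)  # Bioconductor package; assumed to be installed

set.seed(42)
# Simulated count matrix (made-up data): 1000 genes x 6 samples,
# with a quarter of the genes expressed at very low levels
counts <- matrix(
  rnbinom(6000, mu = rep(c(0.5, 20, 200, 2000), each = 250), size = 2),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:6))
)

dge <- DGEList(counts = counts)

# Filter: keep genes with > 1 count per million (CPM) in at least 3 samples
cpm_raw <- cpm(dge)
keep <- rowSums(cpm_raw > 1) >= 3
dge_filtered <- dge[keep, , keep.lib.sizes = FALSE]

# Normalize: compute TMM scaling factors, then convert to log2 CPM
dge_norm <- calcNormFactors(dge_filtered, method = "TMM")
log2_cpm_norm <- cpm(dge_norm, log = TRUE)

cat("Genes before filtering:", nrow(dge$counts), "\n")
cat("Genes after filtering: ", nrow(dge_filtered$counts), "\n")
print(dge_norm$samples$norm.factors)
```

Plotting the log2 CPM values before and after each step is a good way to see what the filtering and normalization actually did.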

Code

Step 2 script

Software

Sip - I discussed this simple but excellent color picking software in class. The program costs about $10, but provides a convenient and powerful way to assemble color palettes for use in R.

Reading

Tidy Data - Hadley Wickham (author of Tidyverse packages and Chief Scientist at RStudio) describes the philosophy of tidy data in this paper.

Grammar of graphics - Another paper by Hadley Wickham. This one explains the rationale behind ggplot2.

Original TMM normalization manuscript - The paper that introduced the trimmed mean of M-values (TMM) normalization method.

Catalog of R graphs - Take a look at some of the various ways to graph your data, along with the underlying R code, in this catalog.

R Graphics Cookbook - If you end up using R to make a lot of graphs, you will find this book to be an important reference. It’s available free to UPenn folks as an Ebook.

Color palettes are an often underappreciated aspect of making beautiful and informative plots in R. You can access a suite of color palettes using the RColorBrewer package. These palettes can be viewed in this cheatsheet. Unfortunately, these standard palettes often don’t cut it, and you’ll need custom palettes. For this, I love using Sip to pick, organize and access color palettes.
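
As a rough illustration of both approaches, the sketch below pulls a standard palette from RColorBrewer and then applies a custom palette in ggplot2. The hex codes, group names and expression values are hypothetical placeholders; in practice you would paste in the colors you picked.

```r
library(RColorBrewer)
library(ggplot2)

# A standard palette: three colors from the "Dark2" set
brewer_cols <- brewer.pal(n = 3, name = "Dark2")

# A custom palette: hex codes here are placeholders for colors you picked yourself
custom_cols <- c(healthy = "#2A9D8F", disease = "#E76F51")

# Toy data (made-up values) just to have something to plot
toy <- data.frame(
  group      = rep(c("healthy", "disease"), each = 5),
  expression = c(5.1, 4.9, 5.3, 5.0, 4.8, 9.8, 10.2, 9.5, 10.0, 9.9)
)

p <- ggplot(toy, aes(x = group, y = expression, fill = group)) +
  geom_boxplot() +
  # map each group name to its color; swap in brewer_cols to try a standard palette
  scale_fill_manual(values = custom_cols) +
  theme_bw()

print(p)
```

Naming the elements of the color vector ties each color to a specific group, so the colors stay consistent even if the order of the groups changes between plots.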