Image credit: 'Connections' by Marsha Glickman

Lecture slides

Homework: Reporting with Rmarkdown (~4hrs) - due April 24th


Now that you’ve identified differentially expressed genes, what do they mean, and how do you begin to elucidate the biological pathways governed by these genes? Toward this end, you will learn how to carry out functional enrichment analyses using Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA). You’ll also explore different options for presenting your functional enrichment results graphically.


  • carry out Gene Ontology (GO) enrichment analysis
  • carry out a GSEA enrichment analysis
  • understand the difference between GO and GSEA
  • understand the MSigDB resource


Step 7 script


The what, where, how and why of gene ontology – a primer for bioinformaticians. Briefings in Bioinformatics, Feb 2011

A nice blog post on the hypergeometric test and Fisher’s exact test - these statistical tests are at the core of many functional enrichment approaches.
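To make the idea concrete, here is a minimal sketch of the over-representation calculation that GO-style enrichment tools build on. It is written in plain Python with only the standard library rather than the R used in the course, and the function name, argument names, and gene counts are all hypothetical, chosen purely for illustration. It is not how any particular package implements the test.

```python
from math import comb


def hypergeom_enrichment_p(universe, in_set, selected, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    universe: total number of genes tested (the background)
    in_set:   background genes annotated to the gene set / GO term
    selected: number of differentially expressed genes
    overlap:  differentially expressed genes that are also in the set
    """
    max_overlap = min(in_set, selected)
    total = comb(universe, selected)  # all ways to pick the selected genes
    tail = sum(
        comb(in_set, k) * comb(universe - in_set, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes in the background, 5 annotated to a term,
# 4 differentially expressed, 2 of which carry the annotation.
p_value = hypergeom_enrichment_p(universe=20, in_set=5, selected=4, overlap=2)
print(f"p = {p_value:.4f}")
```

This upper-tail probability is the same quantity reported by a one-sided Fisher’s exact test on the corresponding 2x2 table, which is why the two tests are usually discussed together. In practice you would also correct for testing many gene sets at once.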

The original 2003 Nat. Genetics paper describing Gene Set Enrichment Analysis (GSEA), and the 2005 PNAS paper that formally detailed its usage.

Not a fan of using the Broad Inst. GSEA program? You can carry out self-contained and competitive GSEA in R using ROAST and CAMERA, respectively.

Gene Set VARIATION Analysis (GSVA) - I find GSVA useful for producing GSEA-type results across a heterogeneous dataset (like a large cohort of patients).

The Molecular Signatures Database (MSigDB)

2016 Immunity paper describing the creation of the ‘Immunological Signatures’ collection (C7).

Mouse- and human-specific gene signature databases

GSEA file formats page.

You know how I feel about Venn diagrams, so if you’re interested in exploring interactions between many groups of genes, have a look at this Nature Methods paper, the accompanying R package, UpSetR, as well as the UpSet website. Note that there’s a Shiny app for this as well!

Here are my DataGraph templates for making enrichment bubble plots and GSEA enrichment plots.