Functional enrichment analysis

Image credit: 'Connections' by Marsha Glickman

Overview

Now that you’ve identified differentially expressed genes, what do they mean, and how do you begin to elucidate the biological pathways these genes govern? To answer these questions, you will learn how to carry out functional enrichment analyses using Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA). We’ll also explore different options for presenting your functional enrichment results graphically.

Learning objectives

Carry out Gene Ontology (GO) enrichment analysis using modules identified in the previous script
Carry out a Gene Set Enrichment Analysis (GSEA) using our full dataset
Understand the differences between GO and GSEA
Understand the MSigDB resource and how to access signature collections

Code

Step 7 script

Lecture videos

Part 1 - Introduction to enrichment analysis methods

Part 2 - GO enrichment using gprofiler and our modules

Part 3 - Carrying out GSEA and plotting the results in R

Part 4 - Competitive vs self-contained GSEA, and exploring gene set variation analysis (GSVA)

Reading

The what, where, how and why of gene ontology – a primer for bioinformaticians. Briefings in Bioinformatics, Feb 2011

A nice lab post on the hypergeometric test and Fisher’s exact test - these statistical tests are at the core of many functional enrichment approaches.
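To make the link between these tests and enrichment analysis concrete, here is a minimal sketch in plain Python (not the course's R script) of the over-representation calculation: the upper tail of the hypergeometric distribution, which is the same quantity as a one-sided Fisher's exact test on the corresponding 2x2 table. The function name, argument names, and the numbers in the example are all hypothetical, chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, n_selected, n_overlap):
    """Probability of seeing at least `n_overlap` gene-set members when
    `n_selected` genes are drawn without replacement from a universe of
    `universe_size` genes, `set_size` of which belong to the gene set."""
    max_overlap = min(set_size, n_selected)
    total = comb(universe_size, n_selected)
    # Sum the hypergeometric probabilities for every overlap >= n_overlap.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes measured, 200 annotated to a GO term,
# 500 differentially expressed genes, 15 of which carry the annotation.
# By chance we would expect about 500 * 200 / 20000 = 5 annotated genes.
p_value = hypergeom_enrichment_p(20000, 200, 500, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

In a real analysis this test is repeated for every gene set, so the resulting p-values need multiple-testing correction, and the choice of gene universe (for example, all genes detected in the experiment) has a strong effect on the result.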

Analyzing gene expression data in terms of gene sets: methodological issues - A seminal paper on the statistics of enrichment analysis in gene expression studies.

Toward a gold standard for benchmarking gene set enrichment analysis - An excellent and recent benchmarking study for enrichment tools.

The original 2003 Nat. Genet. paper describing Gene Set Enrichment Analysis (GSEA), and the 2005 PNAS paper that formally detailed its usage.
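As a rough illustration of the running-sum statistic formalized in the 2005 PNAS paper, the sketch below walks down a ranked gene list, stepping up at each gene in the set (weighted by the absolute ranking score raised to a power p) and stepping down at each gene outside it; the enrichment score is the largest deviation from zero. This is a simplified plain-Python sketch, not the GSEA software or the course's R code. The function name, the toy genes, and the scores are hypothetical, and the permutation step used to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    `ranked_genes` must already be sorted by decreasing `scores`
    (for example, a signed differential expression statistic).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    # Hits are weighted by |score|**p; p=0 gives the unweighted statistic.
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all hits have zero weight")

    running = 0.0
    best = 0.0
    for weight, hit in zip(hit_weights, in_set):
        if hit:
            running += weight / total_hit_weight
        else:
            running -= 1.0 / n_miss
        # Keep the signed value with the largest absolute deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy example with hypothetical genes and scores.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, stats, {"g1", "g2"})
es_bottom = enrichment_score(genes, stats, {"g5", "g6"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one, which is why GSEA can use the full ranked dataset rather than a thresholded gene list.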

You can carry out self-contained and competitive GSEA in R using ROAST and CAMERA, respectively.

Gene Set VARIATION Analysis (GSVA) - I find GSVA useful for producing GSEA-type results across a heterogeneous dataset (like a large cohort of patients).

The Molecular Signatures Database (MSigDB)

2016 Immunity paper describing the creation of the ‘Immunological Signatures’ collection (C7).

You know how I feel about Venn diagrams, so if you’re interested in exploring interactions between many groups of genes, have a look at this Nature Methods paper, the accompanying R package, UpSetR, as well as the UpSet website. Note that there’s a Shiny app for this as well!