DIY.transcriptomics

Data labs

All labs are held in person on the campus of the University of Pennsylvania and use real datasets from infectious disease research to hone and expand your computational skills. Virtual options are available through the course Discord page.

How do labs work for this class?

precourse material

Learn more about our in-person labs, how you can participate in person or virtually, and what you'll learn along the way.

Software installation and IT support

Lab 1 (optional) • January 24, 2024

Installing software can be a real headache, so let us help you! This lab will be focused on helping you with IT support and getting to know the software tools that we'll be using throughout the course.

Creating and using shell scripts and loops

Lab 2 (required) • January 31, 2024

Using command-line tools often requires that you run similar code for each of your samples (e.g. read mapping). In this lab, you'll learn how to automate this repetitive process with shell scripts and loops written in a simple code-aware text editor, making it possible for you to get work done even when you're not sitting in front of your computer. How great is that?!
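To give a feel for the idea before the lab, here is a minimal sketch of "one command per sample", written in Python rather than shell. It only builds the command strings and does not run anything. The tool name `map_reads`, its flags, the index name, and the sample file names are all hypothetical placeholders, not the software used in the course.

```python
from pathlib import Path


def build_mapping_commands(fastq_files, index="reference_index"):
    """Build one command string per input file (nothing is executed here)."""
    commands = []
    for fastq in fastq_files:
        # Derive a sample name from the file name, e.g. "sampleA.fastq.gz" -> "sampleA"
        sample = Path(fastq).name.replace(".fastq.gz", "")
        # `map_reads` and its flags are hypothetical placeholders for a real mapper
        commands.append(f"map_reads --index {index} --out {sample}.out {fastq}")
    return commands


if __name__ == "__main__":
    # Hypothetical sample files, used only to show the loop at work
    for command in build_mapping_commands(["data/sampleA.fastq.gz", "data/sampleB.fastq.gz"]):
        print(command)
```

The point is the pattern, not the tool: the loop body is written once and the list of samples does the rest. A shell `for` loop follows the same shape.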

Project management made easy with Git and GitHub

Lab 3 (required) • February 7, 2024

Your working directory is already starting to get messy, and the proliferation of files and file types will only continue throughout the course. It's time to discuss best practices for managing an active coding project using the version control system Git and the related web resource GitHub.

Annotating gene expression data

Lab 4 (required) • February 14, 2024

At some point we all have to wrestle with gene annotations – that is, all the stuff we can label a gene with. In this lab, you'll learn to access a world of gene-centric annotation data and will practice on gene expression data from non-model organisms.

Dumpster diving in RNA-seq data

Lab 5 (required) • February 21, 2024

What about those reads that didn't map to the human reference? In this lab you'll learn to make the most of your RNA-seq data by digging through these 'junk' unmapped reads. It turns out that most RNA-seq studies are 'metatranscriptomes'.

Making sense of multivariate data

Lab 6 (required) • February 28, 2024

Explore a large, multivariate dataset generated from the helminth parasite Schistosoma mansoni, an important pathogen of humans. You'll use dimensionality reduction to understand how factors like sex, developmental stage, genetic strain and drug treatment contribute to differences in gene expression.
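As a rough illustration of what this kind of analysis does, the sketch below computes the first principal component of a tiny expression table in plain Python, using power iteration on the gene-by-gene covariance matrix. This is an assumption-laden toy, not the method or dataset used in the lab: the sample names and expression values are made up, and real analyses work on thousands of genes with dedicated libraries.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit-length PC1 vector, per-sample scores) for a samples x genes table."""
    n_samples, n_genes = len(rows), len(rows[0])
    # Center each gene (column) on its mean across samples
    means = [sum(row[j] for row in rows) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in rows]
    # Gene x gene sample covariance matrix
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n_samples - 1) for b in range(n_genes)]
        for a in range(n_genes)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and renormalize
    vec = [1.0] * n_genes
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_genes)) for a in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Project each centered sample onto PC1
    scores = [sum(c[j] * vec[j] for j in range(n_genes)) for c in centered]
    return vec, scores


# Made-up toy data: four hypothetical samples, three hypothetical genes
toy_samples = {
    "male_1": [10.0, 2.0, 5.0],
    "male_2": [11.0, 1.0, 5.5],
    "female_1": [2.0, 9.0, 5.0],
    "female_2": [1.0, 10.0, 4.5],
}

if __name__ == "__main__":
    pc1, scores = first_principal_component(list(toy_samples.values()))
    for name, score in zip(toy_samples, scores):
        print(name, round(score, 2))
```

In this toy table the two groups land on opposite sides of zero along the first component, which is the kind of separation you would look for when asking whether a factor such as sex drives variation in expression.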

Tracking the global animal trade

Lab 7 (required) • March 13, 2024

Think you know how to wrangle and plot data? We'll put your skills to the test using a dataframe with millions of rows. To illustrate the general utility of the methods you've learned thus far, we'll explore the import of animals and animal products into US port cities over a 14-year period.

Using generative AI to supercharge your coding

Lab 8 (required) • March 20, 2024

Artificial intelligence is revolutionizing how we interact with code. In this lab, we'll review the solution for the last lab ('Global Animal Trade'), but will use AI to guide us. I'll demonstrate how you can use the AI 'pair programmer' called GitHub Copilot to start new coding projects much more rapidly and seamlessly.

The COVID-19 Collaborative Challenge

Lab 9 (required) • March 27, 2024

Explore one of the first and largest transcriptomic studies of SARS-CoV-2. You'll start by parsing the study metadata to identify a question you're interested in, then formulate a hypothesis and carry out an analysis of the data to test it.

Spring Break!

Lab 10 • April 3, 2024

No lab this week. Hope you find some time to relax and recharge over the break!

The enrichment lab

Lab 11 (required) • April 10, 2024

In last week's lecture, you learned to use functional enrichment tools like GO and GSEA to identify themes in your RNA-seq data. In this lab, we'll put these important skills to the test!

The reproducibility lab

Lab 12 (required) • April 17, 2024

In this lab you'll create a custom R function that automates many of the tasks and visualizations that we've worked with throughout the course.

Open lab

Lab 13 (optional) • April 24, 2024

Struggling with something we covered recently in class, or do you want to discuss some of your own RNA-seq data? Then drop in for hands-on help from one of our amazing Teaching Assistants!

The brain during infection

Lab 14 (required) • May 1, 2024

At this point, you've learned the basics of processing and analyzing scRNA-seq data. In this lab, we'll put those skills to the test on data from mouse brains infected with the fascinating parasite Toxoplasma gondii.

A single-cell perspective of hematopoiesis

Lab 15 (required) • May 8, 2024

To conclude the course, you'll try your hand at analyzing multi-omic single-cell RNA-seq/ATAC-seq data to understand cell lineage commitment during hematopoiesis.