DIY.transcriptomics

Data labs

All labs are held in person on the campus of the University of Pennsylvania and use real datasets from infectious disease research to hone and expand your computational skills. Virtual options are available through the course Discord page.

How do labs work for this class?

precourse material (required)

Learn more about our in-person labs, how you can participate in person or virtually, and what you'll learn along the way.

Creating and using shell scripts and loops

Lab 1 • January 27, 2027

Using command-line tools often requires that you run similar code for each of your samples (e.g., read mapping). In this lab, you'll learn how to automate this repetitive process by writing shell scripts and loops in a simple code-aware text editor, making it possible for you to get work done even when you're not sitting in front of your computer. How great is that?!
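To make the idea of "one command per sample" concrete, here is a minimal sketch of the same loop pattern, written in Python rather than the shell used in lab. The file names and the tool name `my_mapper` (with its `--in` and `--out` options) are hypothetical placeholders, not a real program; the sketch only prints the commands (a dry run) rather than executing them.

```python
import shlex
from pathlib import Path


def build_commands(fastq_files, template):
    """Return one shell command string per input file.

    `template` is a format string with {sample} and {fastq} placeholders.
    """
    commands = []
    for fastq in sorted(fastq_files):
        # Sample name = file name up to the first dot,
        # e.g. "HS01.fastq.gz" -> "HS01"
        sample = Path(fastq).name.split(".")[0]
        commands.append(
            template.format(sample=shlex.quote(sample), fastq=shlex.quote(fastq))
        )
    return commands


# Hypothetical inputs: made-up file names and a placeholder tool ("my_mapper").
fastqs = ["reads/HS02.fastq.gz", "reads/HS01.fastq.gz"]
template = "my_mapper --in {fastq} --out {sample}.out"

commands = build_commands(fastqs, template)
for cmd in commands:
    print(cmd)  # dry run: print each command instead of executing it
```

Printing first and executing later is a deliberate choice: it lets you check every generated command before committing compute time to a long run.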

Project management made easy with Git and GitHub

Lab 2 • February 3, 2027

Your working directory is already starting to get messy, and the proliferation of files and file types will only continue throughout the course. It's time to discuss best practices for managing an active coding project using the version control system Git and the related web resource GitHub.

Annotating gene expression data

Lab 3 • February 10, 2027

At some point we all have to wrestle with gene annotations – that is, all the stuff we can label a gene with. In this lab, you'll learn to access a world of gene-centric annotation data and will practice on gene expression data from non-model organisms.

Dumpster diving in RNA-seq data

Lab 4 • February 17, 2027

What about those reads that didn't map to the human reference? In this lab, you'll learn to make the most of your RNA-seq data by digging through these 'junk' unmapped reads. It turns out that most RNA-seq studies are 'metatranscriptomes'.
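As a rough illustration of what "unmapped" means in practice, the sketch below assumes your alignments are in SAM text format, where bit 0x4 of the FLAG field marks a read as unmapped. The records are made up for the example, and this is not necessarily the workflow used in lab, where dedicated tools would normally do this step.

```python
def unmapped_read_names(sam_lines):
    """Return the names of reads whose SAM FLAG has the 0x4 (unmapped) bit set."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x4:
            names.append(fields[0])  # column 1 is the read name
    return names


# Made-up SAM records for illustration only.
sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "read3\t16\tchr2\t5\t60\t4M\t*\t0\t0\tGGCC\tIIII",
]

unmapped = unmapped_read_names(sam)
print(unmapped)  # ['read2']
```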

Making sense of multivariate data

Lab 5 • February 24, 2027

Explore a large, multivariate dataset generated from the helminth parasite Schistosoma mansoni, an important pathogen of humans. You'll use dimensionality reduction to understand how factors like sex, developmental stage, genetic strain, and drug treatment contribute to differences in gene expression.

Using generative and agentic AI to supercharge your data analysis projects

Lab 6 • March 3, 2027

Artificial intelligence has revolutionized how we interact with code and, increasingly, with bioinformatics tools and data analyses. In this lab, you'll learn how to use the AI 'pair programmer' GitHub Copilot to start new coding projects far more rapidly and seamlessly. In the second half of lab, we'll use the command-line interface for Google's Gemini v2.5 to see whether agentic AI can carry out a complete RNA-seq analysis using only a detailed prompt.

No lab this week

University Spring Break • March 10, 2027

Relax and enjoy the time off. We'll see you back here next week!

Group project 2 - A single-cell view of the intestine

Lab 10/11/12 • April 7, 14, and 21, 2027

At this point, you've learned the basics of processing and analyzing scRNA-seq data. In this lab, we'll put those skills to the test by exploring GutPath – a recently released single-cell atlas of the small intestine.