What is the goal of this course?

As access to high-throughput sequencing technology increases, the bottleneck in biomedical research has shifted from generating data to analyzing and integrating diverse data types. Addressing these needs requires that students and postdocs equip themselves with a toolkit for data mining and interrogation. This course focuses specifically on studying global gene expression (transcriptomics) through the use of the R programming environment and the Bioconductor suite of software packages – a versatile and robust collection of tools for bioinformatics, statistics, and plotting. During this semester-long course, students participate in a mix of virtual lectures and guided code review, all while working with real infectious disease datasets directly on their own laptops. Students will learn to analyze RNAseq data using a lightweight and reusable set of modular scripts that leverage open-source software. In addition, students will learn best practices in data science for working in R/Bioconductor, including creating interactive data visualizations, making their analyses transparent and reproducible, and identifying experimental bias in large datasets.

Who teaches the class and maintains the website?

Dan Beiting designed and teaches the course. He is an Assistant Professor of Pathobiology at PennVet. Camila Amorim (postdoc), Seble Negatu and Sheridan Littleton (PhD students) are teaching assistants for the course in 2021.

Meet your instructors!

Who supports and sponsors this course?

This course is made possible in part by generous support for TA stipends from the UPenn Institute for Immunology (IFI). In addition, we thank RStudio for allowing access to RStudio Server Pro for this course, and for their continued free access to RStudio desktop, which is a critical resource for academic research in R. We also thank DataCamp for generously providing free and unrestricted access to their online learning content to all students enrolled in the course. Finally, we thank the folks at Code Ocean, who provide all students with convenient access to dockerized resources for transparency and reproducibility.

What is the format of the course?

This class is being run as a ‘hybrid’ class. Lectures are entirely virtual and all lecture videos will be posted to this website. In-class time will be devoted to working through structured labs that focus on building better data science skills using datasets from infectious disease.

What can I expect to learn?

  • Learn to analyze bulk RNAseq and single cell RNA-seq (scRNA-seq) data
  • Develop a lightweight and reusable RNAseq pipeline
  • Learn best practices for working in R/Bioconductor (extensible to other data types)
  • Learn the basics of ‘data science’
  • Learn how to report your analysis and results in a transparent and reproducible way

Who can take the course?

All lectures are freely available, and lab materials will be posted to the website after each lab. In-person attendance for labs is available for graduate students in the Biomedical Graduate Studies (BGS) group at the University of Pennsylvania. Space permitting, the course is open to graduate students outside of BGS. If you are not a graduate student at UPenn, you can still access the lectures, course slides, code, videos and reading material on the site. This course is ideal for students and postdocs who have little or no experience in bioinformatics, and we encourage students to bring their own RNAseq data to the course.

Can I just follow along online?

Yes! All lectures, reading material and code are freely available and are organized on the website by lecture, so you can proceed at your own pace. However, some elements of the course are only available to people who have officially registered and participate in person. These include access to DataCamp for homework and extended learning, participation in our in-person labs, access to our course Code Ocean group, access to our class Slack page for 1:1 help from the instructor and our TAs throughout the course (and with your own data), and last but not least, course credit.

How will I be graded in this course?

All students who officially register for the course through UPenn will receive a letter grade. At this time we are unable to provide grades or proof-of-completion for virtual learners.

Can I cite the course in my publications?

Yes! Please cite our recent open-access publication that describes the course philosophy and teaching strategies.