
This document contains the final project for the “Introduction to RNA-Seq” module, part of the Bioinformatics and Statistics II class taught at LCG-UNAM in February 2022. The original class material by Dr. Leonardo Collado-Torres can be found here.

In this project, we will use the recount3 R package to download RNA-Seq data for all breast invasive carcinoma (BRCA) samples from The Cancer Genome Atlas (TCGA) project. We will then evaluate the quality of the downloaded data, explore the available sample attributes, select a subset of those attributes to build a statistical model, and perform differential expression analysis.
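
As a rough illustration of the download step described above, the sketch below wraps the usual recount3 calls in a helper. The function name `download_brca_rse` is hypothetical (it is not part of the original material), and the filter on `project == "BRCA"` and `file_source == "tcga"` assumes the project table returned by `recount3::available_projects()` labels the TCGA breast cancer study that way. Calling the helper requires an internet connection and downloads a large dataset.

```r
# Hypothetical helper (name chosen for illustration): fetch the TCGA BRCA
# study through recount3 and return a RangedSummarizedExperiment.
download_brca_rse <- function() {
  # Fail early with a clear message if the Bioconductor packages are missing
  for (pkg in c("recount3", "SummarizedExperiment")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package '", pkg, "' is required but not installed.")
    }
  }

  # Table of all human projects available through recount3
  human_projects <- recount3::available_projects()

  # Assumption: the TCGA breast cancer study is listed as project "BRCA"
  # with file_source "tcga"
  proj_info <- subset(
    human_projects,
    project == "BRCA" & file_source == "tcga"
  )
  stopifnot(nrow(proj_info) == 1)

  # Download the data and build the RangedSummarizedExperiment object
  rse <- recount3::create_rse(proj_info)

  # recount3 stores base-pair coverage counts; convert them to read counts
  SummarizedExperiment::assay(rse, "counts") <-
    recount3::compute_read_counts(rse)

  rse
}

# Usage (not run here because it triggers a large download):
# rse_brca <- download_brca_rse()
# dim(rse_brca)
```

Wrapping the calls in a function keeps the download from running accidentally when the file is sourced; the later quality-control and modeling steps would start from the returned object.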



We will need R 4.1.x, which can be installed from CRAN, and the following packages.


- To download data and generate the `RangedSummarizedExperiment` data object (`recount3`)
- To access data on tumor subtype
- To normalize counts
- To plot results
- For differential expression analysis

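
The tasks listed above can be checked against the local R installation with a short script. Only `recount3` is named in this document; the other package names in the sketch (`TCGAbiolinks`, `edgeR`, `ggplot2`, `limma`) are assumptions, chosen as commonly used packages for each task, and may differ from the ones the original analysis loaded.

```r
# Map each task to a candidate package.
# Only recount3 is confirmed by the text; the rest are assumptions.
required_pkgs <- c(
  recount3     = "download data and build the RangedSummarizedExperiment",
  TCGAbiolinks = "access data on tumor subtype (assumed package)",
  edgeR        = "normalize counts (assumed package)",
  ggplot2      = "plot results (assumed package)",
  limma        = "differential expression analysis (assumed package)"
)

# The document asks for R 4.1.x; warn if the running version is older
if (getRversion() < "4.1.0") {
  warning("R ", getRversion(), " detected; this project expects R 4.1.x.")
}

# Check which packages can be loaded, without attaching them
is_installed <- vapply(
  names(required_pkgs),
  requireNamespace,
  logical(1),
  quietly = TRUE
)

missing_pkgs <- names(is_installed)[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # One way to install them (requires the BiocManager package from CRAN):
  # BiocManager::install(missing_pkgs)
} else {
  message("All listed packages are installed.")
}
```

Using `requireNamespace()` rather than `library()` lets the check report every missing package at once instead of stopping at the first failure.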
## Reproducibility information
