Material for the Course "Introduction to genome-wide association studies (GWAS)"
Instructors: Filippo Biscarini, Oscar Gonzalez-Recio, Christian Werner
This course introduces students, researchers and professionals to the steps needed to build an analysis pipeline for Genome-Wide Association Studies (GWAS). It describes the steps involved in a typical GWAS, which are then assembled into a reusable and reproducible bioinformatics pipeline.
Each day the course will start at 14:00 and end at 20:00 (CET). As a general rule, we'll have a longer break (30 minutes) at 16:00 and two shorter breaks (10-15 minutes) later on during the day (to be decided flexibly depending on the sessions).
Day 1
- Lecture 0 General Introduction / Overview of the Course [Filippo, Oscar, Christian]
- Lecture 1 GWAS Overview: Case Studies / Examples from Literature [Oscar]
- Lecture 2 Introduction to GWAS: Linkage Disequilibrium and Linear Regression [Oscar]
- Lab 1 (Demonstration) GWAS: Basic Models (Linear and Logistic Regression) [Oscar]
- Lab 2 Description of Datasets [Christian]
- Course Manual
- GWAS Workflow
Day 2
- Lecture 3 The Multiple Testing Issue [Oscar]
- Lecture 4 Statistical Power, Population Stratification and Experimental Design [Oscar]
- Lecture 5 Initial Data Analysis, Exploratory Data Analysis and Data Pre-Processing [Christian]
- Lab 3 GWAS: a first simple exercise for you! [Christian, Filippo]
Day 3
- Lab 4 Data filtering and mean/median imputation in R [Filippo]
- Lab 5 GWAS: The Stand-Alone Script(s) for the Full Model [Filippo]
- Lecture 6 KNN Imputation
- Lab 6 (Demonstration) KNN Imputation [Filippo]
- knni_illustration.Rmd
- [data_for_KNNI_illustration]
- knni_tidymodels.R
- [02_knni.sh] [support script]
- [hamming.R] [support script]
- [knni.R] [support script]
- Lecture 7 Working in the shell [Christian]
- Lecture 8 Imputation of Missing Genotypes [Christian]
- Lab 7 Imputation of Missing Genotypes using Beagle [Christian]
Day 4
- Lecture 9 Brief Intermission:
- Lab 8 Revising the Steps involved in GWAS [Filippo]
- Lab 9 Introducing the Exercise [Filippo]
- Collaborative Exercise: let's build our own GWAS workflow on new pig (Sus scrofa) data. [Filippo, Oscar, Christian]
- Part 1: Individual/Group Break-Out Sessions to give it a try independently
- Part 2: Whole-Group Revision of the Exercise: step-by-step (1.get_data; 2.filter; 3.imputation; 4.GWAS)
- exercise solutions + tips
- Bonus exercise [Optional] (Parus major data)
Day 5
- Lecture 10 A light Touch on Post-GWAS Analysis: Inferring Functionality [Oscar]
- Lecture 11 GWAS Model Extensions: [Filippo, Christian, Oscar]
- 12.1 GWAS Model Extensions_Dominance_and_other_genotype_Codifications
- 12.2 GWAS Model Extensions_Polyploids
- [12.3 GWAS Model Extensions_Trait_Types]
- [12.4 GWAS Model Extensions_Multi-Trait-Locus, software]
- 12.5 A bioinformatic pipeline for GWAS
- 12.6 Additional software for GWAS
- [R code GWASpoly (vignette)]
- R code GWAS for categorical Traits
- R code GWAS for categorical Traits - Examples
- [R code GWAS for longitudinal Traits]
- [R code GWAS for multi-trait and multi-locus Models]
- [Snakemake pipeline for continuous phenotypes]
- Lecture 12 Optional sessions [Filippo, Christian, Oscar]
- Final Quiz on what we learned about GWAS! [Filippo, Oscar, Christian]
- Conclusions and Wrap-Up Discussion on GWAS [Filippo, Oscar, Christian]
- the GWAS workflow in R
- preparatory_steps: download and prepare the data
- preprocessing: filter the data
- imputation: imputing missing genotypes
- gwas: run the GWAS models
- power_and_significance: designing GWAS experiments
- steps: identifying the individual steps involved in a GWAS study
- pipeline: assembling the individual steps into a bioinformatics pipeline for GWAS
- collaborative exercise: trying out what we learnt on new data