Design and analysis of quantitative proteomics experiments: statistical methods and case studies with MSstats

Short Course, FEBS 2023 European Summer School on Advanced Proteomics, 2023

https://advancedproteomics2023.febsevents.org/

This short course included a series of lectures and workshops related to the statistical analysis of mass spectrometry-based proteomics experiments.

Lecture

The lecture introduced the basic principles of statistical design and analysis of quantitative proteomics experiments that aim to detect differentially abundant proteins. We discussed the importance of randomization, replication, and blocking. We then covered basic types of statistical analysis, such as hypothesis testing with the t-test (null and alternative hypotheses, p-values, statistical power) and correction for multiple testing. Finally, we reviewed extensions of these basic methods, such as analysis of variance (ANOVA) and empirical Bayes moderation (limma).
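As a small illustration of the multiple-testing step, the sketch below applies the Benjamini-Hochberg adjustment to a list of per-protein p-values. It is a minimal example written for this summary, not the course material itself: the function name `benjamini_hochberg`, the example p-values, and the 0.05 threshold are all assumptions chosen for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per protein (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)

# Proteins called differentially abundant at an assumed 5% FDR threshold.
significant = [q <= 0.05 for q in adjusted]
```

In this toy example, the third and fourth proteins have raw p-values below 0.05 but are no longer called significant after adjustment, which is the kind of effect the lecture's discussion of multiple testing addresses.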

Workshop

The workshop introduced more advanced statistical approaches for detecting differentially abundant proteins that are implemented in the open-source software MSstats. We discussed the preparation of tables of feature intensities (filtering, normalization); sample annotation and statistical modeling for realistic proteomics experiments (multi-group experiments, combinations of biological and technical replicates, repeated measures); the treatment of missing values; and visual exploration of the results. The discussion was illustrated with case studies of experimental datasets, using the graphical user interface MSstatsShiny. We explored the impact of different data processing options on the downstream results, highlighting the choices in the workflow that matter most.
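To make the normalization step more concrete, the sketch below shifts the log2 feature intensities of each run so that all runs share the same median, leaving missing values untouched. It is a simplified Python illustration of the general median-equalization idea, not MSstats code: the function name `equalize_medians`, the run names, and the intensity values are hypothetical.

```python
from statistics import median


def equalize_medians(runs):
    """Shift each run's log2 intensities so that all runs share one median.

    `runs` maps a run name to a list of log2 intensities, where None
    marks a missing value. Missing values are kept as None.
    """
    # Median of the observed (non-missing) intensities in each run.
    run_medians = {
        run: median(v for v in values if v is not None)
        for run, values in runs.items()
    }
    # Common target: the median of the per-run medians.
    target = median(run_medians.values())
    return {
        run: [None if v is None else v - run_medians[run] + target for v in values]
        for run, values in runs.items()
    }


# Hypothetical log2 intensities for three runs (illustrative only).
runs = {
    "run1": [20.0, 22.0, 24.0],
    "run2": [21.0, None, 25.0],
    "run3": [19.0, 21.0, 23.0],
}
normalized = equalize_medians(runs)
```

Applying a constant shift per run assumes that most features do not change between conditions, which is one reason the choice of normalization can affect the downstream results.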