About me

I am a PhD candidate in Computer Science at Northeastern University, advised by Olga Vitek. My research develops statistical and machine learning methods for computational biology, with a particular focus on mass spectrometry-based proteomics, measurement-aware modeling, and causal inference for biological systems.

My work centers on building methods that account for the realities of biological experiments: complex measurement processes, limited replication, missing data, and variable data quality. I develop models that explicitly incorporate these features into downstream inference, and I translate them into practical open-source software used by both academic and pharmaceutical researchers.

I am a lead developer and maintainer of the MSstats ecosystem of Bioconductor packages for quantitative proteomics, where I contribute to tools for differential abundance analysis, post-translational modification analysis, limited proteolysis experiments, scalable workflows, and interactive analysis interfaces. My recent work on MSstats+ introduces quality-aware statistical modeling for large-scale DIA proteomics by integrating longitudinal peak quality metrics into protein-level inference.

More broadly, I am developing causal modeling frameworks for biological systems that integrate proteomics, transcriptomics, and prior biological knowledge to estimate the effects of perturbations. This work combines Bayesian modeling, probabilistic programming, and structured biological knowledge to support interventional prediction in complex molecular systems.

My research has been shaped by close collaborations with Genentech, Pfizer, AstraZeneca, and Talus Bio, and I am especially interested in problems at the interface of statistical methodology, machine learning, and real experimental workflows in the life sciences.

Research interests

  • Statistical methods for quantitative proteomics
  • Measurement quality-aware inference
  • Causal modeling of biological systems
  • Multi-omics integration
  • Open-source scientific software

Community and teaching

I regularly teach short courses and workshops in quantitative proteomics, statistical modeling, and computational biology. Across international training programs, I have taught more than 20 courses for researchers from academia and industry.

Background

Before beginning my PhD, I worked in industry as a data scientist in Boston. I previously earned an M.S. in Data Science from Northeastern University and a B.A. in Economics from Union College.

Outside of research, I’m passionate about the outdoors, and I surf and snowboard whenever I can. Check out the photos tab!

Contact

Please get in touch at kohler.d@northeastern.edu.