Olivier Labayle

I'm a PhD student in the Biomedical AI CDT program at the University of Edinburgh. In my current work, I leverage the framework of Targeted Learning to uncover the mechanisms of diseases using biobank-scale data.

PhD: A targeted approach to population genetics.

My PhD project sits at the intersection of causal inference, machine learning, and functional genetics. The originality of our work rests on two main ideas. The first is statistical: we use the modern framework of Targeted Learning developed by van der Laan et al., whose asymptotic theoretical guarantees are particularly well suited to modern biobanks. If you are interested in the details of the method, our preprint is available here. The second is a strong emphasis on functional genetics, which will enable the resulting statistical estimates to be given a causal interpretation. Somewhat ironically, Targeted Learning requires large sample sizes to reach the asymptotic regime, yet it is also computationally intensive. Therefore, to make this project practical, I chose to develop most of the technology stack in Julia, a high-performance language. The main software outcomes of my PhD project are thus a Julia package for general-purpose Targeted Minimum Loss-based Estimation and a Nextflow pipeline called TarGene. Both are fully open source and accessible below:

TarGene

A Nextflow pipeline to estimate effect sizes of genetic variants on diseases using biobank-scale data.

TMLE.jl

A general-purpose Julia package for Targeted Minimum Loss-based Estimation.