Population genetics aims to understand how the observed genetic diversity emerged. Many of its theoretical results were developed at a time when little genomic and genetic data were available. These theory-driven results remain essential for our research, but data-driven discoveries have since dramatically changed our view of evolution and ecology, in particular for bacteria.
The vast amount of newly sequenced genetic data opens up a multitude of interesting applications in the emerging field of machine learning in population genetics. The main challenge is that sequence data are not independent of one another, but are linked by their phylogenetic relationship, often represented by a tree sequence. Generating independent training data therefore relies heavily on simulation procedures. I will present some of our approaches to developing, analyzing, and applying supervised machine learning tools that can use this phylogenetic relationship to improve our understanding of bacterial genome evolution.
Franz is Head of the Independent Research Group "Mathematical and Computational Population Genetics," a joint group of Tübingen's Excellence Clusters "Controlling Microbes to Fight Infections" and "Machine Learning." Franz's research focuses on mathematical models for the evolution of microbes. His group investigates how machine learning can leverage phylogenetic information in population genetics.