Martin Rohbeck, Arber Qoku, Tim Treis, Fabian J Theis, Britta Velten, Florian Buettner, Oliver Stegle
The analysis and integration of multi-omics datasets require flexible modelling choices to faithfully capture the underlying biological processes that are active in one or multiple omics layers. Factor analysis is among the most successful approaches for this task, yet adapting this model class to specific biological questions and datasets is a time-consuming step that has resulted in "reinventing the wheel". Here, we present Cellij, a versatile framework for rapidly building and training a wide range of factor analysis models for multi-omics data. Because the framework unifies dozens of previously distinct factor analysis models, Cellij enables objective benchmarks, which we use to present, for the first time, a study of alternative sparsity assumptions. Finally, we illustrate on a real-world transcriptomic dataset how Cellij integrates covariates through Gaussian processes, enhancing the interpretability of the resulting latent factors.