Robustifying genomic classifiers to batch effects via ensemble learning

Y Zhang, P Patil, WE Johnson, G Parmigiani - Bioinformatics, 2021 - academic.oup.com
Motivation: Genomic data are often produced in batches due to practical restrictions, which
may lead to unwanted variation in data caused by discrepancies across batches. Such …

Transfer learning via random forests: A one-shot federated approach

P Xiang, L Zhou, L Tang - Computational Statistics & Data Analysis, 2024 - Elsevier
A one-shot federated transfer learning method using random forests (FTRF) is developed to
improve the prediction accuracy at a target data site by leveraging information from auxiliary …

Multi-study R-learner for Heterogeneous Treatment Effect Estimation

C Shyr, B Ren, P Patil, G Parmigiani - arXiv preprint arXiv:2306.01086, 2023 - arxiv.org
We propose a general class of algorithms for estimating heterogeneous treatment effects on
multiple studies. Our approach, called the multi-study R-learner, generalizes the R-learner to …

Optimal ensemble construction for multistudy prediction with applications to mortality estimation

G Loewinger, RA Nunez, R Mazumder… - Statistics in …, 2024 - Wiley Online Library
It is increasingly common to encounter prediction tasks in the biomedical sciences for which
multiple datasets are available for model training. Common approaches such as pooling …

Merging or ensembling: integrative analysis in multiple neuroimaging studies

Y Shan, C Huang, Y Li, H Zhu - Biometrics, 2024 - academic.oup.com
The aim of this paper is to systematically investigate merging and ensembling methods for
spatially varying coefficient mixed effects models (SVCMEM) in order to carry out integrative …

Batch normalization followed by merging is powerful for phenotype prediction integrating multiple heterogeneous studies

Y Gao, F Sun - PLOS Computational Biology, 2023 - journals.plos.org
Heterogeneity in different genomic studies compromises the performance of machine
learning models in cross-study phenotype predictions. Overcoming heterogeneity when …

Hierarchical resampling for bagging in multistudy prediction with applications to human neurochemical sensing

G Loewinger, P Patil, KT Kishida… - The annals of applied …, 2022 - ncbi.nlm.nih.gov
We propose the “study strap ensemble”, which combines advantages of two common
approaches to fitting prediction models when multiple training datasets (“studies”) are …

A pairwise strategy for imputing predictive features when combining multiple datasets

Y Wu, B Ren, P Patil - Bioinformatics, 2023 - academic.oup.com
Motivation: In the training of predictive models using high-dimensional genomic data,
multiple studies' worth of data are often combined to increase sample size and improve …

Multi-source domain adaptation for regression

Y Wu, G Parmigiani, B Ren - arXiv preprint arXiv:2312.05460, 2023 - arxiv.org
Multi-source domain adaptation (DA) aims at leveraging information from more than one
source domain to make predictions in a target domain, where different domains may have …

Generalizing Treatment Effect to a Target Population Without Individual Patient Data in a Real‐World Setting

H Quan, T Li, X Chen, G Li - Pharmaceutical Statistics, 2024 - Wiley Online Library
The innovative use of real‐world data (RWD) can answer questions that cannot be
addressed using data from randomized clinical trials (RCTs). While the sponsors of RCTs …