Diversity in machine learning

Z Gong, P Zhong, W Hu - IEEE Access, 2019 - ieeexplore.ieee.org
Machine learning methods have achieved good performance and been widely applied in
various real-world applications. They can learn the model adaptively and be better fit for …

Feature selection in data mining

YS Kim, WN Street, F Menczer - Data mining: opportunities and …, 2003 - igi-global.com
Feature subset selection is an important problem in knowledge discovery, not only for the
insight gained from determining relevant modeling variables, but also for the improved …

Boosting ensemble accuracy by revisiting ensemble diversity metrics

Y Wu, L Liu, Z Xie, KH Chow… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Neural network ensembles are gaining popularity by harnessing the complementary wisdom
of multiple base models. Ensemble teams with high diversity promote high failure …

Learning from high-dimensional biomedical datasets: the issue of class imbalance

B Pes - IEEE Access, 2020 - ieeexplore.ieee.org
As witnessed by a vast corpus of literature, dimensionality reduction is a fundamental step
for biomedical data analysis. Indeed, in this domain, there is often the need for coping with a …

MetaShift: A dataset of datasets for evaluating contextual distribution shifts and training conflicts

W Liang, J Zou - arXiv preprint arXiv:2202.06523, 2022 - arxiv.org
Understanding the performance of machine learning models across diverse data
distributions is critically important for reliable applications. Motivated by this, there is a …

Geography-aware self-supervised learning

K Ayush, B Uzkent, C Meng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Contrastive learning methods have significantly narrowed the gap between supervised and
unsupervised learning on computer vision tasks. In this paper, we explore their application …

Feature subset selection bias for classification learning

SK Singhi, H Liu - Proceedings of the 23rd international conference on …, 2006 - dl.acm.org
Feature selection is often applied to high-dimensional data prior to classification learning.
Using the same training dataset in both selection and learning can result in so-called feature …

Stable prediction across unknown environments

K Kuang, P Cui, S Athey, R Xiong, B Li - Proceedings of the 24th ACM …, 2018 - dl.acm.org
In many important machine learning applications, the training distribution used to learn a
probabilistic classifier differs from the distribution on which the classifier will be used to make …

Agree to disagree: Diversity through disagreement for better transferability

M Pagliardini, M Jaggi, F Fleuret… - arXiv preprint arXiv …, 2022 - arxiv.org
Gradient-based learning algorithms have an implicit simplicity bias which in effect can limit
the diversity of predictors being sampled by the learning procedure. This behavior can …

When does diversity help generalization in classification ensembles?

Y Bian, H Chen - IEEE Transactions on Cybernetics, 2021 - ieeexplore.ieee.org
Ensembles, as a widely used and effective technique in the machine learning community,
succeed within a key element—“diversity.” The relationship between diversity and …