Obtaining well calibrated probabilities using Bayesian binning

MP Naeini, G Cooper, M Hauskrecht - Proceedings of the AAAI …, 2015 - ojs.aaai.org
Learning probabilistic predictive models that are well calibrated is critical for many prediction
and decision-making tasks in artificial intelligence. In this paper we present a new non …

Optimal thresholding of classifiers to maximize F1 measure

ZC Lipton, C Elkan, B Naryanaswamy - … 15-19, 2014. Proceedings, Part II …, 2014 - Springer
This paper provides new insight into maximizing F1 measures in the context of binary
classification and also in the context of multilabel classification. The harmonic mean of …

A survey on learning to reject

XY Zhang, GS Xie, X Li, T Mei… - Proceedings of the IEEE, 2023 - ieeexplore.ieee.org
Learning to reject is a special kind of self-awareness (the ability to know what you do not
know), which is an essential factor for humans to become smarter. Although machine …

Estimating treatment effect heterogeneity in randomized program evaluation

K Imai, M Ratkovic - 2013 - projecteuclid.org
When evaluating the efficacy of social programs and medical treatments using randomized
experiments, the estimated overall average causal effect alone is often of limited value and …

Learning with confident examples: Rank pruning for robust classification with noisy labels

CG Northcutt, T Wu, IL Chuang - arXiv preprint arXiv:1705.01936, 2017 - arxiv.org
Noisy PN learning is the problem of binary classification when training examples may be
mislabeled (flipped) uniformly with noise rate rho1 for positive examples and rho0 for …

Adversarial time-to-event modeling

P Chapfuwa, C Tao, C Li, C Page… - International …, 2018 - proceedings.mlr.press
Modern health data science applications leverage abundant molecular and electronic health
data, providing opportunities for machine learning to build statistical models to support …

Deep learning for patient-specific kidney graft survival analysis

M Luck, T Sylvain, H Cardinal, A Lodi… - arXiv preprint arXiv …, 2017 - arxiv.org
An accurate model of patient-specific kidney graft survival distributions can help to improve
shared-decision making in the treatment and care of patients. In this paper, we propose a …

On the inference calibration of neural machine translation

S Wang, Z Tu, S Shi, Y Liu - arXiv preprint arXiv:2005.00963, 2020 - arxiv.org
Confidence calibration, which aims to make model predictions equal to the true correctness
measures, is important for neural machine translation (NMT) because it is able to offer useful …

Learning optimized risk scores

B Ustun, C Rudin - Journal of Machine Learning Research, 2019 - jmlr.org
Risk scores are simple classification models that let users make quick risk predictions by
adding and subtracting a few small numbers. These models are widely used in medicine …

Calibrated structured prediction

V Kuleshov, PS Liang - Advances in Neural Information …, 2015 - proceedings.neurips.cc
In user-facing applications, displaying calibrated confidence measures (probabilities that
correspond to true frequency) can be as important as obtaining high accuracy. We are …