Active learning literature survey

B Settles - 2009 - minds.wisconsin.edu
The key idea behind active learning is that a machine learning algorithm can achieve
greater accuracy with fewer labeled training instances if it is allowed to choose the training …

Barriers to academic data science research in the new realm of algorithmic behaviour modification by digital platforms

T Greene, D Martens, G Shmueli - Nature Machine Intelligence, 2022 - nature.com
The era of behavioural big data has created new avenues for data science research, with
many new contributions stemming from academic researchers. Yet data controlled by …

Get another label? Improving data quality and data mining using multiple, noisy labelers

VS Sheng, F Provost, PG Ipeirotis - Proceedings of the 14th ACM …, 2008 - dl.acm.org
This paper addresses the repeated acquisition of labels for data items when the labeling is
imperfect. We examine the improvement (or lack thereof) in data quality via repeated …

[BOOK][B] Conformal prediction for reliable machine learning: theory, adaptations and applications

V Balasubramanian, SS Ho, V Vovk - 2014 - books.google.com
The conformal predictions framework is a recent development in machine learning that can
associate a reliable measure of confidence with a prediction in any real-world pattern …

Active learning: A survey

CC Aggarwal, X Kong, Q Gu, J Han, PS Yu - Data classification, 2014 - taylorfrancis.com
In all these cases, labels can be obtained, but only at a significant cost to the end user. An
important observation is that all records are not equally important from the perspective of …

Learning to maximize mutual information for dynamic feature selection

IC Covert, W Qiu, M Lu, NY Kim… - International …, 2023 - proceedings.mlr.press
Feature selection helps reduce data acquisition costs in ML, but the standard approach is to
train models with static feature subsets. Here, we consider the dynamic feature selection …

EDDI: Efficient dynamic discovery of high-value information with partial VAE

C Ma, S Tschiatschek, K Palla… - arXiv preprint arXiv …, 2018 - arxiv.org
Many real-life decision-making situations allow further relevant information to be acquired at
a specific cost, for example, in assessing the health status of a patient we may decide to take …

Efficiently learning the accuracy of labeling sources for selective sampling

P Donmez, JG Carbonell, J Schneider - Proceedings of the 15th ACM …, 2009 - dl.acm.org
Many scalable data mining tasks rely on active learning to provide the most useful
accurately labeled instances. However, what if there are multiple labeling sources ('oracles' …

Repeated labeling using multiple noisy labelers

PG Ipeirotis, F Provost, VS Sheng, J Wang - Data Mining and Knowledge …, 2014 - Springer
This paper addresses the repeated acquisition of labels for data items when the labeling is
imperfect. We examine the improvement (or lack thereof) in data quality via repeated …

Efficient utilization of missing data in cost-sensitive learning

X Zhu, J Yang, C Zhang, S Zhang - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Different from previous imputation methods which impute missing values in the incomplete
samples by using the information in the complete samples, this paper proposes a Date-drive …