Towards reliable interactive data cleaning: A user survey and recommendations

C Qin, L Zhang, Y Cheng, R Zha, D Shen… - arXiv preprint arXiv …, 2023 - arxiv.org

In today's competitive and fast-evolving business environment, it is a critical time for
organizations to rethink how to make talent-related decisions in a quantitative manner …

Cited by 33

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

N Sambasivan, S Kapania, H Highfill… - Proceedings of the …, 2021 - dl.acm.org

AI models are increasingly applied in high-stakes domains like health and conservation.
Data quality carries an elevated significance in high-stakes AI due to its heightened …

Cited by 750

A review on human–AI interaction in machine learning and insights for medical applications

M Maadi, H Akbarzadeh Khorshidi… - International journal of …, 2021 - mdpi.com

Objective: To provide a human–Artificial Intelligence (AI) interaction review for Machine
Learning (ML) applications to inform how to best combine both human domain expertise and …

Cited by 83

ActiveClean: Interactive data cleaning for statistical modeling

S Krishnan, J Wang, E Wu, MJ Franklin… - Proceedings of the …, 2016 - dl.acm.org

Analysts often clean dirty data iteratively--cleaning some data, executing the analysis, and
then cleaning more data based on the results. We explore the iterative cleaning process in …

Cited by 308

BoostClean: Automated error detection and repair for machine learning

S Krishnan, MJ Franklin, K Goldberg, E Wu - arXiv preprint arXiv …, 2017 - arxiv.org

Predictive models based on machine learning can be highly sensitive to data error. Training
data are often combined with a variety of different sources, each susceptible to different …

Cited by 118

AutoML: A systematic review on automated machine learning with neural architecture search

I Salehin, MS Islam, P Saha, SM Noman, A Tuni… - Journal of Information …, 2024 - Elsevier

AutoML (Automated Machine Learning) is an emerging field that aims to automate
the process of building machine learning models. AutoML emerged to increase productivity …

Cited by 27

Get a head start: On-demand pedagogical policy selection in intelligent tutoring

G Gao, X Yang, M Chi - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Reinforcement learning (RL) is broadly employed in human-involved systems to enhance
human outcomes. Off-policy evaluation (OPE) has been pivotal for RL in those realms since …

Cited by 3

PrIU: A provenance-based approach for incrementally updating regression models

Y Wu, V Tannen, SB Davidson - Proceedings of the 2020 ACM SIGMOD …, 2020 - dl.acm.org

The ubiquitous use of machine learning algorithms brings new challenges to traditional
database problems such as incremental view update. Much effort is being put in better …

Cited by 38

AlphaClean: Automatic generation of data cleaning pipelines

S Krishnan, E Wu - arXiv preprint arXiv:1904.11827, 2019 - arxiv.org

The analyst effort in data cleaning is gradually shifting away from the design of hand-written
scripts to building and tuning complex pipelines of automated data cleaning libraries. Hyper …

Cited by 52

Data smells in public datasets

A Shome, L Cruz, A Van Deursen - … of the 1st International Conference on …, 2022 - dl.acm.org

The adoption of Artificial Intelligence (AI) in high-stakes domains such as healthcare, wildlife
preservation, autonomous driving and criminal justice system calls for a data-centric …

Cited by 21
