[BOOK][B] Data cleaning

IF Ilyas, X Chu - 2019 - books.google.com
This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …

Data preparation: A technological perspective and review

AAA Fernandes, M Koehler, N Konstantinou… - SN Computer …, 2023 - Springer
Data analysis often uses data sets that were collected for different purposes. Indeed, new
insights are often obtained by combining data sets that were produced independently of …

Robotic process mining: vision and challenges

V Leno, A Polyvyanyy, M Dumas, M La Rosa… - Business & Information …, 2021 - Springer
Robotic process automation (RPA) is an emerging technology that allows organizations to
automate repetitive clerical tasks by executing scripts that encode sequences of fine …

Holodetect: Few-shot learning for error detection

A Heidari, J McGrath, IF Ilyas… - Proceedings of the 2019 …, 2019 - dl.acm.org
We introduce a few-shot learning framework for error detection. We show that data
augmentation (a form of weak supervision) is key to training high-quality, ML-based error …

[PDF][PDF] The Data Civilizer System.

D Deng, RC Fernandez, Z Abedjan, S Wang… - CIDR, 2017 - cs.rutgers.edu
In many organizations, it is often challenging for users to find relevant data for specific tasks,
since the data is usually scattered across the enterprise and often inconsistent. In fact, data …

Raha: A configuration-free error detection system

M Mahdavi, Z Abedjan, R Castro Fernandez… - Proceedings of the …, 2019 - dl.acm.org
Detecting erroneous values is a key step in data cleaning. Error detection algorithms usually
require a user to provide input configurations in the form of rules or statistical parameters …

Baran: Effective error correction via a unified context representation and transfer learning

M Mahdavi, Z Abedjan - Proceedings of the VLDB Endowment, 2020 - dl.acm.org
Traditional error correction solutions leverage handmade rules or master data to find the
correct values. Both are often amiss in real-world scenarios. Therefore, it is desirable to …

Foofah: Transforming data by example

Z Jin, MR Anderson, M Cafarella… - Proceedings of the 2017 …, 2017 - dl.acm.org
Data transformation is a critical first step in modern data analysis: before any analysis can be
done, data from a variety of sources must be wrangled into a uniform format that is amenable …

Auto-suggest: Learning-to-recommend data preparation steps using data science notebooks

C Yan, Y He - Proceedings of the 2020 ACM SIGMOD International …, 2020 - dl.acm.org
Data preparation is widely recognized as the most time-consuming process in modern
business intelligence (BI) and machine learning (ML) projects. Automating complex data …

Automating data preparation: Can we? should we? must we?

N Paton - … , Languages and Analytical Processing of Big …, 2019 - research.manchester.ac.uk
Obtaining value from data through analysis often requires significant prior effort on data
preparation. Data preparation covers the discovery, selection, integration and cleaning of …