Discovering association rules from big graphs

W Fan, W Fu, R Jin, P Lu, C Tian - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
This paper tackles two challenges to discovery of graph rules. Existing discovery methods
often (a) return an excessive number of rules, and (b) do not scale with large graphs given …

Hitting set enumeration with partial information for unique column combination discovery

J Birnick, T Bläsius, T Friedrich, F Naumann… - Proceedings of the …, 2020 - dl.acm.org
Unique column combinations (UCCs) are a fundamental concept in relational databases.
They identify entities in the data and support various data management activities. Still, UCCs …

IoT data cleaning techniques: A survey

X Ding, H Wang, G Li, H Li, Y Li… - Intelligent and Converged …, 2022 - ieeexplore.ieee.org
Data cleaning is considered as an effective approach of improving data quality in order to
help practitioners and researchers be devoted to downstream analysis and decision-making …

Parallel rule discovery from large datasets by sampling

W Fan, Z Han, Y Wang, M Xie - … of the 2022 international conference on …, 2022 - dl.acm.org
Rule discovery from large datasets is often prohibitively costly. The problem becomes more
staggering when the rules are collectively defined across multiple tables. To scale with large …

Discovering Top-k Rules using Subjective and Objective Criteria

W Fan, Z Han, Y Wang, M Xie - Proceedings of the ACM on Management …, 2023 - dl.acm.org
This paper studies two questions about rule discovery. Can we characterize the usefulness
of rules using quantitative criteria? How can we discover rules using those criteria? As a …

Fast approximate denial constraint discovery

R Xiao, Z Tan, H Wang, S Ma - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
We investigate the problem of discovering approximate denial constraints (DCs), for finding
DCs that hold with some exceptions to avoid overfitting real-life dirty data and facilitate data …

Fast Algorithms for Denial Constraint Discovery

EHM Pena, F Porto, F Naumann - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Denial constraints (DCs) are an integrity constraint formalism widely used to detect
inconsistencies in data. Several algorithms have been devised to discover DCs from data …

TSDDISCOVER: Discovering Data Dependency for Time Series Data

X Ding, Y Li, H Wang, C Wang, Y Liu… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Intelligent devices often produce time series data that suffer from significant data quality
issues. While the utilization of data dependency in error detection and data repair has been …

Conformance constraint discovery: Measuring trust in data-driven systems

A Fariha, A Tiwari, A Radhakrishna, S Gulwani… - Proceedings of the …, 2021 - dl.acm.org
The reliability of inferences made by data-driven systems hinges on the data's continued
conformance to the systems' initial settings and assumptions. When serving data (on which …

Dynamic functional dependency discovery with dynamic hitting set enumeration

R Xiao, Z Tan, S Ma, W Wang - 2022 IEEE 38th International …, 2022 - ieeexplore.ieee.org
Functional dependencies (FDs) are widely applied in data management tasks. Since FDs on
data are usually unknown, FD discovery techniques are studied for automatically finding …