Automating exploratory data analysis via machine learning: An overview

T Milo, A Somech - Proceedings of the 2020 ACM SIGMOD international …, 2020 - dl.acm.org
Exploratory Data Analysis (EDA) is an important initial step for any knowledge discovery
process, in which data scientists interactively explore unfamiliar datasets by issuing a …

Data lifecycle challenges in production machine learning: a survey

N Polyzotis, S Roy, SE Whang, M Zinkevich - ACM SIGMOD Record, 2018 - dl.acm.org
Machine learning has become an essential tool for gleaning knowledge from data and
tackling a diverse set of computationally hard tasks. However, the accuracy of a machine …

A structured review of data management technology for interactive visualization and analysis

L Battle, C Scheidegger - IEEE Transactions on Visualization …, 2020 - ieeexplore.ieee.org
In the last two decades, interactive visualization and analysis have become a central tool in
data-driven decision making. Concurrently to the contributions in data visualization …

Data preparation: A survey of commercial tools

M Hameed, F Naumann - ACM SIGMOD Record, 2020 - dl.acm.org
Raw data are often messy: they follow different encodings, records are not well structured,
values do not adhere to patterns, etc. Such data are in general not fit to be ingested by …

Slice finder: Automated data slicing for model validation

Y Chung, T Kraska, N Polyzotis, KH Tae… - 2019 IEEE 35th …, 2019 - ieeexplore.ieee.org
As machine learning (ML) systems become democratized, it becomes increasingly important
to help users easily debug their models. However, current data tools are still primitive when …

Automatically generating data exploration sessions using deep reinforcement learning

O Bar El, T Milo, A Somech - Proceedings of the 2020 ACM SIGMOD …, 2020 - dl.acm.org
Exploratory Data Analysis (EDA) is an essential yet highly demanding task. To get a head
start before exploring a new dataset, data scientists often prefer to view existing EDA …

Data provenance

B Glavic - Foundations and Trends® in Databases, 2021 - nowpublishers.com
Data provenance has evolved from a niche topic to a mainstream area of research in
databases and other research communities. This article gives a comprehensive introduction …

Next-step suggestions for modern interactive data analysis platforms

T Milo, A Somech - Proceedings of the 24th ACM SIGKDD International …, 2018 - dl.acm.org
Modern Interactive Data Analysis (IDA) platforms, such as Kibana, Splunk, and Tableau, are
gradually replacing traditional OLAP/SQL tools, as they allow for easy-to-use data …

Database learning: Toward a database that becomes smarter every time

Y Park, AS Tajik, M Cafarella, B Mozafari - Proceedings of the 2017 ACM …, 2017 - dl.acm.org
In today's databases, previous query answers rarely benefit answering future queries. For
the first time, to the best of our knowledge, we change this paradigm in an approximate …

Automated data slicing for model validation: A big data-ai integration approach

Y Chung, T Kraska, N Polyzotis, KH Tae… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
As machine learning systems become democratized, it becomes increasingly important to
help users easily debug their models. However, current data tools are still primitive when it …