Y Zhang, ZG Ives - Proceedings of the 2020 ACM SIGMOD International …, 2020 - dl.acm.org
Many modern data science applications build on data lakes, schema-agnostic repositories of data files and data products that offer limited organization and management capabilities …
Data profiling refers to the activity of collecting data about data, i.e., metadata. Most IT professionals and researchers who work with data have engaged in data profiling, at least …
Effective query optimization is a core feature of any database management system. While most query optimization techniques make use of simple metadata, such as cardinalities and …
S Kruse, F Naumann - Proceedings of the VLDB Endowment, 2018 - dl.acm.org
Functional dependencies (FDs) and unique column combinations (UCCs) form a valuable ingredient for many data management tasks, such as data cleaning, schema recovery, and …
Data lakes are becoming increasingly prevalent for Big Data management and data analytics. In contrast to traditional 'schema-on-write' approaches such as data warehouses …
Maintaining data consistency is known to be hard. Recent approaches have relied on integrity constraints to deal with the problem: correct and complete constraints naturally work …
Fashion retail has a large and ever-increasing popularity and relevance, allowing customers to buy anytime finding the best offers and providing satisfactory experiences in the shops …
Y Wu, J Yu, Y Tian, R Sidle, R Barber - Proceedings of the 2019 …, 2019 - dl.acm.org
Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built …
A missing value represents a piece of incomplete information that might appear in database instances. Data imputation is the problem of filling missing values by means of …