Data lake management: challenges and opportunities

F Nargesian, E Zhu, RJ Miller, KQ Pu… - Proceedings of the VLDB …, 2019 - dl.acm.org
The ubiquity of data lakes has created fascinating new challenges for data management
research. In this tutorial, we review the state-of-the-art in data management for data lakes …

Futzing and moseying: Interviews with professional data analysts on exploration practices

S Alspaugh, N Zokaei, A Liu, C Jin… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
We report the results of interviewing thirty professional data analysts working in a range of
industrial, academic, and regulatory environments. This study focuses on participants' …

Open data integration

RJ Miller - Proceedings of the VLDB Endowment, 2018 - dl.acm.org
Open data plays a major role in supporting both governmental and organizational
transparency. Many organizations are adopting Open Data Principles promising to make …

Modeling metadata in data lakes—A generic model

R Eichler, C Giebler, C Gröger, H Schwarz… - Data & knowledge …, 2021 - Elsevier
Data contains important knowledge and has the potential to provide new insights. Due to
new technological developments such as the Internet of Things, data is generated in …

Ground: A Data Context Service.

JM Hellerstein, V Sreekanti, JE Gonzalez, J Dalton… - CIDR, 2017 - Citeseer
Ground is an open-source data context service, a system to manage all the information that
informs the use of data. Data usage has changed both philosophically and practically in the …

Organizing data lakes for navigation

F Nargesian, KQ Pu, E Zhu… - Proceedings of the …, 2020 - dl.acm.org
We consider the problem of creating an effective navigation structure over a data lake. We
define an organization as a navigation graph that contains nodes representing sets of …

HANDLE - A generic metadata model for data lakes

R Eichler, C Giebler, C Gröger, H Schwarz… - Big Data Analytics and …, 2020 - Springer
The substantial increase in generated data induced the development of new concepts such
as the data lake. A data lake is a large storage repository designed to enable flexible …

Architectural support for translation table management in large address space machines

J Huck, J Hays - Proceedings of the 20th annual international …, 1993 - dl.acm.org
Virtual memory page translation tables provide mappings from virtual to physical addresses.
When the hardware controlled Translation Lookaside Buffers (TLBs) do not contain a …

ProvDB: Lifecycle management of collaborative analysis workflows

H Miao, A Chavan, A Deshpande - Proceedings of the 2nd Workshop on …, 2017 - dl.acm.org
As data-driven methods are becoming pervasive in a wide variety of disciplines, there is an
urgent need to develop scalable and sustainable tools to simplify the process of data …

Study on the interaction between big data and artificial intelligence

J Li, Z Ye, C Zhang - Systems Research and Behavioral …, 2022 - Wiley Online Library
The explosive growth of information has rapidly ushered people into the era of big data. Due
to the large volume, high variety, and rapid velocity characteristics of big data, most …