Data lakes: A survey of functions and systems

R Hai, C Koutras, C Quix… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Data lakes are becoming increasingly prevalent for Big Data management and data
analytics. In contrast to traditional 'schema-on-write' approaches such as data warehouses …

A deep dive into common open formats for analytical DBMSs

C Liu, A Pavlenko, M Interlandi, B Haynes - Proceedings of the VLDB …, 2023 - dl.acm.org
This paper evaluates the suitability of Apache Arrow, Parquet, and ORC as formats for
subsumption in an analytical DBMS. We systematically identify and explore the high-level …

Data Lakehouse: A survey and experimental study

AA Harby, F Zulkernine - Information Systems, 2025 - Elsevier
Efficient big data management is a dire necessity to manage the exponential growth in data
generated by digital information systems to produce usable knowledge. Structured …

ClickHouse - Lightning Fast Analytics for Everyone

R Schulze, T Schreiber, I Yatsishin… - Proceedings of the …, 2024 - dl.acm.org
Over the past several decades, the amount of data being stored and analyzed has increased
exponentially. Businesses across industries and sectors have begun relying on this data to …

The Lakehouse: State of the Art on Concepts and Technologies

J Schneider, C Gröger, A Lutsch, H Schwarz… - SN Computer …, 2024 - Springer
In the context of data analytics, so-called lakehouses refer to novel variants of data platforms
that attempt to combine characteristics of data warehouses and data lakes. In this way …

Data Management in the Noisy Intermediate-Scale Quantum Era

R Hai, SH Hung, T Coopmans, F Geerts - arXiv preprint arXiv:2409.14111, 2024 - arxiv.org
Quantum computing has emerged as a promising tool for transforming the landscape of
computing technology. Recent efforts have applied quantum techniques to classical …

LST-Bench: Benchmarking Log-Structured Tables in the Cloud

J Camacho-Rodríguez, A Agrawal… - Proceedings of the …, 2024 - dl.acm.org
Data processing engines increasingly leverage distributed file systems for scalable,
cost-effective storage. While the Apache Parquet columnar format has become a popular choice …

Research data management in institutional repositories: an architectural approach using data lakehouses

Z He, W Fang - Digital Library Perspectives, 2024 - emerald.com
Purpose This paper aims to address the pressing challenges in research data management
within institutional repositories, focusing on the escalating volume, heterogeneity and multi …

The evolution of data storage architectures: examining the secure value of the Data Lakehouse

N Janssen, T Ilayperuma, J Jayasinghe… - Journal of Data …, 2024 - Springer
The digital shift in society is making continuous growth of data. However, choosing a
suitable storage architecture to efficiently store, process, and manage data from numerous …

An approach to on-demand extension of multidimensional cubes in multi-model settings: Application to IoT-based agro-ecology

S Bimonte, FA Coulibaly, S Rizzi - Data & Knowledge Engineering, 2024 - Elsevier
Managing unstructured and heterogeneous data, integrating them, and enabling their
analysis are among the key challenges in data ecosystems, together with the need to …