J Chen, Y Huang, M Wang, S Salihoglu… - Proceedings of the VLDB …, 2022 - dl.acm.org
This paper is an experimental and analytical study of two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins in the …
Big data processing at the production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting performance goals and budgetary …
Database systems are no longer used only for the storage of plain structured data and basic analyses. An increasing role is also played by the integration of ML models, e.g., neural …
Quantum computing is an emerging technology and has yet to be exploited by industries to implement practical applications. Research has already laid the foundation for figuring out …
C Lyu, Q Fan, P Guyard, Y Diao - arXiv preprint arXiv:2403.00995, 2024 - arxiv.org
As Spark becomes a common big data analytics platform, its growing complexity makes automatic tuning of numerous parameters critical for performance. Our work on Spark …
Cardinality estimation is a fundamental task in database query processing and optimization. As shown in recent papers, machine learning (ML)-based approaches may deliver more …
Data preprocessing, the step of transforming data into a suitable format for training a model, rarely happens within database systems but rather in external Python libraries and thus …
J Chen, Y Huang, M Wang, S Salihoglu… - ACM SIGMOD …, 2023 - dl.acm.org
We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins: (i) optimistic estimators, which were defined in the context of …
Big data query processing has become increasingly important, prompting the development and cloud deployment of numerous systems. However, automatically tuning the numerous …