Big data systems: A software engineering perspective

A Davoudian, M Liu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Big Data Systems (BDSs) are an emerging class of scalable software technologies whereby
massive amounts of heterogeneous data are gathered from multiple sources, managed …

Big data in cloud computing review and opportunities

M Muniswamaiah, T Agerwala, C Tappert - arXiv preprint arXiv …, 2019 - arxiv.org
Big Data is used in the decision-making process to gain useful insights hidden in the data for
business and engineering. At the same time it presents challenges in processing, cloud …

The Data Civilizer System.

D Deng, RC Fernandez, Z Abedjan, S Wang… - CIDR, 2017 - cs.rutgers.edu
In many organizations, it is often challenging for users to find relevant data for specific tasks,
since the data is usually scattered across the enterprise and often inconsistent. In fact, data …

Enabling query processing across heterogeneous data models: A survey

R Tan, R Chirkova, V Gadepally… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Modern applications often need to manage and analyze widely diverse datasets that span
multiple data models [1],[2],[3],[4],[5]. Warehousing the data through Extract-Transform-Load …

SoK: Cryptographically protected database search

B Fuller, M Varia, A Yerukhimovich… - … IEEE Symposium on …, 2017 - ieeexplore.ieee.org
Protected database search systems cryptographically isolate the roles of reading from,
writing to, and administering the database. This separation limits unnecessary administrator …

A systematic overview of data federation systems

Z Gu, F Corcoglioniti, D Lanti, A Mosca, G Xiao… - Semantic …, 2024 - content.iospress.com
Data federation addresses the problem of uniformly accessing multiple, possibly
heterogeneous data sources, by mapping them into a unified schema, such as an RDF …

Check out the big brain on BRAD: simplifying cloud data processing with learned automated data meshes

T Kraska, T Li, S Madden, M Markakis, A Ngom… - Proceedings of the …, 2023 - dl.acm.org
The last decade of database research has led to the prevalence of specialized systems for
different workloads. Consequently, organizations often rely on a combination of specialized …

Data Ingestion for the Connected World.

J Meehan, C Aslantas, S Zdonik, N Tatbul, J Du - CIDR, 2017 - people.csail.mit.edu
In this paper, we argue that in many “Big Data” applications, getting data into the system
correctly and at scale via traditional ETL (Extract, Transform, and Load) processes is a …

The BigDAWG polystore system and architecture

V Gadepally, P Chen, J Duggan… - 2016 IEEE High …, 2016 - ieeexplore.ieee.org
Organizations are often faced with the challenge of providing data management solutions for
large, heterogeneous datasets that may have different underlying data and programming …

BatchDB: Efficient isolated execution of hybrid OLTP+OLAP workloads for interactive applications

D Makreshanski, J Giceva, C Barthels… - Proceedings of the 2017 …, 2017 - dl.acm.org
In this paper we present BatchDB, an in-memory database engine designed for hybrid OLTP
and OLAP workloads. BatchDB achieves good performance, provides a high level of data …