Data placement and replica selection for improving co-location in distributed environments

KA Kumar, A Deshpande, S Khuller - arXiv preprint arXiv:1302.4168, 2013 - arxiv.org
Increasing need for large-scale data analytics in a number of application domains has led to
a dramatic rise in the number of distributed data management systems, both parallel …

Parallelism-optimizing data placement for faster data-parallel computations

N Baruah, P Kraft, F Kazhamiaka, P Bailis… - Proceedings of the …, 2022 - dl.acm.org
Systems performing large data-parallel computations, including online analytical processing
(OLAP) systems like Druid and search engines like Elasticsearch, are increasingly being …

SWORD: workload-aware data placement and replica selection for cloud data management systems

KA Kumar, A Quamar, A Deshpande, S Khuller - The VLDB Journal, 2014 - Springer
Cloud computing is increasingly being seen as a way to reduce infrastructure costs and add
elasticity, and is being used by a wide range of organizations. Cloud data management …

Query centric partitioning and allocation for partially replicated database systems

T Rabl, HA Jacobsen - Proceedings of the 2017 ACM International …, 2017 - dl.acm.org
A key feature of database systems is to provide transparent access to stored data. In
distributed database systems, this includes data allocation and fragmentation. Transparent …

Resource bricolage and resource selection for parallel database systems

J Li, JF Naughton, RV Nehme - The VLDB Journal, 2017 - Springer
Running parallel database systems in an environment with heterogeneous resources has
become increasingly common, due to cluster evolution and increasing interest in moving …

Accordion: Elastic scalability for database systems supporting distributed transactions

M Serafini, E Mansour, A Aboulnaga, K Salem… - Proceedings of the …, 2014 - dl.acm.org
Providing the ability to elastically use more or fewer servers on demand (scale out and scale
in) as the load varies is essential for database management systems (DBMSes) deployed on …

Distributed data placement to minimize communication costs via graph partitioning

L Golab, M Hadjieleftheriou, H Karloff… - Proceedings of the 26th …, 2014 - dl.acm.org
With the widespread use of shared-nothing clusters of servers, there has been a proliferation
of distributed object stores that offer high availability, reliability and enhanced performance …

Resource bricolage for parallel database systems

J Li, J Naughton, RV Nehme - Proceedings of the VLDB Endowment, 2014 - dl.acm.org
Running parallel database systems in an environment with heterogeneous resources has
become increasingly common, due to cluster evolution and increasing interest in moving …

Automatic contention detection and amelioration for data-intensive operations

J Cieslewicz, KA Ross, K Satsumi, Y Ye - Proceedings of the 2010 ACM …, 2010 - dl.acm.org
To take full advantage of the parallelism offered by a multi-core machine, one must write
parallel code. Writing parallel code is difficult. Even when one writes correct code, there are …

Dependency-aware data locality for MapReduce

X Ma, X Fan, J Liu, D Li - IEEE Transactions on Cloud …, 2015 - ieeexplore.ieee.org
MapReduce effectively partitions and distributes computation workloads to a cluster of
servers, facilitating today's big data processing. Given the massive data to be dispatched …