Low latency geo-distributed data analytics

Q Pu, G Ananthanarayanan, P Bodik… - ACM SIGCOMM …, 2015 - dl.acm.org
Low latency analytics on geographically distributed datasets (across datacenters, edge
clusters) is an upcoming and increasingly important challenge. The dominant approach of …

CLARINET: WAN-Aware Optimization for Analytics Queries

R Viswanathan, G Ananthanarayanan… - 12th USENIX Symposium …, 2016 - usenix.org
Recent work has made the case for geo-distributed analytics, where data collected and
stored at multiple datacenters and edge sites world-wide is analyzed in situ to drive …

WANalytics: Geo-distributed analytics for a data intensive world

A Vulimiri, C Curino, PB Godfrey, T Jungblut… - Proceedings of the …, 2015 - dl.acm.org
Many large organizations collect massive volumes of data each day in a geographically
distributed fashion, at data centers around the globe. Despite their geographically diverse …

Global analytics in the face of bandwidth and regulatory constraints

A Vulimiri, C Curino, PB Godfrey, T Jungblut… - … USENIX Symposium on …, 2015 - usenix.org
Global-scale organizations produce large volumes of data across geographically distributed
data centers. Querying and analyzing such data as a whole introduces new research issues …

Wide-area analytics with multiple resources

CC Hung, G Ananthanarayanan, L Golubchik… - Proceedings of the …, 2018 - dl.acm.org
Running data-parallel jobs across geo-distributed sites has emerged as a promising
direction due to the growing need for geo-distributed cluster deployment. A key difference …

Volley: Automated data placement for geo-distributed cloud services

S Agarwal, J Dunagan, N Jain, S Saroiu, A Wolman… - NSDI, 2010 - usenix.org
As cloud services grow to span more and more globally distributed datacenters, there is an
increasingly urgent need for automated mechanisms to place application data across these …

Flutter: Scheduling tasks closer to data across geo-distributed datacenters

Z Hu, B Li, J Luo - IEEE INFOCOM 2016-The 35th Annual IEEE …, 2016 - ieeexplore.ieee.org
Typically called big data processing, processing large volumes of data from geographically
distributed regions with machine learning algorithms has emerged as an important …

Network-aware locality scheduling for distributed data operators in data centers

L Cheng, Y Wang, Q Liu, DHJ Epema… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Large data centers are currently the mainstream infrastructures for big data processing. As
one of the most fundamental tasks in these environments, the efficient execution of …

Traffic-aware geo-distributed big data analytics with predictable job completion time

P Li, S Guo, T Miyazaki, X Liao, H Jin… - … on Parallel and …, 2016 - ieeexplore.ieee.org
Big data analytics has attracted close attention from both industry and academia because of
its great benefits in cost reduction and better decision making. With the fast growth of various …

Time- and cost-efficient task scheduling across geo-distributed data centers

Z Hu, B Li, J Luo - IEEE Transactions on Parallel and …, 2017 - ieeexplore.ieee.org
Typically called big data processing, analyzing large volumes of data from geographically
distributed regions with machine learning algorithms has emerged as an important …