Moving big data to the cloud: An online cost-minimizing approach

L Zhang, C Wu, Z Li, C Guo, M Chen… - IEEE Journal on …, 2013 - ieeexplore.ieee.org
Cloud computing, rapidly emerging as a new computation paradigm, provides agile and
scalable resource access in a utility-like fashion, especially for the processing of big data. An …

A survey on geographically distributed big-data processing using MapReduce

S Dolev, P Florissi, E Gudes… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Hadoop and Spark are widely used distributed processing frameworks for large-scale data
processing in an efficient and fault-tolerant manner on private or public clouds. These big …

A survey on bandwidth-aware geo-distributed frameworks for big-data analytics

M Bergui, S Najah, NS Nikolov - Journal of Big Data, 2021 - Springer
In the era of global-scale services, organisations produce huge volumes of data, often
distributed across multiple data centres, separated by vast geographical distances. While …

Conducting repeatable experiments in highly variable cloud computing environments

A Abedi, T Brecht - Proceedings of the 8th ACM/SPEC on International …, 2017 - dl.acm.org
Previous work has shown that benchmark and application performance in public cloud
computing environments can be highly variable. Utilizing Amazon EC2 traces that include …

Network cost-aware geo-distributed data analytics system

K Oh, M Zhang, A Chandra… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Many geo-distributed data analytics (GDA) systems have focused on the network
performance-bottleneck: inter-data center network bandwidth to improve performance …

End-to-end optimization for geo-distributed MapReduce

B Heintz, A Chandra, RK Sitaraman… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org
MapReduce has proven remarkably effective for a wide variety of data-intensive
applications, but it was designed to run on large single-site homogeneous clusters …

Cost-aware big data processing across geo-distributed datacenters

W Xiao, W Bao, X Zhu, L Liu - IEEE Transactions on Parallel …, 2017 - ieeexplore.ieee.org
With the globalization of service, organizations continuously produce large volumes of data
that need to be analysed over geo-dispersed locations. Traditionally central approach that …

Time and cost sensitive data-intensive computing on hybrid clouds

T Bicer, D Chiu, G Agrawal - 2012 12th IEEE/ACM International …, 2012 - ieeexplore.ieee.org
Purpose-built clusters permeate many of today's organizations, providing both large-scale
data storage and computing. Within local clusters, competition for resources complicates …

GEODIS: towards the optimization of data locality-aware job scheduling in geo-distributed data centers

MW Convolbo, J Chou, CH Hsu, YC Chung - Computing, 2018 - Springer
Today, data-intensive applications rely on geographically distributed systems to leverage
data collection, storing and processing. Data locality has been seen as a prominent …

Resilin: Elastic MapReduce over multiple clouds

A Iordache, C Morin, N Parlavantzas… - 2013 13th IEEE/ACM …, 2013 - ieeexplore.ieee.org
The MapReduce programming model offers a simple and efficient way of performing
distributed computation over large data sets. To enable the usage of MapReduce in the …