E Kakoulli, H Herodotou - Proceedings of the 2017 acm international …, 2017 - dl.acm.org
The ever-growing data storage and I/O demands of modern large-scale data analytics are challenging the current distributed storage systems. A promising trend is to exploit the recent …
Data-intensive platforms such as Hadoop and Spark are routinely used to process massive amounts of data residing on distributed file systems like HDFS. Increasing memory sizes and …
Enterprises are increasingly moving their big data analytics to the cloud with the goal of reducing costs without sacrificing application performance. Cloud service providers offer …
The viability and benefits of running MapReduce over modern High Performance Computing (HPC) clusters, with high performance interconnects and parallel file systems, have attracted …
A promising trend in storage management for big data frameworks, such as Hadoop and Spark, is the emergence of heterogeneous and hybrid storage systems that employ different …
With high performance interconnects and parallel file systems, running MapReduce over modern High Performance Computing (HPC) clusters has attracted much attention due to its …
Due to their scalability and low cost, object-based storage systems are an attractive storage solution and widely deployed. To gain valuable insight from the data residing in object …
Extracting value from data stored in object stores, such as OpenStack Swift and Amazon S3, can be problematicin common scenarios where analytics frameworks and objectstores run …
Enterprise Hadoop applications now routinely comprise complex workflows that are managed by specialized workflow schedulers such as Oozie. The resources are assumed to …