MixApart: Decoupled Analytics for Shared Storage Systems

KR Krish, A Anwar, AR Butt - 2014 14th IEEE/ACM …, 2014 - ieeexplore.ieee.org

Hadoop has become the de-facto large-scale data processing framework for modern
analytics applications. A major obstacle for sustaining high performance and scalability in …

Cited by 97 · Related articles · All 11 versions

[PDF] cut.ac.cy

OctopusFS: A distributed file system with tiered storage management

E Kakoulli, H Herodotou - Proceedings of the 2017 acm international …, 2017 - dl.acm.org

The ever-growing data storage and I/O demands of modern large-scale data analytics are
challenging the current distributed storage systems. A promising trend is to exploit the recent …

Cited by 69 · Related articles · All 7 versions

[PDF] arxiv.org

Automating distributed tiered storage management in cluster computing

H Herodotou, E Kakoulli - arXiv preprint arXiv:1907.02394, 2019 - arxiv.org

Data-intensive platforms such as Hadoop and Spark are routinely used to process massive
amounts of data residing on distributed file systems like HDFS. Increasing memory sizes and …

Cited by 32 · Related articles · All 13 versions

[PDF] psu.edu

Cast: Tiering storage for data analytics in the cloud

Y Cheng, MS Iqbal, A Gupta, AR Butt - Proceedings of the 24th …, 2015 - dl.acm.org

Enterprises are increasingly moving their big data analytics to the cloud with the goal of
reducing costs without sacrificing application performance. Cloud service providers offer …

Cited by 59 · Related articles · All 11 versions

High-performance design of YARN MapReduce on modern HPC clusters with Lustre and RDMA

M Wasi-ur-Rahman, X Lu, NS Islam… - 2015 IEEE …, 2015 - ieeexplore.ieee.org

The viability and benefits of running MapReduce over modern High Performance Computing
(HPC) clusters, with high performance interconnects and parallel file systems, have attracted …

Cited by 60 · Related articles · All 4 versions

On efficient hierarchical storage for big data processing

KR Krish, B Wadhwa, MS Iqbal… - 2016 16th IEEE/ACM …, 2016 - ieeexplore.ieee.org

A promising trend in storage management for big data frameworks, such as Hadoop and
Spark, is the emergence of heterogeneous and hybrid storage systems that employ different …

Cited by 36 · Related articles · All 4 versions

A comprehensive study of MapReduce over Lustre for intermediate data placement and shuffle strategies on HPC clusters

MD Wasi-ur-Rahman, NS Islam, X Lu… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org

With high performance interconnects and parallel file systems, running MapReduce over
modern High Performance Computing (HPC) clusters has attracted much attention due to its …

Cited by 30 · Related articles · All 3 versions

[PDF] ic.ac.uk

SwiftAnalytics: Optimizing object storage for big data analytics

L Rupprecht, R Zhang, B Owen… - 2017 IEEE …, 2017 - ieeexplore.ieee.org

Due to their scalability and low cost, object-based storage systems are an attractive storage
solution and widely deployed. To gain valuable insight from the data residing in object …

Cited by 24 · Related articles · All 5 versions

[PDF] researchgate.net

Too big to eat: Boosting analytics data ingestion from object stores with scoop

Y Moatti, E Rom, R Gracia-Tinedo… - 2017 IEEE 33rd …, 2017 - ieeexplore.ieee.org

Extracting value from data stored in object stores, such as OpenStack Swift and Amazon S3,
can be problematic in common scenarios where analytics frameworks and object stores run …

Cited by 24 · Related articles · All 6 versions

[PDF] psu.edu

φSched: A heterogeneity-aware Hadoop workflow scheduler

KR Krish, A Anwar, AR Butt - 2014 IEEE 22nd International …, 2014 - ieeexplore.ieee.org

Enterprise Hadoop applications now routinely comprise complex workflows that are
managed by specialized workflow schedulers such as Oozie. The resources are assumed to …

Cited by 38 · Related articles · All 9 versions
