Authors
Maha Dessokey, Sherif M Saif, Sameh Salem, Elsayed Saad, Hesham Eldeeb
Publication date
2020/10/19
Conference paper
International Conference on Advanced Intelligent Systems and Informatics
Pages
394-403
Publisher
Springer, Cham
Description
In the era of Big Data, processing large amounts of data with data-intensive applications presents a challenge. Apache Spark, an in-memory distributed computing system, is often used to speed up big data applications. It caches intermediate data in memory, so the computation does not need to be repeated and the data does not need to be reloaded from disk when it is reused later. This in-memory caching mechanism makes Apache Spark much faster than other systems. When the memory used for caching data is full, Apache Spark applies the Least Recently Used (LRU) cache replacement policy; however, the LRU algorithm performs poorly on some workloads. This review gives an insight into the different replacement algorithms proposed to address the LRU problems, categorizes the different selection factors, and provides a comparison between the algorithms in terms of selection factors, performance and …
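To make the caching mechanism described above concrete, the following Scala sketch (not taken from the paper; the file path and session setup are assumed for illustration) persists an intermediate RDD in memory so that a second action reuses the cached partitions instead of recomputing them. With the MEMORY_ONLY storage level, Spark evicts cached partitions using the LRU policy once the storage memory fills up, which is exactly the eviction behaviour the reviewed algorithms aim to improve.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    object CachingSketch {
      def main(args: Array[String]): Unit = {
        // Hypothetical local session; the paper does not prescribe any particular setup.
        val spark = SparkSession.builder()
          .appName("lru-caching-sketch")
          .master("local[*]")
          .getOrCreate()
        val sc = spark.sparkContext

        // An intermediate RDD that would otherwise be recomputed by every action.
        val intermediate = sc.textFile("data/input.txt")   // assumed example path
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        // Cache in memory; when storage memory is full, Spark evicts
        // cached partitions with its default LRU replacement policy.
        intermediate.persist(StorageLevel.MEMORY_ONLY)

        // The first action materialises and caches the RDD ...
        println(intermediate.count())
        // ... the second action reads the cached partitions instead of recomputing them.
        println(intermediate.filter(_._2 > 1).count())

        intermediate.unpersist()
        spark.stop()
      }
    }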