The landscape of GPU-centric communication

D Unat, I Turimbetov, MKT Issa, D Sağbili… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, GPUs have become the preferred accelerators for HPC and ML applications
due to their parallelism and fast memory bandwidth. While GPUs boost computation, inter …

Boosting Data Center Performance via Intelligently Managed Multi-backend Disaggregated Memory

J Wang, H Yang, C Li, Y Zhuansun… - … Conference for High …, 2024 - ieeexplore.ieee.org
Existing disaggregated memory (DM) systems face a problem of underutilized far memory
bandwidth, which greatly limits the data throughput when processing data-intensive …

Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory

J Hong, S Cho, G Park, W Yang… - … Symposium on High …, 2024 - ieeexplore.ieee.org
We propose overcoming the memory capacity limitation of GPUs with high-capacity Storage-
Class Memory (SCM) and DRAM cache. By significantly increasing the memory capacity …

GRIT: Enhancing Multi-GPU Performance with Fine-Grained Dynamic Page Placement

Y Wang, B Li, A Jaleel, J Yang… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Multi-GPU systems have become popular to cater to the growing demands for high
parallelism and large memory capacity. However, the delivered performance is constrained …

Excavating the potential of graph workload on RDMA-based far memory architecture

J Wang, C Li, T Wang, L Zhang, P Wang… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
Disaggregated architecture brings new opportunities to memory-consuming applications like
graph processing. It allows one to outspread memory access pressure from local to far …

Fargraph+: Excavating the parallelism of graph processing workload on RDMA-based far memory system

J Wang, C Li, Y Liu, T Wang, J Mei, L Zhang… - Journal of Parallel and …, 2023 - Elsevier
Disaggregated architecture brings new opportunities to memory-consuming applications like
graph processing. It allows one to outspread memory access pressure from local to far …

Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory

T Allen, B Cooper, R Ge - ACM Transactions on Architecture and Code …, 2024 - dl.acm.org
The abstraction of a shared memory space over separate CPU and GPU memory domains
has eased the burden of portability for many HPC codebases. However, users pay for ease …

GPUVM: GPU-driven Unified Virtual Memory

N Nazaraliyev, E Sadredini… - arXiv preprint arXiv …, 2024 - arxiv.org
Graphics Processing Units (GPUs) leverage massive parallelism and large memory
bandwidth to support high-performance computing applications, such as multimedia …

HyFarM: Task Orchestration on Hybrid Far Memory for High Performance Per Bit

J Wang, C Li, J Mei, H He, T Wang… - 2022 IEEE 40th …, 2022 - ieeexplore.ieee.org
Tapping into secondary memory resources, i.e., far memory (FM), has shown huge potential
to improve the cost-efficiency of data centers. Recent advances in both storage-based …

Early-Adaptor: An Adaptive Framework for Proactive UVM Memory Management

S Go, H Lee, J Kim, J Lee, MK Yoon… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Unified Virtual Memory (UVM) relieves programmers of the burden of memory management
between CPU and GPUs. However, the use of UVM can lead to performance degradation …