Congestion control for large-scale RDMA deployments

Y Zhu, H Eran, D Firestone, C Guo… - ACM SIGCOMM …, 2015 - dl.acm.org
Modern datacenter applications demand high throughput (40Gbps) and ultra-low latency (<
10 μs per hop) from the network, with low CPU overhead. Standard TCP/IP stacks cannot …

FaRM: Fast remote memory

A Dragojević, D Narayanan, M Castro… - 11th USENIX Symposium …, 2014 - usenix.org
We describe the design and implementation of FaRM, a new main memory distributed
computing platform that exploits RDMA to improve both latency and throughput by an order …

Process placement in multicore clusters: Algorithmic issues and practical techniques

E Jeannot, G Mercier, F Tessier - IEEE Transactions on Parallel …, 2013 - ieeexplore.ieee.org
Current generations of NUMA node clusters feature multicore or manycore processors.
Programming such architectures efficiently is a challenge because numerous hardware …

An overview of process mapping techniques and algorithms in high-performance computing

T Hoefler, E Jeannot, G Mercier - High Performance Computing on …, 2014 - inria.hal.science
Due to the advent of modern hardware architectures of high-performance computers, the
way the parallel applications are laid out is of paramount importance for performance. This …

An overview of topology mapping algorithms and techniques in high-performance computing

T Hoefler, E Jeannot, G Mercier - High-performance computing …, 2014 - books.google.com
High-performance computing (HPC) applications are becoming increasingly demanding in
terms of computing power. Currently, this computing power can be delivered by parallel …

Quiet neighborhoods: Key to protect job performance predictability

A Jokanovic, JC Sancho, G Rodriguez… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
Interference of nearby jobs has been recently identified as the dominant reason for the high
performance variability of parallel applications running on High Performance Computing …

The MVAPICH project: Evolution and sustainability of an open source production quality MPI library for HPC

DK Panda, K Tomko, K Schulz… - … with Int'l …, 2013 - pfigshare-u-files.s3.amazonaws.com
I. OVERVIEW OF THE MVAPICH PROJECT The MVAPICH (for MPI-1) and MVAPICH2 (for
MPI-2 and MPI-3) open-source libraries [?] have been designed and developed during the …

Measuring Congestion in High-Performance Datacenter Interconnects

S Jha, A Patke, J Brandt, A Gentile, B Lim… - … USENIX Symposium on …, 2020 - usenix.org
While it is widely acknowledged that network congestion in High Performance Computing
(HPC) systems can significantly degrade application performance, there has been little to no …

On mixing queries and transactions via multiversion locking

PM Bober, MJ Carey - 1991 - minds.wisconsin.edu
In this paper, we discuss a new approach to multiversion concurrency control that allows
high-performance transaction systems to support the on-line execution of long-running …

HAN: A hierarchical autotuned collective communication framework

X Luo, W Wu, G Bosilca, Y Pei, Q Cao… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
High-performance computing (HPC) systems keep growing in scale and heterogeneity to
satisfy the increasing computational need, and this brings new challenges to the design of …