Identifying the culprits behind network congestion

A Bhatele, AR Titus, JJ Thiagarajan… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
Network congestion is one of the primary causes of performance degradation, performance
variability and poor scaling in communication-heavy parallel applications. However, the …

Network endpoint congestion control for fine-grained communication

N Jiang, L Dennison, WJ Dally - … of the International Conference for High …, 2015 - dl.acm.org
Endpoint congestion in HPC networks creates tree saturation that is detrimental to
performance. Endpoint congestion can be alleviated by reducing the injection rate of traffic …

Exploiting idle resources in a high-radix switch for supplemental storage

MA Blumrich, N Jiang… - … Conference for High …, 2018 - ieeexplore.ieee.org
A general-purpose switch for a high-performance network is usually designed with
symmetric ports providing credit-based flow control and error recovery via link-level re …

A new proposal to deal with congestion in InfiniBand-based fat-trees

J Escudero-Sahuquillo, PJ Garcia, FJ Quiles… - Journal of Parallel and …, 2014 - Elsevier
The overall performance of High-Performance Computing applications may depend largely
on the performance achieved by the network interconnecting the end-nodes; thus high …

Exploration of congestion control techniques on dragonfly-class HPC networks through simulation

N McGlohon, CD Carothers… - … and Simulation of …, 2021 - ieeexplore.ieee.org
Ensuring optimal communication latency in High Performance Computing (HPC) networks is
of critical importance to the efficient operation of facilitated applications. Different application …

Efficient and cost-effective hybrid congestion control for HPC interconnection networks

J Escudero-Sahuquillo, EG Gran… - IEEE transactions on …, 2014 - ieeexplore.ieee.org
Interconnection networks are key components in high-performance computing (HPC)
systems, their performance having a strong influence on the overall system one. However, at …

System and method for facilitating tracer packets in a data-driven intelligent network

AM Ford, TJ Johnson, AM Bataineh - US Patent 12,034,633, 2024 - Google Patents
A data-driven intelligent networking system that can facilitate tracing of data flow packets is
provided. The system adds tracer packets to data flow packets arriving at an ingress point of …

Towards an efficient combination of adaptive routing and queuing schemes in Fat-Tree topologies

J Rocher-Gonzalez, J Escudero-Sahuquillo… - Journal of Parallel and …, 2021 - Elsevier
The interconnection network is a key element in High-Performance Computing (HPC) and
Datacenter (DC) systems whose performance depends on several design parameters, such …

Dragonfly routing with incomplete group connectivity

EL Froese - US Patent 11,985,060, 2024 - Google Patents
Systems and methods are provided for managing a data communication within a
multi-level network having a plurality of switches organized as groups, with each group …

FlowStar: Fast Convergence Per-Flow State Accurate Congestion Control for InfiniBand

C Luo, H Gu, L Zhu, H Zhang - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
According to the latest TOP500 list, InfiniBand (IB) is the most widely used network
architecture in the top 10 supercomputers. IB relies on Credit-based Flow Control (CBFC) to …