Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems

F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org
Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …

Overview of the Blue Gene/L system architecture

A Gara, MA Blumrich, D Chen, GLT Chiu… - IBM Journal of …, 2005 - ieeexplore.ieee.org
The Blue Gene®/L computer is a massively parallel supercomputer based on IBM system-on-
a-chip technology. It is designed to scale to 65,536 dual-processor nodes, with a peak …

A case for random shortcut topologies for HPC interconnects

M Koibuchi, H Matsutani, H Amano, DF Hsu… - ACM SIGARCH Computer …, 2012 - dl.acm.org
As the scales of parallel applications and platforms increase, the negative impact of
communication latencies on performance becomes large. Fortunately, modern High …

Technology trends in large-scale high-efficiency network computing

J Su, B Zhao, Y Dai, J Cao, Z Wei, N Zhao… - Frontiers of Information …, 2022 - Springer
Network technology is the basis for large-scale high-efficiency network computing, such as
supercomputing, cloud computing, big data processing, and artificial intelligence computing …

Exploiting fine-grained data parallelism with chip multiprocessors and fast barriers

J Sampson, R Gonzalez, JF Collard… - 2006 39th Annual …, 2006 - ieeexplore.ieee.org
We examine the ability of CMPs, due to their lower on-chip communication latencies, to
exploit data parallelism at inner-loop granularities similar to that commonly targeted by …

Twisted torus topologies for enhanced interconnection networks

JM Camara, M Moreto, E Vallejo… - … on Parallel and …, 2010 - ieeexplore.ieee.org
Many current parallel computers are built around a torus interconnection network. Machines
from Cray, HP, and IBM, among others, make use of this topology. In terms of topological …

Reconfigurable computing cluster (RCC) project: Investigating the feasibility of FPGA-based petascale computing

R Sass, WV Kritikos, AG Schmidt… - 15th Annual IEEE …, 2007 - ieeexplore.ieee.org
While medium- and large-sized computing centers have increasingly relied on clusters of
commodity PC hardware to provide cost-effective capacity and capability, it is not clear that …

Using the TOP500 to trace and project technology and architecture trends

PM Kogge, TJ Dysart - Proceedings of 2011 International Conference for …, 2011 - dl.acm.org
The TOP500 is a treasure trove of information on the leading edge of high performance
computing. It was used in the 2008 DARPA Exascale technology report to isolate out the …

Layout-conscious random topologies for HPC off-chip interconnects

M Koibuchi, I Fujiwara, H Matsutani… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org
As the scales of parallel applications and platforms increase, the negative impact of
communication latencies on performance becomes large. Random network topologies can …

Low-overhead, high-speed multi-core barrier synchronization

J Sartori, R Kumar - … on High-Performance Embedded Architectures and …, 2010 - Springer
Whereas efficient barrier implementations were once a concern only in high-performance
computing, recent trends in core integration make the topic relevant even for general …