Wormhole routing techniques for directly connected multicomputer systems

P Mohapatra - ACM Computing Surveys (CSUR), 1998 - dl.acm.org
Wormhole routing has emerged as the most widely used switching technique in massively
parallel computers. We present a detailed survey of various techniques for enhancing the …

A scalable, commodity data center network architecture

M Al-Fares, A Loukissas, A Vahdat - ACM SIGCOMM computer …, 2008 - dl.acm.org
Today's data centers may contain tens of thousands of computers with significant aggregate
bandwidth requirements. The network architecture typically consists of a tree of routing and …

[BOOK][B] Principles and practices of interconnection networks

WJ Dally, BP Towles - 2004 - books.google.com
One of the greatest challenges faced by designers of digital systems is optimizing the
communication and interconnection between system components. Interconnection networks …

Cilk: An efficient multithreaded runtime system

RD Blumofe, CF Joerg, BC Kuszmaul… - ACM SigPlan …, 1995 - dl.acm.org
Cilk (pronounced “silk”) is a C-based runtime system for multi-threaded parallel
programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler …

[BOOK][B] Interconnection networks

J Duato, S Yalamanchili, L Ni - 2003 - books.google.com
This book, for the first time, makes the technology of interconnection networks accessible to
the engineering student and the practicing engineer. The authors are three key members of …

[BOOK][B] Parallel computer architecture: a hardware/software approach

D Culler, JP Singh, A Gupta - 1999 - books.google.com
The most exciting development in parallel computer architecture is the convergence of
traditionally disparate approaches on a common machine structure. This book explains the …

[BOOK][B] Advanced computer architecture: parallelism, scalability, programmability

K Hwang, N Jotwani - 1993 - academia.edu
Course Syllabus. Course Title: Advanced Computer Architecture.
Philadelphia University, Faculty of Information Technology, Department of Computer Science …

Accelerating distributed reinforcement learning with in-switch computing

Y Li, IJ Liu, Y Yuan, D Chen, A Schwing… - Proceedings of the 46th …, 2019 - dl.acm.org
Reinforcement learning (RL) has attracted much attention recently, as new and emerging
AI-based applications are demanding the capabilities to intelligently react to environment …

Syncron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Thread scheduling for multiprogrammed multiprocessors

NS Arora, RD Blumofe, CG Plaxton - Proceedings of the tenth annual …, 1998 - dl.acm.org
We present a user-level thread scheduler for shared-memory multiprocessors, and we
analyze its performance under multiprogramming. We model multiprogramming with two …