Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …

[BOOK][B] Residue number systems: theory and implementation

AR Omondi, AB Premkumar - 2007 - books.google.com
Residue number systems (RNSs) and arithmetic are useful for several reasons. First, a great
deal of computing now takes place in embedded processors, such as those found in mobile …

[BOOK][B] Parallel computation: models and methods

SG Akl - 1997 - dl.acm.org

Transformer working memory enables regular language reasoning and natural language length extrapolation

TC Chi, TH Fan, AI Rudnicky, PJ Ramadge - arXiv preprint arXiv …, 2023 - arxiv.org
Unlike recurrent models, conventional wisdom has it that Transformers cannot perfectly
model regular languages. Inspired by the notion of working memory, we propose a new …

[PDF][PDF] The design and analysis of bulk-synchronous parallel algorithms

A Tiskin - 1998 - Citeseer
The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of
general-purpose parallel computing. This thesis presents a systematic approach to the …

[BOOK][B] Parallel system interconnections and communications

MD Grammatikakis, DF Hsu, M Kraetzl - 2018 - taylorfrancis.com
This introduction to networking large scale parallel computer systems acts as a primary
resource for a wide readership, including network systems engineers, electronics engineers …

Constructing zero-deficiency parallel prefix adder of minimum depth

H Zhu, CK Cheng, R Graham - Proceedings of the 2005 Asia and South …, 2005 - dl.acm.org
Parallel prefix adder is a general technique for speeding up binary addition. In unit delay
model, we denote the size and depth of an n-bit prefix adder C(n) as S_C(n) and d_C(n) …

An Improved parallel prefix algorithm on OTIS-Mesh

PK Jana, BP Sinha - Parallel Processing Letters, 2006 - World Scientific
Wang and Sahni [4] reported two parallel algorithms for N-point prefix computation on an
N-processor OTIS-Mesh optoelectronic computer. The overall time complexity for both SIMS …

On the construction of zero-deficiency parallel prefix circuits with minimum depth

H Zhu, CK Cheng, R Graham - ACM Transactions on Design Automation …, 2006 - dl.acm.org
A parallel prefix circuit has n inputs x_1, x_2, …, x_n, and computes the n outputs
y_i = x_i • x_{i−1} • … • x_1, 1 ≤ i ≤ n, in parallel, where • is an arbitrary binary associative operator. Snir proved …

The strict time lower bound and optimal schedules for parallel prefix with resource constraints

H Wang, A Nicolau, KYS Siu - IEEE Transactions on Computers, 1996 - ieeexplore.ieee.org
Prefix computation is a basic operation at the core of many important applications, e.g., some
of the Grand Challenge problems, circuit design, digital signal processing, graph …