Advancing DSP into HPC, AI, and beyond: challenges, mechanisms, and future directions

Y Wang, C Li, C Liu, S Liu, Y Lei, J Zhang… - CCF Transactions on …, 2021 - Springer
Digital Signal Processors (DSPs) have been widely used in embedded domains,
delivering high performance with ultra-low power consumption. Such promises make it …

WiBench: An open source kernel suite for benchmarking wireless systems

Q Zheng, Y Chen, R Dreslinski… - 2013 IEEE …, 2013 - ieeexplore.ieee.org
The rapid growth in the number of mobile devices and the higher data rate requirements of
mobile subscribers have made wireless signal processing a key driving application of …

BAX: A bundle adjustment accelerator with decoupled access/execute architecture for visual odometry

R Sun, P Liu, J Xue, S Yang, J Qian, R Ying - IEEE Access, 2020 - ieeexplore.ieee.org
As the demand for embedded-vision grows, solving large optimization problems in real-time
with energy and cost budget is a challenge. We present BAX, a hardware accelerator of …

A low-energy wide SIMD architecture with explicit datapath

L Waeijen, D She, H Corporaal, Y He - Journal of Signal Processing …, 2015 - Springer
Energy efficiency has become one of the most important topics in computing. To meet the
ever increasing demands of the mobile market, the next generation of processors will have …

LUAEMA: A Loop Unrolling Approach Extending Memory Accessing for Vector Very-Long-Instruction-Word Digital Signal Processor with Multiple Register Files

Y Hu, A Cheng, Z Tang, P Liu, W Liang - Electronics, 2024 - mdpi.com
Loop unrolling can provide more instruction-level parallelism opportunities for code and
enables a greater range of instruction pipeline scheduling. In high-performance very-long …

From Xetal-II to Xetal-Pro: On the road toward an ultralow-energy and high-throughput SIMD processor

Y Pu, Y He, Z Ye, SM Londono, AA Abbo… - … on Circuits and …, 2011 - ieeexplore.ieee.org
Looking forward to the next generation of mobile streaming computing, the demanded
energy efficiency of end-user terminals will become ever stringent. The Xetal-Pro processor …

A multiple SIMD, multiple data (MSMD) architecture: Parallel execution of dynamic and static SIMD fragments

Y Wang, S Chen, J Wan, J Meng… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org
The efficacy of widely used single instruction, multiple data architectures is often limited
when handling divergent control flows and short vectors; both circumstances result in SIMD …

LordCore: Energy-efficient OpenCL-programmable software-defined radio coprocessor

H Kultala, T Viitanen, H Berg… - … Transactions on Very …, 2019 - ieeexplore.ieee.org
This paper proposes a single instruction multiple data (SIMD) processor, which is
programmed with high-level OpenCL language. The low-power processor is customized for …

SIMD made explicit

L Waeijen, D She, H Corporaal… - … Conference on Embedded …, 2013 - ieeexplore.ieee.org
Low energy consumption has become one of the most important topics in computing. With
single CPUs consuming as much as 115 Watt, engineers have been looking for ways to …

MT-DMA: A DMA controller supporting efficient matrix transposition for digital signal processing

S Ma, Y Lei, L Huang, Z Wang - IEEE Access, 2018 - ieeexplore.ieee.org
Matrix transposition plays a critical role in digital signal processing. However, the existing
matrix transposition implementations have significant limitations. A traditional design uses …