New Systolic Array Algorithms and VLSI Architectures for 1-D MDST

DF Chiper, A Cracan - Sensors, 2023 - mdpi.com
In this paper, we present two systolic array algorithms for efficient Very-Large-Scale
Integration (VLSI) implementations of the 1-D Modified Discrete Sine Transform (MDST) …

MESC: Re-thinking Algorithmic Priority and/or Criticality Inversions for Heterogeneous MCSs

J Guan, R Wei, D You, Y Wang, R Yang… - 2024 IEEE Real …, 2024 - ieeexplore.ieee.org
Modern Mixed-Criticality Systems (MCSs) rely on hardware heterogeneity to satisfy
ever-increasing computational demands. However, most of the heterogeneous co-processors are …

Enhanced Accelerator Design for Efficient CNN Processing with Improved Row-Stationary Dataflow

F Lesniak, A Gutermann, T Harbaum… - Proceedings of the Great …, 2024 - dl.acm.org
Efficient on-device inference of convolutional neural networks (CNNs) is becoming one of
the key challenges for embedded systems, leading to the integration of specialized …

A Scalable Hardware Architecture for Efficient Learning of Recurrent Neural Networks at the Edge

Y Zhang, MD Gomony, H Corporaal… - 2024 IFIP/IEEE 32nd …, 2024 - ieeexplore.ieee.org
Edge devices can execute pre-trained Artificial Intelligence (AI) models optimized on large
Graphics Processing Units (GPUs) but often need fine-tuning for real-world data. This …

The Efficiency of Convolution on Gemmini Deep Learning Hardware Accelerator

DAN Gookyi, M Wilson, RK Ahiadormey… - 2023 IEEE …, 2023 - ieeexplore.ieee.org
The successful use of deep learning (DL) algorithms in a variety of applications is
conceptually based on convolutions. Though convolution is a simple operation, it suffers …

CGRA-RISC: Simulation Infrastructure for Coupling CGRA Accelerator to RISC-V Processor

AFR Ribeiro - 2024 - search.proquest.com
The downtrend of Koomey's law, which accounts for the doubling of performance per joule
every 1.5 years, coupled with an emergent demand for high-performance and multi-domain …