Accelerating Deep Learning Workloads with Advanced Matrix Extensions

W Huang, D Wang, S Zhou, C Li… - 2023 5th International …, 2023 - ieeexplore.ieee.org
This paper introduces Advanced Matrix Extensions, a new x86 extension
designed to accelerate deep learning performance. It discusses the evolution of hardware …

A Case Study on the Generative AI Project Life Cycle Using Large Language Models

A Bandi, H Kagitha - Proceedings of 39th International Confer, 2024 - easychair.org
Large Language Models represent a disruptive technology set to revolutionize the
future of artificial intelligence. While numerous literature reviews and survey articles discuss …

A Low-Latency, Area-Efficient Convolution Network for FPGA Acceleration

G Naga Swetha, AM Sandi - International Journal of Computing …, 2024 - journal.uob.edu.bh
The goal of the technology known as Field Programmable Gate Arrays (FPGA) is to improve
the safety, performance, and efficiency of cryptographic operations in contexts with limited …

Compiler-centric across-stack deep learning acceleration

P Gibson - 2023 - theses.gla.ac.uk
Optimizing the deployment of Deep Neural Networks (DNNs) is hard. Despite deep learning
approaches increasingly providing state-of-the-art solutions to a variety of difficult problems …

Energy-Efficient Security Solutions for Next-Generation Embedded Systems

S Maji - 2023 - researchgate.net
The proliferation of embedded systems and the Internet of Things (IoT) has opened up new
possibilities for various applications. However, with these advancements come heightened …

An Efficient Hardware Accelerator for Class Incremental Deep Neural Networks

E Ressa - 2023 - webthesis.biblio.polito.it
Machine learning (ML) and neural networks (NN) have great potential in many fields, including
the classification of images, sounds, signals, etc. Some tasks of …

Self-adapting reconfigurable multiply-accumulator for real-time image processing in embedded systems

A Fasolino, P Vitolo, R Liguori… - … -time Processing of …, 2024 - spiedigitallibrary.org
Multiply-Accumulate (MAC) operation is widely used in various real-time image processing
tasks, ranging from Convolutional Neural Networks to digital filtering, significantly impacting …

Neuromorphic Photonics and hybrid Intelligence

L Pavesi - 11th International Symposium on Optics and its …, 2023 - photonicsai.com
The interest in Artificial Neural Networks (ANNs) has considerably increased in recent years
due to their versatility, which allows for dealing with a huge class of problems [1]. Nowadays …

FPGA-kiihdyttimien käyttö koneoppimisessa [The Use of FPGA Accelerators in Machine Learning]

V Salmi - 2023 - trepo.tuni.fi
Machine learning is now widely used in many different application areas.
Neural network models are especially popular, and their accuracy and size grow year by year …

Incorporating Prior Knowledge to Efficiently Design Deep Learning Accelerators

D Chiou, M Erez, A Parashar - cs.utexas.edu
The path to a PhD is mired with taxing twists and turns, and, more than the research, it was
the people around me that kept me going. I owe deep thanks to many, but in this brief …