Systems and methods for vectorized FFT for multi-dimensional convolution operations

M Nekuii - US Patent 10,796,220, 2020 - Google Patents
A new approach is proposed to support efficient convolution for deep learning by vectorizing
multi-dimensional input data for multi-dimensional fast Fourier transform (FFT) and direct …

Systems and methods for deep learning processor

R Goyal, K Bullis, SL Billa, A Dikshit - US Patent 11,055,063, 2021 - Google Patents
A hardware-based programmable deep learning processor (DLP) is proposed, wherein the
DLP comprises a plurality of accelerators dedicated for deep learning processing …

Master transform architecture for deep learning

S Dwivedi, NA Haemel - US Patent App. 16/719,883, 2021 - Google Patents
Apparatuses, systems, and techniques to transform input data for training neural networks. In
at least one embodiment, one or more data transforms are identified in a sequence of data …

Schedule-aware tensor distribution module

G Chinya, H Liu, A Raha, D Mohapatra, C Brick… - US Patent …, 2024 - Google Patents
Methods and systems include a neural network system that includes a neural network
accelerator. The neural network accelerator includes multiple processing engines coupled …

Hardware Implementation of a Deep Neural Network with Variable Output Data Format

C Martin, D Hough, P Brasnett, C Dikici… - US Patent App. 16 …, 2019 - Google Patents
Hardware implementations of DNNs and related methods with a variable output data format.
Specifically, in the hardware implementations and methods described herein the hardware …

Memory efficient scalable deep learning with model parallelization

R Min, H Wang, A Kadav - US Patent 10,474,951, 2019 - Google Patents
Methods and systems for training a neural network include sampling multiple local
sub-networks from a global neural network. The local sub-networks include a subset of neurons …

Deep Learning Apparatus and Method for Predictive Analysis, Classification, and Feature Detection

TP Kenney, NJ Kenney - US Patent App. 15/924,845, 2019 - Google Patents
A deep learning computer apparatus and corresponding methods render multiple disparate
data type items, along with corresponding feature data, into a single encapsulated file format …

Systems and methods for neural network convolutional layer matrix multiplication using cache memory

R Gelashvili - US Patent App. 17/271,326, 2021 - Google Patents
A computer processor may include a number of cores, a shared cache shared among the
cores, and a local cache associated with each core and used by that core only. Input data for …

Coordinated heterogeneous processing of training data for deep neural networks

F Chang, D Liu, T Woo - US Patent 11,275,991, 2022 - Google Patents
Systems and methods for training neural networks. One embodiment is a system
that includes a memory configured to store samples of training data for a Deep Neural …

Scaled compute fabric for accelerated deep learning

GR Lauterbach, S Lie, M Morrison, ME James… - US Patent …, 2022 - Google Patents
Techniques in advanced deep learning provide improvements in one or more of accuracy,
performance, energy efficiency, and cost. In a first embodiment, a scaled array of processing …