Machine learning sparse computation mechanism for arbitrary neural networks, arithmetic compute microarchitecture, and sparsity for training mechanism

E Nurvitadhi, A Bleiweiss, D Marr, E Wang… - US Patent …, 2023 - Google Patents
2018-02-12: Assigned to INTEL CORPORATION. Assignment of assignors interest
(see document for details). Assignors …

Systems and methods for exchange of data in distributed training of machine learning algorithms

A Matveev, N Shavit - US Patent 11,715,287, 2023 - Google Patents
Systems and methods may make exchanging data in a neural network (NN) during
training more efficient. Exchanging weights among a number of processors training a NN …

Barriers and synchronization for machine learning at autonomous machines

AR Appu, A Koker, J Ray, B Vembu, JC Weast… - US Patent …, 2022 - Google Patents
One or more examples include an apparatus having a hardware barrier logic to detect
thread groups relating to machine learning operations and facilitate barrier synchronization …

Acceleration techniques for graph analysis programs

BR Bebee, BB Thompson, TJ Lewis… - US Patent 10,409,560, 2019 - Google Patents
Source code of a graph analysis program expressed in a platform-independent language
which supports linear algebra primitives is obtained. An executable version of the program is …

Systems and methods for improved neural network execution

A Matveev, N Shavit - US Patent 11,449,363, 2022 - Google Patents
A method and system for computing one or more outputs of a neural network having a
plurality of layers is provided. The method and system can include determining a plurality of …

System and method for speeding up general matrix-matrix multiplication on the GPU

R Zhou - US Patent 10,073,815, 2018 - Google Patents
A method and system for performing general matrix-matrix multiplication (GEMM) operations
on a graphics processor unit (GPU) using Smart kernels. During operation, the system may …

System and method for executing convolution in a neural network

J Kopinsky - US Patent 11,544,559, 2023 - Google Patents
A system and method of executing a convolution layer of a neural network
may include: (a) selecting an output spatial position (OSP) of an output matrix data element …

Fusing sparse kernels to approximate a full kernel of a convolutional neural network

R Chen, Q Fan, M Pistoia, T Suzumura - US Patent 10,740,659, 2020 - Google Patents
Techniques facilitating generation of a fused kernel that can approximate a full kernel of a
convolutional neural network are provided. In one example, a computer-implemented …

Slab based memory management for machine learning training

JH Lee, YS Ki - US Patent 11,461,869, 2022 - Google Patents
A system and method for machine learning, with a processing circuit executing an operating
system, the processing circuit being connected to a first memory and to a second memory. In …

System and method for GPU maximum register count optimization applied to general matrix-matrix multiplication

R Zhou - US Patent 10,067,910, 2018 - Google Patents
A method and system performing a general matrix-matrix multiplication (GEMM) operation
using a kernel compiled with optimal maximum register count (MRC). During operation, the …