[HTML] Optimizing matrix-matrix multiplication on Intel's advanced vector extensions multicore processor

AM Hemeida, SA Hassan, S Alkhalaf… - Ain Shams Engineering …, 2020 - Elsevier
This paper is focused on Intel Advanced Vector Extension (AVX) which has been borne of
the modern developments in AMD processors and Intel itself. Said prescript processes a …

Effective implementation of matrix–vector multiplication on Intel's AVX multicore processor

SA Hassan, MMM Mahmoud, AM Hemeida… - … Languages, Systems & …, 2018 - Elsevier
Matrix–vector multiplication kernel is one of the most important and common computational
operations which form the core of varied important application areas such as scientific and …

The performance impact analysis of loop unrolling

G Velkoski, M Gusev, S Ristov - 2014 37th International …, 2014 - ieeexplore.ieee.org
Loop unrolling is a well known technique, which usually results with speedup of a program
that contains loops. The effect is obtained by reducing the operations that require counter …

Scheduling strategies for mixed workloads in multimedia information servers

G Nerjes, P Muth, M Paterakis… - … on Research Issues …, 1998 - ieeexplore.ieee.org
In contrast to pure video servers, advanced applications such as digital libraries or
teleteaching exhibit a mixed workload with massive access to conventional, "discrete" data …

Optimal block size for matrix multiplication using blocking

S Ristov, M Gusev, G Velkoski - 2014 37th International …, 2014 - ieeexplore.ieee.org
Matrix multiplication is a widely used algorithm in today's computing. Speeding up the
multiplication of huge matrices is imperative for scientists and they are trying to discover the …

Cache friendly strategies to optimize matrix multiplication

M Ananth, S Vishwas, MR Anala - 2017 IEEE 7th International …, 2017 - ieeexplore.ieee.org
Matrix multiplication is an operation used in many algorithms with a plethora of applications
ranging from Image Processing, Signal Processing, to Artificial Neural Networks and Linear …

[PDF] A novel approach for efficient training of deep neural networks

DTVD Rao, KV Ramana - Indonesian Journal of Electrical …, 2018 - researchgate.net
Deep Neural Network training algorithms consume long training time, especially when the
number of hidden layers and nodes is large. Matrix multiplication is the key operation carried …

Parallel matrix multiplication

N Tomikj, M Gusev - 2018 41st International Convention on …, 2018 - ieeexplore.ieee.org
Utilizing all CPU cores available for numerical computations is a topic of considerable
interest in HPC. This paper analyzes and compares four different parallel algorithms for …

Optimum Prefetching Patterns Searching: A Case Study of Matrix-Matrix Multiplication

V Khomongkonudom, P Chaikarn - 2022 37th International …, 2022 - ieeexplore.ieee.org
Prefetching reduces data fetch latency and augments the speed of program execution. This
paper presents an analysis model for selecting the optimum prefetching pattern for matrix …

[PDF] Técnicas de otimização loop unrolling e loop tiling em multiplicações de matrizes utilizando openmp

SA da Silva, M da Silva Serpa… - Workshop de Iniciação …, 2016 - wscad.sbc.org.br
Numerical methods are used in many research areas, for example in applications that
simulate natural phenomena. Among the frequently used operations are …