Towards an efficient use of the BLAS library for multilinear tensor contractions

Tensor networks for dimensionality reduction and large-scale optimization: Part 1 low-rank tensor decompositions

A Cichocki, N Lee, I Oseledets, AH Phan… - … and Trends® in …, 2016 - nowpublishers.com

Modern applications in engineering and data science are increasingly based on
multidimensional data of exceedingly high volume, variety, and structural richness …

Cited by 573 · Related articles · All 6 versions

[PDF] arxiv.org

Low-rank tensor networks for dimensionality reduction and large-scale optimization problems: Perspectives and challenges part 1

A Cichocki, N Lee, IV Oseledets, AH Phan… - arXiv preprint arXiv …, 2016 - arxiv.org

Machine learning and data mining algorithms are becoming increasingly important in
analyzing large volume, multi-relational and multi-modal datasets, which are often …

Cited by 80 · Related articles · All 3 versions

[PDF] nsf.gov

Many-body quantum chemistry on massively parallel computers

JA Calvin, C Peng, V Rishi, A Kumar… - Chemical …, 2020 - ACS Publications

The deployment of many-body quantum chemistry methods onto massively parallel
high-performance computing (HPC) platforms is reviewed. The particular focus is on highly …

Cited by 38 · Related articles · All 6 versions

[PDF] semanticscholar.org

Design of a high-performance GEMM-like tensor–tensor multiplication

P Springer, P Bientinesi - ACM Transactions on Mathematical Software …, 2018 - dl.acm.org

We present “GEMM-like Tensor–Tensor multiplication” (GETT), a novel approach for dense
tensor contractions that mirrors the design of a high-performance general matrix–matrix …

Cited by 118 · Related articles · All 7 versions

[PDF] arxiv.org

High-performance tensor contraction without transposition

DA Matthews - SIAM Journal on Scientific Computing, 2018 - SIAM

Tensor computations, in particular tensor contraction (TC), are important kernels in many
scientific computing applications. Due to the fundamental similarity of TC to matrix …

Cited by 115 · Related articles · All 7 versions

[PDF] arxiv.org

Tensor contractions with extended BLAS kernels on CPU and GPU

Y Shi, UN Niranjan, A Anandkumar… - 2016 IEEE 23rd …, 2016 - ieeexplore.ieee.org

Tensor contractions constitute a key computational ingredient of numerical multi-linear
algebra. However, as the order and dimension of tensors grow, the time and space …

Cited by 93 · Related articles · All 17 versions

[PDF] acm.org

High-performance generalized tensor operations: A compiler-oriented approach

R Gareev, T Grosser, M Kruse - ACM Transactions on Architecture and …, 2018 - dl.acm.org

The efficiency of tensor contraction is of great importance. Compilers cannot optimize it well
enough to come close to the performance of expert-tuned implementations. All existing …

Cited by 44 · Related articles · All 8 versions

[PDF] sciencedirect.com

Optimizing sparse tensor times matrix on GPUs

Y Ma, J Li, X Wu, C Yan, J Sun, R Vuduc - Journal of Parallel and …, 2019 - Elsevier

This work optimizes tensor-times-dense matrix multiply (Ttm) for general sparse and
semi-sparse tensors on CPU and NVIDIA GPU platforms. Ttm is a computational kernel in tensor …

Cited by 39 · Related articles · All 5 versions

[PDF] github.io

Optimizing sparse tensor times matrix on multi-core and many-core architectures

J Li, Y Ma, C Yan, R Vuduc - 2016 6th Workshop on Irregular …, 2016 - ieeexplore.ieee.org

This paper presents the optimized design and implementation of sparse tensor-times-dense
matrix multiply (SpTTM) for CPU and GPU platforms. This primitive is a critical bottleneck in …

Cited by 48 · Related articles · All 4 versions

[PDF] arxiv.org

Efficient molecular quantum dynamics in coordinate and phase space using pruned bases

HR Larsson, B Hartke, DJ Tannor - The Journal of Chemical Physics, 2016 - pubs.aip.org

We present an efficient implementation of dynamically pruned quantum dynamics, both in
coordinate space and in phase space. We combine the ideas behind the biorthogonal von …

Cited by 44 · Related articles · All 14 versions
