Partitioning sparse deep neural networks for scalable training and inference

X Deng, J Li, C Ma, K Wei, L Shi… - IEEE Journal on …, 2022 - ieeexplore.ieee.org

Federated Learning (FL) empowers Industrial Internet of Things (IIoT) with distributed
intelligence of industrial automation thanks to its capability of distributed machine learning …

Cited by 34 · Related articles · All 3 versions

[PDF] arxiv.org

Scalable graph convolutional network training on distributed-memory systems

GV Demirci, A Haldar, H Ferhatosmanoglu - arXiv preprint arXiv …, 2022 - arxiv.org

Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
The large data sizes of graphs and their vertex features make scalable training algorithms …

Cited by 14 · Related articles · All 10 versions

Dynamic layer-wise sparsification for distributed deep learning

H Zhang, T Wu, Z Ma, F Li, J Liu - Future Generation Computer Systems, 2023 - Elsevier

Distributed stochastic gradient descent (SGD) algorithms are becoming popular in speeding
up deep learning model training by employing multiple computational devices (named …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks

R Guliyev, A Haldar, H Ferhatosmanoglu - arXiv preprint arXiv:2409.09079, 2024 - arxiv.org

Graph Neural Network (GNN) models on streaming graphs entail algorithmic challenges to
continuously capture its dynamic state, as well as systems challenges to optimize latency …

Cited by 1 · Related articles · All 4 versions

[PDF] mdpi.com

A lightweight self-supervised representation learning algorithm for scene classification in spaceborne SAR and optical images

X Xiao, C Li, Y Lei - Remote Sensing, 2022 - mdpi.com

Despite the increasing amount of spaceborne synthetic aperture radar (SAR) images and
optical images, only a few annotated data can be used directly for scene classification tasks …

Cited by 6 · Related articles · All 7 versions

[PDF] arxiv.org

Self-Compressing Neural Networks

S Cséfalvay, J Imber - arXiv preprint arXiv:2301.13142, 2023 - arxiv.org

This work focuses on reducing neural network size, which is a major driver of neural network
execution time, power consumption, bandwidth, and memory footprint. A key challenge is to …

Cited by 2 · Related articles · All 3 versions

[PDF] mdpi.com

Mapping and optimization method of SpMV on Multi-DSP accelerator

S Liu, Y Cao, S Sun - Electronics, 2022 - mdpi.com

Sparse matrix-vector multiplication (SpMV) solves the product of a sparse matrix and dense
vector, and the sparseness of a sparse matrix is often more than 90%. Usually, the sparse …

Cited by 5 · Related articles · All 3 versions

[PDF] arxiv.org

SpComm3D: A Framework for Enabling Sparse Communication in 3D Sparse Kernels

N Abubaker, T Hoefler - arXiv preprint arXiv:2404.19638, 2024 - arxiv.org

Existing 3D algorithms for distributed-memory sparse kernels suffer from limited scalability
due to reliance on bulk sparsity-agnostic communication. While easier to use, sparsity …

FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication

J Oakley, H Ferhatosmanoglu - arXiv preprint arXiv:2403.15195, 2024 - arxiv.org

Serverless computing offers attractive scalability, elasticity and cost-effectiveness. However,
constraints on memory, CPU and function runtime have hindered its adoption for data …

Cited by 3 · Related articles · All 4 versions

[PDF] mdpi.com

Leveraging Memory Copy Overlap for Efficient Sparse Matrix-Vector Multiplication on GPUs

G Zeng, Y Zou - Electronics, 2023 - mdpi.com

Sparse matrix-vector multiplication (SpMV) is central to many scientific, engineering, and
other applications, including machine learning. Compressed Sparse Row (CSR) is a widely …
