A unified weight initialization paradigm for tensorial convolutional neural networks

Y Pan, Y Yuan, Y Yin, Z Xu, L Shang… - Advances in Neural …, 2023 - proceedings.neurips.cc

Training large models from scratch usually costs a substantial amount of resources. Towards
this problem, recent studies such as bert2BERT and LiGO have reused small pretrained …

Cited by 11 Related articles All 5 versions

Dq-lore: Dual queries with low rank approximation re-ranking for in-context learning

J Xiong, Z Li, C Zheng, Z Guo, Y Yin, E Xie… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advances in natural language processing, primarily propelled by Large Language
Models (LLMs), have showcased their remarkable capabilities grounded in in-context …

Cited by 18 Related articles All 3 versions

Tensor networks meet neural networks: A survey and future perspectives

M Wang, Y Pan, Z Xu, X Yang, G Li… - arXiv preprint arXiv …, 2023 - arxiv.org

Tensor networks (TNs) and neural networks (NNs) are two fundamental data modeling
approaches. TNs were introduced to solve the curse of dimensionality in large-scale tensors …

Cited by 28 Related articles All 3 versions

Bayesian tensor network structure search and its application to tensor completion

J Zeng, G Zhou, Y Qiu, C Li, Q Zhao - Neural Networks, 2024 - Elsevier

Tensor network (TN) has demonstrated remarkable efficacy in the compact representation of
high-order data. In contrast to the TN methods with pre-determined structures, the recently …

Cited by 3 Related articles All 3 versions

Expression syntax information bottleneck for math word problems

J Xiong, C Li, M Yang, X Hu, B Hu - … of the 45th International ACM SIGIR …, 2022 - dl.acm.org

Math Word Problems (MWP) aims to automatically solve mathematical questions given in
texts. Previous studies tend to design complex models to capture additional information in …

Cited by 8 Related articles All 5 versions

A Highly compressed accelerator with temporal optical flow feature fusion and tensorized LSTM for video action recognition on terminal device

P Zhen, X Yan, W Wang, H Wei… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Deep learning-based action recognition has become ubiquitous in the video analysis area;
however, large neural networks require enormous computations to achieve high …

Cited by 5 Related articles All 2 versions

An effective weight initialization method for deep learning: Application to satellite image classification

W Boulila, E Alshanqiti, A Alzahem, A Koubaa… - Expert Systems with …, 2024 - Elsevier

The growing interest in satellite imagery has triggered the need for efficient mechanisms to
extract valuable information from these vast data sources, providing deeper insights. Even …

Cited by 1 Related articles All 3 versions

Advocating for the Silent: Enhancing Federated Generalization for Nonparticipating Clients

Z Wu, Z Xu, D Zeng, Q Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Federated learning (FL) has surged in prominence due to its capability of collaborative
model training without direct data sharing. However, the vast disparity in local data …

Low rank optimization for efficient deep learning: Making a balance between compact architecture and fast training

X Ou, Z Chen, C Zhu, Y Liu - Journal of Systems Engineering …, 2023 - ieeexplore.ieee.org

Deep neural networks (DNNs) have achieved great success in many data processing
applications. However, high computational complexity and storage cost make deep learning …

Cited by 3 Related articles

Compute Better Spent: Replacing Dense Layers with Structured Matrices

S Qiu, A Potapczynski, M Finzi, M Goldblum… - arXiv preprint arXiv …, 2024 - arxiv.org

Dense linear layers are the dominant computational bottleneck in foundation models.
Identifying more efficient alternatives to dense matrices has enormous potential for building …

Cited by 6 Related articles All 3 versions
