A survey on vision transformer

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 86 Related articles All 6 versions

Large-scale multi-modal pre-trained models: A comprehensive survey

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer

With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …

Cited by 105 Related articles All 7 versions

Recipe for a general, powerful, scalable graph transformer

L Rampášek, M Galkin, VP Dwivedi… - Advances in …, 2022 - proceedings.neurips.cc

We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer
with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph …

Cited by 339 Related articles All 5 versions

Vision GNN: An image is worth graph of nodes

K Han, Y Wang, J Guo, Y Tang… - Advances in neural …, 2022 - proceedings.neurips.cc

Network architecture plays a key role in the deep learning-based computer vision system.
The widely-used convolutional neural network and transformer treat the image as a grid or …

Cited by 248 Related articles All 5 versions

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - arXiv preprint arXiv …, 2022 - arxiv.org

Transformers have achieved superior performances in many tasks in natural language
processing and computer vision, which also triggered great interest in the time series …

Cited by 535 Related articles All 6 versions

Multimodal learning with transformers: A survey

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Cited by 334 Related articles All 11 versions

SNR-aware low-light image enhancement

X Xu, R Wang, CW Fu, J Jia - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com

This paper presents a new solution for low-light image enhancement by collectively
exploiting Signal-to-Noise-Ratio-aware transformers and convolutional models to …

Cited by 218 Related articles All 4 versions

Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting

Y Zhang, J Yan - The eleventh international conference on learning …, 2022 - openreview.net

Recently many deep models have been proposed for multivariate time series (MTS)
forecasting. In particular, Transformer-based models have shown great potential because …

Cited by 273 Related articles

An effective CNN and Transformer complementary network for medical image segmentation

F Yuan, Z Zhang, Z Fang - Pattern Recognition, 2023 - Elsevier

The Transformer network was originally proposed for natural language processing. Due to
its powerful representation ability for long-range dependency, it has been extended for …

Cited by 156 Related articles All 3 versions

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org

Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

Cited by 276 Related articles All 13 versions

A review of deep learning techniques for speech processing