Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing. Therefore, it is natural …
Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention is quadratic in sequence length. Approximate attention …
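To make that quadratic cost concrete, here is a minimal NumPy sketch of standard scaled dot-product attention; the function name and shapes are illustrative, not taken from any paper listed here. The (n, n) score matrix is exactly what exact attention must materialize (or, in IO-aware kernels, recompute in tiles).

```python
import numpy as np

def naive_attention(Q, K, V):
    """Plain scaled dot-product attention over a length-n sequence.

    Q, K, V have shape (n, d). The score matrix S is (n, n), so both
    time and memory grow quadratically with n.
    """
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)                    # (n, n): the quadratic term
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)          # numerically stable row softmax
    return P @ V                                # (n, d)

# Doubling n quadruples S: 4096 tokens already mean a 4096 x 4096
# float32 matrix (~64 MB) per attention head, per layer.
Q = K = V = np.random.randn(4096, 64).astype(np.float32)
out = naive_attention(Q, K, V)
```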
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
HK Cheng, AG Schwing - European Conference on Computer Vision, 2022 - Springer
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video …
The quadratic computational complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …
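As a rough illustration of how linear attention removes the quadratic term, the sketch below uses the kernelization idea of Katharopoulos et al. (2020): replacing the softmax with a feature map phi lets the matrix product be re-associated so that no (n, n) matrix is ever formed. The particular feature map and all names here are illustrative assumptions, not the design of the snippet's paper.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Kernelized linear attention.

    Replacing softmax(Q K^T) with phi(Q) phi(K)^T allows re-association:
    phi(Q) @ (phi(K).T @ V) costs O(n * d^2) instead of O(n^2 * d).
    The feature map here (ReLU plus a small constant) is one of several
    choices in the literature; elu(x) + 1 is another common one.
    """
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                                 # (d, d) summary, independent of n
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T      # (n, 1) normalizer
    return (Qp @ KV) / Z

Q = K = V = np.random.randn(4096, 64).astype(np.float32)
out = linear_attention(Q, K, V)                   # no (n, n) matrix is formed
```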
Long-term time series forecasting is challenging since prediction accuracy tends to decrease dramatically as the horizon increases. Although Transformer-based methods …
ActionFormer: Localizing Moments of Actions with Transformers
CL Zhang, J Wu, Y Li - European Conference on Computer Vision, 2022 - Springer
Self-attention based Transformer models have demonstrated impressive results for image classification and object detection, and more recently for video understanding. Inspired by …
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice. Given a graph, we simply …
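The snippet is cut off, but the "graph as a plain token sequence" idea it opens with can be sketched as below: every node and every edge becomes one token, augmented with node identifiers and a type marker. The identifier scheme (random vectors here, where work in this line uses orthonormal vectors or Laplacian eigenvectors) and all names are illustrative assumptions.

```python
import numpy as np

def graph_to_tokens(node_feat, edge_index, edge_feat, d_id=8):
    """Flatten a graph into a token sequence for an off-the-shelf Transformer.

    Node tokens carry their own identifier twice; edge tokens carry the
    identifiers of their two endpoints; a final 0/1 channel marks the
    token type (node vs. edge).
    """
    n = node_feat.shape[0]
    ids = np.random.randn(n, d_id) / np.sqrt(d_id)   # random node identifiers
    node_tok = np.concatenate([node_feat, ids, ids, np.zeros((n, 1))], axis=-1)
    src, dst = edge_index
    m = edge_feat.shape[0]
    edge_tok = np.concatenate([edge_feat, ids[src], ids[dst], np.ones((m, 1))], axis=-1)
    return np.concatenate([node_tok, edge_tok], axis=0)

# Toy graph: 4 nodes with 16-dim features, 3 directed edges.
x = np.random.randn(4, 16)
ei = np.array([[0, 1, 2], [1, 2, 3]])
ea = np.random.randn(3, 16)
tokens = graph_to_tokens(x, ei, ea)   # shape (4 + 3, 16 + 8 + 8 + 1)
```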
Models using structured state space sequence (S4) layers have achieved state-of-the-art performance on long-range sequence modeling tasks. An S4 layer combines linear state …
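For reference, the linear state space recurrence underlying S4-style layers, x_k = A x_{k-1} + B u_k, y_k = C x_k, can be sketched as a sequential scan. The random matrices below are stand-ins for the structured, learned parameters (e.g. HiPPO-initialized A) such layers actually use, and S4/S5 evaluate this map via a convolution or parallel scan rather than a Python loop.

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run a discretized linear state space model over an input sequence.

    x_k = A x_{k-1} + B u_k,   y_k = C x_k
    The scan is O(n) in sequence length, in contrast to the O(n^2)
    cost of exact self-attention.
    """
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B @ u_k
        ys.append(C @ x)
    return np.stack(ys)

state, d_in, d_out, n = 16, 4, 4, 1000
A = 0.9 * np.eye(state)                 # stable toy dynamics, not HiPPO
B = np.random.randn(state, d_in) * 0.1
C = np.random.randn(d_out, state) * 0.1
y = ssm_scan(A, B, C, np.random.randn(n, d_in))   # (1000, 4)
```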
A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible. Current architectures, however, cannot be …