Conversational agents in therapeutic interventions for neurodevelopmental disorders: a survey

F Catania, M Spitale, F Garzotto - ACM Computing Surveys, 2023 - dl.acm.org
Neurodevelopmental Disorders (NDD) are a group of conditions with onset in the
developmental period characterized by deficits in the cognitive and social areas …

Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, J Lai, Z Jiang, T Chen, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting

T Zhou, Z Ma, Q Wen, X Wang… - … on machine learning, 2022 - proceedings.mlr.press
Long-term time series forecasting is challenging since prediction accuracy tends to
decrease dramatically with the increasing horizon. Although Transformer-based methods …

Separable self-attention for mobile vision transformers

S Mehta, M Rastegari - arXiv preprint arXiv:2206.02680, 2022 - arxiv.org
Mobile vision transformers (MobileViT) can achieve state-of-the-art performance across
several mobile vision tasks, including classification and detection. Though these models …

Fnet: Mixing tokens with fourier transforms

J Lee-Thorp, J Ainslie, I Eckstein, S Ontanon - arXiv preprint arXiv …, 2021 - arxiv.org
We show that Transformer encoder architectures can be sped up, with limited accuracy
costs, by replacing the self-attention sublayers with simple linear transformations that "mix" …

Diagonal state spaces are as effective as structured state spaces

A Gupta, A Gu, J Berant - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Modeling long range dependencies in sequential data is a fundamental step towards
attaining human-level performance in many modalities such as text, vision, audio and video …

Informer: Beyond efficient transformer for long sequence time-series forecasting

H Zhou, S Zhang, J Peng, S Zhang, J Li… - Proceedings of the …, 2021 - ojs.aaai.org
Many real-world applications require the prediction of long sequence time-series, such as
electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a …

Deformable detr: Deformable transformers for end-to-end object detection

X Zhu, W Su, L Lu, B Li, X Wang, J Dai - arXiv preprint arXiv:2010.04159, 2020 - arxiv.org
DETR has been recently proposed to eliminate the need for many hand-designed
components in object detection while demonstrating good performance. However, it suffers …

Multi-scale vision longformer: A new vision transformer for high-resolution image encoding

P Zhang, X Dai, J Yang, B Xiao… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper presents a new Vision Transformer (ViT) architecture Multi-Scale Vision
Longformer, which significantly enhances the ViT of [??] for encoding high-resolution …

Big bird: Transformers for longer sequences

M Zaheer, G Guruganesh, KA Dubey… - Advances in neural …, 2020 - proceedings.neurips.cc
Transformers-based models, such as BERT, have been one of the most successful deep
learning models for NLP. Unfortunately, one of their core limitations is the quadratic …