Monarch mixer: A simple sub-quadratic gemm-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …

Key issues in vision Transformer research: current status and prospects

田永林, 王雨桐, 王建功, 王晓, 王飞跃 - 自动化学报, 2022 - aas.net.cn
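The long-range modeling and parallel computation capabilities of the Transformer have made it highly successful
in natural language processing, and it has gradually expanded into computer vision and other fields. Taking the classification task as its starting point, this paper introduces typical vision Transformer …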

Gradient remedy for multi-task learning in end-to-end noise-robust speech recognition

Y Hu, C Chen, R Li, Q Zhu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Speech enhancement (SE) is proved effective in reducing noise from noisy speech signals
for downstream automatic speech recognition (ASR), where multi-task learning strategy is …

Transformer with double enhancement for low-dose CT denoising

H Li, X Yang, S Yang, D Wang… - IEEE journal of …, 2022 - ieeexplore.ieee.org
Increasingly serious health problems have made the usage of computed tomography surge.
Therefore, algorithms for processing CT images are becoming more and more abundant …

YOLOv5s-Fog: an improved model based on YOLOv5s for object detection in foggy weather scenarios

X Meng, Y Liu, L Fan, J Fan - Sensors, 2023 - mdpi.com
In foggy weather scenarios, the scattering and absorption of light by water droplets and
particulate matter cause object features in images to become blurred or lost, presenting a …

Research progress on vision Transformers for image classification.

彭斌, 白静, 李文静, 郑虎… - Journal of Frontiers of …, 2024 - search.ebscohost.com
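The Transformer is a deep learning model based on the self-attention mechanism that has shown great potential in computer vision.
In image classification tasks, the key challenge is to capture the local and global features of the input image efficiently and accurately …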

Learning shape-biased representations for infrared small target detection

F Lin, S Ge, K Bao, C Yan… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Typically, infrared small target detection aims to accurately localize objects from complex
backgrounds where the object textures are often dim and the object shapes are varying. A …

COVID-19 diagnosis via chest X-ray image classification based on multiscale class residual attention

S Liu, T Cai, X Tang, Y Zhang, C Wang - Computers in Biology and …, 2022 - Elsevier
Aiming at detecting COVID-19 effectively, a multiscale class residual attention (MCRA)
network is proposed via chest X-ray (CXR) image classification. First, to overcome the data …

A Review of the Evaluation System for Curriculum Learning

F Liu, T Zhang, C Zhang, L Liu, L Wang, B Liu - Electronics, 2023 - mdpi.com
In recent years, deep learning models have been more and more widely used in various
fields and have become a research hotspot for various tasks in artificial intelligence, but …

Bytes are all you need: Transformers operating directly on file bytes

M Horton, S Mehta, A Farhadi, M Rastegari - arXiv preprint arXiv …, 2023 - arxiv.org
Modern deep learning approaches usually transform inputs into a modality-specific form. For
example, the most common deep learning approach to image classification involves …