Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …

A survey of modern deep learning based object detection models

SSA Zaidi, MS Ansari, A Aslam, N Kanwal… - Digital Signal …, 2022 - Elsevier
Object Detection is the task of classification and localization of objects in an image or video.
It has gained prominence in recent years due to its widespread applications. This article …

YOLOv6: A single-stage object detection framework for industrial applications

C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke… - arXiv preprint arXiv …, 2022 - arxiv.org
For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org
Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

MaxViT: Multi-axis vision transformer

Z Tu, H Talebi, H Zhang, F Yang, P Milanfar… - European conference on …, 2022 - Springer
Transformers have recently gained significant attention in the computer vision community.
However, the lack of scalability of self-attention mechanisms with respect to image size has …

Out-of-distribution detection with deep nearest neighbors

Y Sun, Y Ming, X Zhu, Y Li - International Conference on …, 2022 - proceedings.mlr.press
Out-of-distribution (OOD) detection is a critical task for deploying machine learning
models in the open world. Distance-based methods have demonstrated promise, where …

YOLOv9: Learning what you want to learn using programmable gradient information

CY Wang, IH Yeh, HYM Liao - arXiv preprint arXiv:2402.13616, 2024 - arxiv.org
Today's deep learning methods focus on how to design the most appropriate objective
functions so that the prediction results of the model can be closest to the ground truth …

A ConvNet for the 2020s

Z Liu, H Mao, CY Wu, C Feichtenhofer… - Proceedings of the …, 2022 - openaccess.thecvf.com
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …

MViTv2: Improved multiscale vision transformers for classification and detection

Y Li, CY Wu, H Fan, K Mangalam… - Proceedings of the …, 2022 - openaccess.thecvf.com
In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for
image and video classification, as well as object detection. We present an improved version …