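A survey of deep learning research on multi-scale object detection

陈科圻, 朱志亮, 邓小明, 马翠霞, 王宏安 - Journal of Software, 2020 - jos.org.cn
Object detection has long been one of the research hotspots in computer vision; its task is to
return the category and rectangular bounding-box coordinates of one or more specific objects in a given image. With the rapid progress of neural network research, the birth of the R-CNN detector marked …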

Uni-perceiver v2: A generalist model for large-scale vision and vision-language tasks

H Li, J Zhu, X Jiang, X Zhu, H Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite the remarkable success of foundation models, their task-specific fine-tuning
paradigm makes them inconsistent with the goal of general perception modeling. The key to …

Uni-perceiver: Pre-training unified architecture for generic perception for zero-shot and few-shot tasks

X Zhu, J Zhu, H Li, X Wu, H Li… - Proceedings of the …, 2022 - openaccess.thecvf.com
Biological intelligence systems of animals perceive the world by integrating information in
different modalities and processing simultaneously for various tasks. In contrast, current …

The all-seeing project: Towards panoptic visual recognition and understanding of the open world

W Wang, M Shi, Q Li, W Wang, Z Huang, L Xing… - arXiv preprint arXiv …, 2023 - arxiv.org
We present the All-Seeing (AS) project: a large-scale data and model for recognizing and
understanding everything in the open world. Using a scalable data engine that incorporates …

Towards all-in-one pre-training via maximizing multi-modal mutual information

W Su, X Zhu, C Tao, L Lu, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
To effectively exploit the potential of large-scale models, various pre-training strategies
supported by massive data from different sources are proposed, including supervised pre …

Uni-Perceiver-MoE: Learning sparse generalist models with conditional MoEs

J Zhu, X Zhu, W Wang, X Wang, H Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
To build an artificial neural network like the biological intelligence system, recent works have
unified numerous tasks into a generalist model, which can process various tasks with shared …

Contrastive language-image pre-training with knowledge graphs

X Pan, T Ye, D Han, S Song… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recent years have witnessed the fast development of large-scale pre-training frameworks
that can extract multi-modal representations in a unified form and achieve promising …

Deep learning for multi-scale object detection: A survey

陈科圻, 朱志亮, 邓小明, 马翠霞, 王宏安 - Journal of Software, 2020 - jos.org.cn
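Object detection has long been one of the research hotspots in computer vision; its task is to
return the category and rectangular bounding-box coordinates of one or more specific objects in a given image. With the rapid progress of neural network research, the birth of the R-CNN detector marked …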

Dynamic zoom-in network for fast object detection in large images

M Gao, R Yu, A Li, VI Morariu… - Proceedings of the …, 2018 - openaccess.thecvf.com
We introduce a generic framework that reduces the computational cost of object detection
while retaining accuracy for scenarios where objects with varied sizes appear in high …

TransGaGa: Geometry-aware unsupervised image-to-image translation

W Wu, K Cao, C Li, C Qian… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Unsupervised image-to-image translation aims at learning a mapping between two visual
domains. However, learning a translation across large geometry variations always ends up …