Real-time analysis and visualization of the YFCC100M dataset

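陈科圻，朱志亮，邓小明，马翠霞，王宏安 - Journal of Software, 2020 - jos.org.cn

Object detection has long been one of the research hotspots in computer vision; its task is to
return the category and rectangular bounding-box coordinates of one or more specific objects in a
given image. With the rapid progress of neural network research, the birth of the R-CNN detector marked …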

Uni-Perceiver v2: A generalist model for large-scale vision and vision-language tasks

H Li, J Zhu, X Jiang, X Zhu, H Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite the remarkable success of foundation models, their task-specific fine-tuning
paradigm makes them inconsistent with the goal of general perception modeling. The key to …

Cited by 44 - Related articles - All 7 versions

[PDF] thecvf.com

Uni-Perceiver: Pre-training unified architecture for generic perception for zero-shot and few-shot tasks

X Zhu, J Zhu, H Li, X Wu, H Li… - Proceedings of the …, 2022 - openaccess.thecvf.com

Biological intelligence systems of animals perceive the world by integrating information in
different modalities and processing simultaneously for various tasks. In contrast, current …

Cited by 114 - Related articles - All 6 versions

[PDF] arxiv.org

The All-Seeing project: Towards panoptic visual recognition and understanding of the open world

W Wang, M Shi, Q Li, W Wang, Z Huang, L Xing… - arXiv preprint arXiv …, 2023 - arxiv.org

We present the All-Seeing (AS) project: a large-scale data and model for recognizing and
understanding everything in the open world. Using a scalable data engine that incorporates …

Cited by 40 - Related articles - All 3 versions

[PDF] thecvf.com

Towards all-in-one pre-training via maximizing multi-modal mutual information

W Su, X Zhu, C Tao, L Lu, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

To effectively exploit the potential of large-scale models, various pre-training strategies
supported by massive data from different sources are proposed, including supervised pre …

Cited by 35 - Related articles - All 8 versions

[PDF] neurips.cc

Uni-Perceiver-MoE: Learning sparse generalist models with conditional MoEs

J Zhu, X Zhu, W Wang, X Wang, H Li… - Advances in Neural …, 2022 - proceedings.neurips.cc

To build an artificial neural network like the biological intelligence system, recent works have
unified numerous tasks into a generalist model, which can process various tasks with shared …

Cited by 47 - Related articles - All 7 versions

[PDF] neurips.cc

Contrastive language-image pre-training with knowledge graphs

X Pan, T Ye, D Han, S Song… - Advances in Neural …, 2022 - proceedings.neurips.cc

Recent years have witnessed the fast development of large-scale pre-training frameworks
that can extract multi-modal representations in a unified form and achieve promising …

Cited by 30 - Related articles - All 6 versions

[PDF] jos.org.cn

Deep learning for multi-scale object detection: A survey

陈科圻，朱志亮，邓小明，马翠霞，王宏安 - Journal of Software, 2020 - jos.org.cn

Cited by 28 - Related articles - All 3 versions

[PDF] thecvf.com

Dynamic zoom-in network for fast object detection in large images

M Gao, R Yu, A Li, VI Morariu… - Proceedings of the …, 2018 - openaccess.thecvf.com

We introduce a generic framework that reduces the computational cost of object detection
while retaining accuracy for scenarios where objects with varied sizes appear in high …

Cited by 166 - Related articles - All 12 versions

[PDF] thecvf.com

TransGaGa: Geometry-aware unsupervised image-to-image translation

W Wu, K Cao, C Li, C Qian… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Unsupervised image-to-image translation aims at learning a mapping between two visual
domains. However, learning a translation across large geometry variations always ends up …

Cited by 127 - Related articles - All 6 versions
