Ap-10k: A benchmark for animal pose estimation in the wild

Y Xu, J Zhang, Q Zhang, D Tao - Advances in Neural …, 2022 - proceedings.neurips.cc

Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …

Cited by 406 Related articles All 5 versions

[PDF] thecvf.com

Instructdiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

Cited by 36 Related articles All 3 versions

[PDF] arxiv.org

Vitaev2: Vision transformer advanced by exploring inductive bias for image recognition and beyond

Q Zhang, Y Xu, J Zhang, D Tao - International Journal of Computer Vision, 2023 - Springer

Vision transformers have shown great potential in various computer vision tasks owing to
their strong capability to model long-range dependency using the self-attention mechanism …

Cited by 191 Related articles All 7 versions

Animal pose estimation: A closer look at the state-of-the-art, existing gaps and opportunities

L Jiang, C Lee, D Teotia, S Ostadabbas - Computer Vision and Image …, 2022 - Elsevier

Over the past few years, research on animal pose estimation in the computer vision field has
grown in many aspects such as 2D and 3D pose estimation, 3D mesh reconstruction, and …

Cited by 23 Related articles All 2 versions

[PDF] arxiv.org

Rtmpose: Real-time multi-person pose estimation based on mmpose

T Jiang, P Lu, L Zhang, N Ma, R Han, C Lyu… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent studies on 2D pose estimation have achieved excellent performance on public
benchmarks, yet its application in the industrial community still suffers from heavy model …

Cited by 59 Related articles All 2 versions

[PDF] thecvf.com

Animal kingdom: A large and diverse dataset for animal behavior understanding

XL Ng, KE Ong, Q Zheng, Y Ni… - Proceedings of the …, 2022 - openaccess.thecvf.com

Understanding animals' behaviors is significant for a wide range of applications. However,
existing animal behavior datasets have limitations in multiple aspects, including limited …

Cited by 54 Related articles All 6 versions

Human pose estimation using deep learning: review, methodologies, progress and future research directions

P Kumar, S Chauhan, LK Awasthi - International Journal of Multimedia …, 2022 - Springer

Human pose estimation (HPE) has developed over the past decade into a vibrant field for
research with a variety of real-world applications like 3D reconstruction, virtual testing and re …

Cited by 13 Related articles All 2 versions

[PDF] thecvf.com

Animal3d: A comprehensive dataset of 3d animal pose and shape

J Xu, Y Zhang, J Peng, W Ma… - Proceedings of the …, 2023 - openaccess.thecvf.com

Accurately estimating the 3D pose and shape is an essential step towards understanding
animal behavior, and can potentially benefit many downstream applications, such as wildlife …

Cited by 9 Related articles All 7 versions

[PDF] arxiv.org

Mmt-bench: A comprehensive multimodal benchmark for evaluating large vision-language models towards multitask agi

K Ying, F Meng, J Wang, Z Li, H Lin, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Vision-Language Models (LVLMs) show significant strides in general-purpose
multimodal applications such as visual dialogue and embodied navigation. However …

Cited by 12 Related articles All 3 versions

[PDF] arxiv.org

Pose for everything: Towards category-agnostic pose estimation

L Xu, S Jin, W Zeng, W Liu, C Qian, W Ouyang… - European conference on …, 2022 - Springer

Existing works on 2D pose estimation mainly focus on a certain category, e.g., human, animal,
and vehicle. However, there are lots of application scenarios that require detecting the …

Cited by 27 Related articles All 5 versions
