X-world: Accessibility, vision, and autonomy meet

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

Cited by 624 · Related articles · All 9 versions

[PDF] arxiv.org

Benchmark evaluations, applications, and challenges of large vision language models: A survey

Z Li, X Wu, H Du, H Nghiem, G Shi - arXiv preprint arXiv:2501.02189, 2025 - arxiv.org

Multimodal Vision Language Models (VLMs) have emerged as a transformative technology
at the intersection of computer vision and natural language processing, enabling machines …

Cited by 2 · Related articles · All 2 versions

[PDF] thecvf.com

XVO: Generalized visual odometry via cross-modal self-training

L Lai, Z Shangguan, J Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose XVO, a semi-supervised learning method for training generalized monocular
Visual Odometry (VO) models with robust off-the-shelf operation across diverse datasets and …

Cited by 22 · Related articles · All 7 versions

[PDF] thecvf.com

Selfd: self-learning large-scale driving policies from the web

J Zhang, R Zhu, E Ohn-Bar - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com

Effectively utilizing the vast amounts of ego-centric navigation data that is freely available on
the internet can advance generalized intelligent systems, i.e., to robustly scale across …

Cited by 22 · Related articles · All 5 versions

[PDF] nsf.gov

Assister: Assistive navigation via conditional instruction generation

Z Huang, Z Shangguan, J Zhang, G Bar, M Boyd… - … on Computer Vision, 2022 - Springer

We introduce a novel vision-and-language navigation (VLN) task of learning to provide real-
time guidance to a blind follower situated in complex dynamic navigation scenarios …

Cited by 20 · Related articles · All 5 versions

[PDF] thecvf.com

Motion Diversification Networks

HJ Kim, E Ohn-Bar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

We introduce Motion Diversification Networks, a novel framework for learning to
generate realistic and diverse 3D human motion. Despite recent advances in deep …

Cited by 4 · Related articles

[PDF] thecvf.com

Feedback-Guided Autonomous Driving

J Zhang, Z Huang, A Ray… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

While behavior cloning has recently emerged as a highly successful paradigm for
autonomous driving, humans rarely learn to perform complex tasks such as driving via …

Cited by 9 · Related articles

