BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers Z Li, W Wang, H Li, E Xie, C Sima, T Lu, Q Yu, J Dai European Conference on Computer Vision (ECCV), 2022 | 874 | 2022 |
InternImage: Exploring large-scale vision foundation models with deformable convolutions W Wang, J Dai, Z Chen, Z Huang, Z Li, X Zhu, X Hu, T Lu, L Lu, H Li, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023 | 512 | 2023 |
Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers Z Li, W Wang, E Xie, Z Yu, A Anandkumar, JM Alvarez, T Lu, P Luo IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 | 147* | 2022 |
Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe H Li, C Sima, J Dai, W Wang, L Lu, H Wang, J Zeng, Z Li, J Yang, H Deng, ... IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 88 | 2023 |
FB-OCC: 3D occupancy prediction based on forward-backward view transformation Z Li, Z Yu, D Austin, M Fang, S Lan, J Kautz, JM Alvarez arXiv preprint arXiv:2307.01492, 2023 | 42 | 2023 |
FB-BEV: BEV Representation from Forward-Backward View Transformations Z Li, Z Yu, W Wang, A Anandkumar, T Lu, JM Alvarez ICCV 2023, 2023 | 40 | 2023 |
DriveMLM: Aligning multi-modal large language models with behavioral planning states for autonomous driving W Wang, J Xie, CY Hu, H Zou, J Fan, W Tong, Y Wen, S Wu, H Deng, Z Li, ... arXiv preprint arXiv:2312.09245, 2023 | 34 | 2023 |
Video Mamba Suite: State space model as a versatile alternative for video understanding G Chen, Y Huang, J Xu, B Pei, Z Chen, Z Li, J Wang, K Li, T Lu, L Wang arXiv preprint arXiv:2403.09626, 2024 | 29 | 2024 |
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications Y Xiong, Z Li, Y Chen, F Wang, X Zhu, J Luo, W Wang, T Lu, H Li, Y Qiao, ... IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 | 14 | 2024 |
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving? Z Li, Z Yu, S Lan, J Li, J Kautz, T Lu, JM Alvarez IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023 | 14 | 2023 |
Leveraging vision-centric multi-modal expertise for 3D object detection L Huang, Z Li, C Sima, W Wang, J Wang, Y Qiao, H Li Advances in Neural Information Processing Systems 36, 2024 | 7 | 2024 |
An introduction of mini-AlphaStar RZ Liu, W Wang, Y Shen, Z Li, Y Yu, T Lu arXiv preprint arXiv:2104.06890, 2021 | 5 | 2021 |
Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation Z Li, K Li, S Wang, S Lan, Z Yu, Y Ji, Z Li, Z Zhu, J Kautz, Z Wu, YG Jiang, ... arXiv preprint arXiv:2406.06978, 2024 | 1 | 2024 |