Open-vocabulary one-stage detection with hierarchical visual-language knowledge distillation | Z Ma, G Luo, J Gao, L Li, Y Chen, S Wang, C Zhang, W Hu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 73 | 2022 |
ViLEM: Visual-Language Error Modeling for Image-Text Retrieval | Y Chen*, Z Ma*, Z Zhang*, Z Qi, C Yuan, Y Shan, B Li, W Hu, X Qie, J Wu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 8 | 2023 |
Chinese Title Generation for Short Videos: Dataset, Metric and Algorithm | Z Zhang*, Z Ma*, C Yuan, Y Chen, P Wang, Z Qi, C Hao, B Li, Y Shan, ... | IEEE Transactions on Pattern Analysis & Machine Intelligence, 1-16, 2024 | 7* | 2024 |
Learning semantics-grounded vocabulary representation for video-text retrieval | Y Shi, H Liu, H Xu, Z Ma, Q Ye, A Hu, M Yan, J Zhang, F Huang, C Yuan, ... | Proceedings of the 31st ACM International Conference on Multimedia, 4460-4470, 2023 | 2 | 2023 |
Order-Prompted Tag Sequence Generation for Video Tagging | Z Ma, Z Zhang, Y Chen, Z Qi, Y Luo, Z Li, C Yuan, B Li, X Qie, Y Shan, ... | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 2 | 2023 |
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval? | Y Chen*, Z Ma*, Z Zhang*, Z Qi, C Yuan, B Li, J Pu, Y Shan, X Qi, W Hu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 1 | 2024 |
EA-VTR: Event-Aware Video-Text Retrieval | Z Ma, Z Zhang, Y Chen, Z Qi, C Yuan, B Li, Y Luo, X Li, X Qi, Y Shan, ... | arXiv preprint arXiv:2407.07478, 2024 | | 2024 |