Structured two-stream attention network for video question answering | L Gao, P Zeng, J Song, YF Li, W Liu, T Mei, HT Shen | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 6391-6398, 2019 | 74 | 2019 |
From pixels to objects: Cubic visual attention for visual question answering | J Song, P Zeng, L Gao, HT Shen | IJCAI, 2018 | 71* | 2018 |
Hierarchical representation network with auxiliary tasks for video captioning and video question answering | L Gao, Y Lei, P Zeng, J Song, M Wang, HT Shen | IEEE Transactions on Image Processing 31, 202-215, 2021 | 69 | 2021 |
Rich visual knowledge-based augmentation network for visual question answering | L Zhang, S Liu, D Liu, P Zeng, X Li, J Song, L Gao | IEEE Transactions on Neural Networks and Learning Systems 32 (10), 4362-4373, 2020 | 57 | 2020 |
S2 Transformer for Image Captioning | P Zeng, H Zhang, J Song, L Gao | IJCAI, 1608-1614, 2022 | 50 | 2022 |
Conceptual and syntactical cross-modal alignment with cross-level consistency for image-text matching | P Zeng, L Gao, X Lyu, S Jing, J Song | Proceedings of the 29th ACM International Conference on Multimedia, 2205-2213, 2021 | 34 | 2021 |
Video question answering with prior knowledge and object-sensitive learning | P Zeng, H Zhang, L Gao, J Song, HT Shen | IEEE Transactions on Image Processing 31, 5936-5948, 2022 | 31 | 2022 |
Text-instance graph: Exploring the relational semantics for text-based visual question answering | X Li, B Wu, J Song, L Gao, P Zeng, C Gan | Pattern Recognition 124, 108455, 2022 | 30 | 2022 |
Examine before you answer: Multi-task learning with adaptive-attentions for multiple-choice VQA | L Gao, P Zeng, J Song, X Liu, HT Shen | Proceedings of the 26th ACM International Conference on Multimedia, 1742-1750, 2018 | 28 | 2018 |
Complementarity-aware space learning for video-text retrieval | J Zhu, P Zeng, L Gao, G Li, D Liao, J Song | IEEE Transactions on Circuits and Systems for Video Technology 33 (8), 4362-4374, 2023 | 21 | 2023 |
Progressive tree-structured prototype network for end-to-end image captioning | P Zeng, J Zhu, J Song, L Gao | Proceedings of the 30th ACM International Conference on Multimedia, 5210-5218, 2022 | 21 | 2022 |
Memory-based augmentation network for video captioning | S Jing, H Zhang, P Zeng, L Gao, J Song, HT Shen | IEEE Transactions on Multimedia, 2023 | 16 | 2023 |
Learning visual question answering on controlled semantic noisy labels | H Zhang, P Zeng, Y Hu, J Qian, J Song, L Gao | Pattern Recognition 138, 109339, 2023 | 16 | 2023 |
Dynamic scene graph generation via temporal prior inference | S Wang, L Gao, X Lyu, Y Guo, P Zeng, J Song | Proceedings of the 30th ACM International Conference on Multimedia, 5793-5801, 2022 | 16 | 2022 |
Adaptive fine-grained predicates learning for scene graph generation | X Lyu, L Gao, P Zeng, HT Shen, J Song | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 15 | 2023 |
Visual commonsense-aware representation network for video captioning | P Zeng, H Zhang, L Gao, X Li, J Qian, HT Shen | IEEE Transactions on Neural Networks and Learning Systems, 2023 | 13 | 2023 |
Dual-branch hybrid learning network for unbiased scene graph generation | C Zheng, L Gao, X Lyu, P Zeng, AE Saddik, HT Shen | IEEE Transactions on Circuits and Systems for Video Technology, 2023 | 12 | 2023 |
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval | H Li, J Song, L Gao, P Zeng, H Zhang, G Li | Advances in Neural Information Processing Systems, 2022 | 12 | 2022 |
Depth-aware sparse transformer for video-language learning | H Zhang, L Gao, P Zeng, A Hanjalic, HT Shen | Proceedings of the 31st ACM International Conference on Multimedia, 4778-4787, 2023 | 9 | 2023 |
Generalized pyramid co-attention with learnable aggregation net for video question answering | L Gao, T Chen, X Li, P Zeng, L Zhao, YF Li | Pattern Recognition 120, 108145, 2021 | 9 | 2021 |