Difei Gao - Academic Profile - Scholar Search

Cited by

	All	Since 2019
Citations	745	742
h-index	15	15
i10-index	16	16

Citations per year

Year	2018	2019	2020	2021	2022	2023	2024
Citations	2	3	3	28	71	259	376

Public access (based on funding mandates)

Available	13 articles
Not available	1 article

Co-authors

Mike Z. SHOU, National U. of Singapore; Facebook AI; Columbia University. Verified email at columbia.edu
Qinghong Lin, National U. of Singapore. Verified email at u.nus.edu
Ruiping Wang, Professor, Institute of Computing Technology, Chinese Academy of Sciences. Verified email at ict.ac.cn
Xilin Chen, Institute of Computing Technology, Chinese Academy of Sciences. Verified email at ict.ac.cn
Joya Chen, National University of Singapore. Verified email at u.nus.edu
Shiguang Shan, Professor of Institute of Computing Technology, Chinese Academy of Sciences. Verified email at ict.ac.cn
Yuxuan Wang, Nanyang Technological University; National U. of Singapore. Verified email at ntu.edu.sg
Luowei Zhou, Research Scientist, Google Deepmind. Verified email at google.com
Mengmi Zhang, Assistant professor and PI of Deep NeuroCognition Lab, NTU and A*STAR. Verified email at ntu.edu.sg
Kenneth Li, Harvard University. Verified email at g.harvard.edu
Lili Pan, Associate Professor, University of Electronic Science and Technology of China. Verified email at uestc.edu.cn
Rui Chen, University of Cambridge. Verified email at cam.ac.uk


Difei Gao

National U. of Singapore; Institute of Computing Technology, Chinese Academy of Sciences

Verified email at nus.edu.sg

Artificial Intelligence; Vision and Language


Title	Authors	Venue	Cited by	Year
Egocentric video-language pretraining	KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao	Neural Information Processing Systems (NeurIPS) 2 (3), 2022	129	2022
Multi-modal graph neural network for joint reasoning on vision and scene text	D Gao, K Li, R Wang, S Shan, X Chen	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12746 …, 2020	128	2020
Show-1: Marrying pixel and latent diffusion models for text-to-video generation	DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu, D Gao, MZ Shou	arXiv preprint arXiv:2309.15818, 2023	76	2023
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering	D Gao, L Zhou, L Ji, L Zhu, Y Yang, MZ Shou	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14773 …, 2023	57	2023
UniVTG: Towards Unified Video-Language Temporal Grounding	KQ Lin, P Zhang, J Chen, S Pramanick, D Gao, AJ Wang, R Yan, MZ Shou	IEEE/CVF International Conference on Computer Vision (ICCV), 2023	44	2023
AssistGPT: A general multi-modal assistant that can plan, execute, inspect, and learn	D Gao, L Ji, L Zhou, KQ Lin, J Chen, Z Fan, MZ Shou	arXiv preprint arXiv:2306.08640, 2023	44	2023
CRIC: A VQA dataset for compositional reasoning on vision and commonsense	D Gao, R Wang, S Shan, X Chen	IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022	26*	2022
Env-QA: A Video Question Answering Benchmark for Comprehensive Understanding of Dynamic Environments	D Gao, R Wang, Z Bai, X Chen	IEEE/CVF International Conference on Computer Vision (ICCV), 1675-1685, 2021	25	2021
Weijie Kong, et al	KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao	Egocentric video-language pretraining. NeurIPS 35 (7575-7586), 26, 2022	22	2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval	Y Wang, D Gao, L Yu, W Lei, M Feiszli, MZ Shou	European Conference on Computer Vision (ECCV), 2022	21	2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant	B Wong, J Chen, Y Wu, SW Lei, D Mao, D Gao, MZ Shou	European Conference on Computer Vision (ECCV), 2022	21	2022
Symbolic replay: Scene graph as prompt for continual learning on VQA task	SW Lei, D Gao, JZ Wu, Y Wang, W Liu, M Zhang, MZ Shou	The AAAI Conference on Artificial Intelligence (AAAI), 2023	19	2023
CONE: An efficient coarse-to-fine alignment framework for long video temporal grounding	Z Hou, W Zhong, L Ji, D Gao, K Yan, WK Chan, CW Ngo, Z Shou, N Duan	Annual Meeting of the Association for Computational Linguistics (ACL), 2022	19	2022
Affordance grounding from demonstration video to target image	J Chen, D Gao, KQ Lin, MZ Shou	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 6799-6808, 2023	17	2023
CVPR 2023 text guided video editing competition	JZ Wu, X Li, D Gao, Z Dong, J Bai, A Singh, X Xiang, Y Li, Z Huang, Y Sun, ...	arXiv preprint arXiv:2310.16003, 2023	16	2023
Learning to recognize visual concepts for visual question answering with structural label space	D Gao, R Wang, S Shan, X Chen	IEEE Journal of Selected Topics in Signal Processing (JSTSP) 14 (3), 494-505, 2020	12	2020
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023	Z Hou, L Ji, D Gao, W Zhong, K Yan, C Li, WK Chan, CW Ngo, N Duan, ...	arXiv preprint arXiv:2306.15255, 2023	9	2023
AssistSR: Task-oriented video segment retrieval for personal AI assistant	SW Lei, D Gao, Y Wang, D Mao, Z Liang, L Ran, MZ Shou	Findings of Empirical Methods in Natural Language Processing (EMNLP), 2021	8*	2021
AssistGUI: Task-Oriented PC Graphical User Interface Automation	D Gao, L Ji, Z Bai, M Ouyang, P Li, D Mao, Q Wu, W Zhang, P Wang, ...	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	7*	2024
An efficient coarse-to-fine alignment framework @ Ego4D natural language queries challenge 2022	Z Hou, W Zhong, L Ji, D Gao, K Yan, WK Chan, CW Ngo, Z Shou, N Duan	arXiv preprint arXiv:2211.08776, 2022	7	2022
