Zhengyuan Yang: Academic Profile

Citations

	All	Since 2019
Citations	4552	4538
h-index	28	28
i10-index	36	36

Citations per year

Year	2018	2019	2020	2021	2022	2023	2024
Citations	12	59	130	305	632	1771	1635

Open-access publications

15 articles available
0 articles not available
Based on funders' mandatory open-access policies

Co-authors

Lijuan Wang, Microsoft GenAI (verified email at microsoft.com)
Jianfeng Wang, Microsoft (verified email at microsoft.com)
Zicheng Liu, Microsoft (verified email at microsoft.com)
Linjie (Lindsey) Li, Senior Researcher, Microsoft (verified email at microsoft.com)
Jiebo Luo, Albert Arendt Hopeman Professor of Engineering, University of Rochester (verified email at cs.rochester.edu)
Kevin Lin, Microsoft (verified email at microsoft.com)
Zhe Gan, Research Scientist, Apple (verified email at apple.com)
Ce Liu, AI Research Scientist Director, Meta GenAI; IEEE Fellow (verified email at meta.com)
Liwei Wang, Assistant Professor at The Chinese University of Hong Kong (verified email at cse.cuhk.edu.hk)
Jinsong Su, Xiamen University (verified email at xmu.edu.cn)
Jianwei Yang, Principal Researcher, Microsoft Research, Redmond (verified email at microsoft.com)
Jiajun Deng (邓家俊), University of Adelaide, Australian Institute for Machine Learning (verified email at adelaide.edu.au)
Yuncheng Li, Google (verified email at google.com)
Chenglei Si, Stanford University (verified email at stanford.edu)
Boqing Gong, Research Scientist, Google (verified email at google.com)


Zhengyuan Yang

Researcher, Microsoft

Verified email at microsoft.com

Research interests: Computer Vision, Multimedia, Vision + Language, Multimodal


Title, authors, venue	Cited by	Year
GIT: A generative image-to-text transformer for vision and language. J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang. Transactions on Machine Learning Research (TMLR), 2022	391	2022
A fast and accurate one-stage approach to visual grounding. Z Yang, B Gong, L Wang, W Huang, D Yu, J Luo. IEEE International Conference on Computer Vision (ICCV), 4683-4693, 2019	333	2019
An empirical study of GPT-3 for few-shot knowledge-based VQA. Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang. Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022	320	2022
The dawn of LMMs: Preliminary explorations with GPT-4V(ision). Z Yang, L Li, K Lin, J Wang, CC Lin, Z Liu, L Wang. arXiv preprint arXiv:2309.17421 9 (1), 1, 2023	290	2023
TransVG: End-to-End Visual Grounding with Transformers. J Deng, Z Yang, T Chen, W Zhou, H Li. IEEE International Conference on Computer Vision (ICCV), 2021	265	2021
Scaling up vision-language pre-training for image captioning. X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	234	2022
MM-REACT: Prompting ChatGPT for multimodal reasoning and action. Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ... arXiv preprint arXiv:2303.11381, 2023	232	2023
Improving One-stage Visual Grounding by Recursive Sub-query Construction. Z Yang, T Chen, L Wang, J Luo. European Conference on Computer Vision (ECCV), 2020	206	2020
End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions. Z Yang, Y Zhang, J Yu, J Cai, J Luo. 2018 24th International Conference on Pattern Recognition (ICPR), 2289-2294, 2018	189	2018
Action recognition with spatio–temporal visual attention on skeleton image sequences. Z Yang, Y Li, J Yang, J Luo. IEEE Transactions on Circuits and Systems for Video Technology 29 (8), 2405-2415, 2018	181	2018
Prompting GPT-3 to be reliable. C Si, Z Gan, Z Yang, S Wang, J Wang, J Boyd-Graber, L Wang. International Conference on Learning Representations (ICLR 23), 2022	175	2022
MM-Vet: Evaluating large multimodal models for integrated capabilities. W Yu, Z Yang, L Li, J Wang, K Lin, Z Liu, X Wang, L Wang. arXiv preprint arXiv:2308.02490, 2023	170	2023
Attentive relational networks for mapping images to scene graphs. M Qi, W Li, Z Yang, Y Wang, J Luo. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 3957-3966, 2019	170	2019
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption. Z Yang, Y Lu, J Wang, X Yin, D Florencio, L Wang, C Zhang, L Zhang, ... IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021	151	2021
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation. Y Yin, F Meng, J Su, C Zhou, Z Yang, J Zhou, J Luo. Annual Meeting of the Association for Computational Linguistics (ACL), 2020	134	2020
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling. Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang. European Conference on Computer Vision (ECCV), 521-539, 2022	130*	2022
Multimodal foundation models: From specialists to general-purpose assistants. C Li, Z Gan, Z Yang, J Yang, L Li, L Wang, J Gao. Foundations and Trends® in Computer Graphics and Vision 16 (1-2), 1-214, 2024	97	2024
PromptCap: Prompt-guided task-aware image captioning. Y Hu, H Hua, Z Yang, W Shi, NA Smith, J Luo. arXiv preprint arXiv:2211.09699, 2022	85*	2022
SAT: 2D Semantics Assisted Training for 3D Visual Grounding. Z Yang, S Zhang, L Wang, J Luo. IEEE International Conference on Computer Vision (ICCV), 2021	80	2021
Dynamic context-guided capsule network for multimodal machine translation. H Lin, F Meng, J Su, Y Yin, Z Yang, Y Ge, J Zhou, J Luo. Proceedings of the 28th ACM International Conference on Multimedia, 1320-1329, 2020	76	2020
