Haodong Duan 段浩东
Shanghai AI Laboratory
Verified email at ie.cuhk.edu.hk
Title
Cited by
Year
Revisiting skeleton-based action recognition
H Duan, Y Zhao, K Chen, D Lin, B Dai
CVPR 2022, 2021
Cited by: 492; Year: 2021
MMBench: Is your multi-modal model an all-around player?
Y Liu, H Duan, Y Zhang, B Li, S Zhang, W Zhao, Y Yuan, J Wang, C He, ...
arXiv preprint arXiv:2307.06281, 2023
Cited by: 243; Year: 2023
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
MMA Contributors
GitHub repository, https://github.com/open-mmlab/mmaction2, 2020
Cited by: 163; Year: 2020
InternLM: A multilingual language model with progressively enhanced capabilities
ILM Team
GitHub repository, https://github.com/InternLM/InternLM, 2023
Cited by: 131; Year: 2023
Omni-sourced webly-supervised learning for video recognition
H Duan, Y Zhao, Y Xiong, W Liu, D Lin
ECCV 2020, 2020
Cited by: 100; Year: 2020
PYSKL: Towards Good Practices for Skeleton Action Recognition
H Duan, J Wang, K Chen, D Lin
ACMMM 2022, 2022
Cited by: 94; Year: 2022
InternLM-XComposer: A vision-language large model for advanced text-image comprehension and composition
P Zhang, XDB Wang, Y Cao, C Xu, L Ouyang, Z Zhao, S Ding, S Zhang, ...
arXiv preprint arXiv:2309.15112, 2023
Cited by: 82; Year: 2023
OpenCompass: A universal evaluation platform for foundation models
OC Contributors
GitHub repository, https://github.com/open-compass/[opencompass/VLMEvalKit], 2023
Cited by: 81; Year: 2023
SRPGAN: perceptual generative adversarial network for single image super resolution
B Wu, H Duan, Z Liu, G Sun
arXiv preprint arXiv:1712.05927, 2017
Cited by: 78; Year: 2017
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ...
arXiv preprint arXiv:2401.16420, 2024
Cited by: 53; Year: 2024
DG-STGCN: Dynamic spatial-temporal modeling for skeleton-based action recognition
H Duan, J Wang, K Chen, D Lin
arXiv preprint arXiv:2210.05895, 2022
Cited by: 31; Year: 2022
MMAction
Y Zhao, H Duan, Y Xiong, D Lin
GitHub repository, https://github.com/open-mmlab/mmaction, 2019
Cited by: 28; Year: 2019
OCSampler: Compressing Videos to One Clip with Single-step Sampling
J Lin, H Duan, K Chen, D Lin, L Wang
CVPR 2022, 2022
Cited by: 27; Year: 2022
InternLM2 technical report
Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ...
arXiv preprint arXiv:2403.17297, 2024
Cited by: 23; Year: 2024
JourneyDB: A benchmark for generative image understanding
K Sun, J Pan, Y Ge, H Li, H Duan, X Wu, R Zhang, A Zhou, Z Qin, Y Wang, ...
NeurIPS 2023 Datasets, 2024
Cited by: 22; Year: 2024
TRB: A novel triplet representation for understanding 2D human body
H Duan, KY Lin, S Jin, W Liu, C Qian, W Ouyang
ICCV 2019, 2019
Cited by: 19; Year: 2019
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
H Duan, N Zhao, K Chen, D Lin
CVPR 2022, 2022
Cited by: 18; Year: 2022
Self-supervised Action Representation Learning from Partial Spatio-Temporal Skeleton Sequences
Y Zhou, H Duan, A Rao, B Su, J Wang
AAAI 2023, 2023
Cited by: 17; Year: 2023
Are We on the Right Way for Evaluating Large Vision-Language Models?
L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ...
arXiv preprint arXiv:2403.20330, 2024
Cited by: 12; Year: 2024
InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ...
arXiv preprint arXiv:2404.06512, 2024
Cited by: 10; Year: 2024