Video understanding with large language models: A survey

Y Tang, J Bi, S Xu, L Song, S Liang, T Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
With the burgeoning growth of online video platforms and the escalating volume of video
content, the demand for proficient video understanding tools has intensified markedly. Given …

Merlin: Empowering multimodal llms with foresight minds

E Yu, L Zhao, Y Wei, J Yang, D Wu, L Kong… - arXiv preprint arXiv …, 2023 - arxiv.org
Humans possess the remarkable ability to foresee the future to a certain extent based on
present observations, a skill we term as foresight minds. However, this capability remains …

Towards frame rate agnostic multi-object tracking

W Feng, L Bai, Y Yao, F Yu, W Ouyang - International Journal of Computer …, 2023 - Springer
Multi-object Tracking (MOT) is one of the most fundamental computer vision tasks that
contributes to various video analysis applications. Despite the recent promising progress …

Multiple Moving Vehicles Tracking Algorithm with Attention Mechanism and Motion Model

J Gao, G Han, H Zhu, L Liao - Electronics, 2024 - mdpi.com
With the acceleration of urbanization and the increasing demand for travel, current road
traffic is experiencing rapid growth and more complex spatio-temporal logic. Vehicle tracking …

TunnelTrack: A Dataset for Multi-Object Tracking in Tunnel Roads

J Zhuo, J Huang, L Peng, S Chen… - … on Imaging Systems …, 2023 - ieeexplore.ieee.org
Multi-object tracking is a very important field in computer vision. Multi-object tracking under
different scenes plays different roles and has different significance. As one of the special …