LongVLM: Efficient long video understanding via large language models

Y Weng, M Han, H He, X Chang, B Zhuang - arXiv preprint arXiv …, 2024 - arxiv.org
Empowered by Large Language Models (LLMs), recent advancements in VideoLLMs have
driven progress in various video understanding tasks. These models encode video …

Efficiently adapting large pre-trained models for real-time violence recognition in smart city surveillance

X Ren, W Fan, Y Wang - Journal of Real-Time Image Processing, 2024 - Springer
Recently, the concept of smart cities has gained prominence, aiming to enhance urban
efficiency, safety, and quality of life through advanced technologies. A critical component of …

AdaViPro: Region-based Adaptive Visual Prompt for Large-Scale Models Adapting

M Yang, Y Tian, L Zhang, X Liang, X Ran… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, prompt-based methods have emerged as a new alternative 'parameter-efficient
fine-tuning' paradigm, which only fine-tunes a small number of additional parameters while …