Video semantic segmentation via sparse temporal transformer

J Li, W Wang, J Chen, L Niu, J Si, C Qian… - Proceedings of the 29th …, 2021 - dl.acm.org
Currently, video semantic segmentation mainly faces two challenges: 1) the demand of
temporal consistency; 2) the balance between segmentation accuracy and inference …

Dynamic spatial focus for efficient compressed video action recognition

Z Zheng, L Yang, Y Wang, M Zhang… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Recent years have witnessed a growing interest in compressed video action recognition due
to the rapid growth of online videos. It remarkably reduces the storage by replacing raw …

End-to-end compressed video representation learning for generic event boundary detection

C Li, X Wang, L Wen, D Hong… - Proceedings of the …, 2022 - openaccess.thecvf.com
Generic event boundary detection aims to localize the generic, taxonomy-free event
boundaries that segment videos into chunks. Existing methods typically require video frames …

Efficient semantic segmentation by altering resolutions for compressed videos

Y Hu, Y He, Y Li, J Li, Y Han… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video semantic segmentation (VSS) is a computationally expensive task due to the
per-frame prediction for videos of high frame rates. In recent work, compact models or adaptive …

CPR++: Object Localization via Single Coarse Point Supervision

X Yu, P Chen, K Wang, X Han, G Li… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Point-based object localization (POL), which pursues high-performance object sensing
under low-cost data annotation, has attracted increased attention. However, the point …

Fast human pose estimation in compressed videos

H Liu, W Liu, Z Chi, Y Wang, Y Yu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Current approaches for human pose estimation in videos can be categorized into per-frame
and warping-based methods. Both approaches have their pros and cons. For example, per …

A survey on compression domain image and video data processing and analysis techniques

Y Dong, WD Pan - Information, 2023 - mdpi.com
A tremendous amount of image and video data are being generated and shared in our daily
lives. Image and video data are typically stored and transmitted in compressed form in order …

Weakly supervised multi-class semantic video segmentation for road scenes

M Awan, J Shin - Computer Vision and Image Understanding, 2023 - Elsevier
Weakly supervised multi-class video segmentation is one of the most challenging yet least
studied research problems in computer vision. This study aims to investigate two main …

What and where: Learn to plug adapters via NAS for multidomain learning

H Zhao, H Zeng, X Qin, Y Fu, H Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
As an important and challenging problem, multidomain learning (MDL) typically seeks a set
of effective lightweight domain-specific adapter modules plugged into a common domain …

Learning on entropy coded images with CNN

R Piau, T Maugey, A Roumy - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
We propose an empirical study to see whether learning with convolutional neural networks
(CNNs) on entropy coded data is possible. First, we define spatial and semantic closeness …