Contrastive masked autoencoders for self-supervised video hashing

Y Wang, J Wang, B Chen, Z Zeng, ST Xia - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Self-Supervised Video Hashing (SSVH) models learn to generate short binary
representations for videos without ground-truth supervision, facilitating large-scale video …

GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval

Y Wang, J Wang, B Chen, Z Zeng, ST Xia - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Given a text query, partially relevant video retrieval (PRVR) seeks to find untrimmed videos
containing pertinent moments in a database. For PRVR, clip modeling is essential to capture …

Hugs Are Better Than Handshakes: Unsupervised Cross-Modal Transformer Hashing with Multi-granularity Alignment

J Wang, Z Zeng, B Chen, Y Wang, D Liao, G Li… - BMVC, 2022 - bmvc2022.mpi-inf.mpg.de
The goal of unsupervised cross-modal hashing (UCMH) is to map different modalities into a
semantic-preserving Hamming space without requiring label supervision. Existing deep …

Hugs Bring Double Benefits: Unsupervised Cross-Modal Hashing with Multi-granularity Aligned Transformers

J Wang, Z Zeng, B Chen, Y Wang, D Liao, G Li… - International Journal of …, 2024 - Springer
Unsupervised cross-modal hashing (UCMH) has been commonly explored to support
large-scale cross-modal retrieval of unlabeled data. Despite promising progress, most existing …

Efficient Self-Supervised Video Hashing with Selective State Spaces

J Wang, N Lian, J Li, Y Wang, Y Feng, B Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-supervised video hashing (SSVH) is a practical task in video indexing and retrieval.
Although Transformers are predominant in SSVH for their impressive temporal modeling …

Motion-aware dynamic graph neural network for video compressive sensing

R Lu, Z Cheng, B Chen, X Yuan - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Video snapshot compressive imaging (SCI) utilizes a 2D detector to capture sequential
video frames and compress them into a single measurement. Various reconstruction …