Unsupervised Cross‐Media Hashing Learning via Knowledge Graph

Z Zhang, L Li, G Cong, H Yin, Y Gao, C Yan… - Proceedings of the …, 2024 - dl.acm.org

Movie Dubbing aims to convert scripts into speeches that align with the given movie clip in
both temporal and emotional aspects while preserving the vocal timbre of one brief …

Cited by 8 · Related articles · All 4 versions

[PDF] arxiv.org

Training-free video temporal grounding using large-scale pre-trained models

M Zheng, X Cai, Q Chen, Y Peng, Y Liu - European Conference on …, 2024 - Springer

Video temporal grounding aims to identify video segments within untrimmed videos that are
most relevant to a given natural language query. Existing video temporal localization models …

Cited by 4 · Related articles · All 7 versions

[PDF] arxiv.org

It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment

J Zheng, X Liu, B Zhang, C Yan, J Zhang… - Proceedings of the …, 2024 - dl.acm.org

Existing studies for gait recognition primarily utilized sequences of either binary silhouette or
human parsing to encode the shapes and dynamics of persons during walking. Silhouettes …

Cited by 2 · Related articles · All 4 versions

[PDF] openreview.net

Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection

Y Cui, L Li, J Zhang, C Yan, H Wang, S Wang… - Proceedings of the …, 2024 - dl.acm.org

Domain Adaptive Object Detection (DAOD) aims to improve the adaptation of the detector for
the unlabeled target domain by the labeled source domain. Recent advances leverage a …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

EventHDR: From Event to High-Speed HDR Videos and Beyond

Y Zou, Y Fu, T Takatani, Y Zheng - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Event cameras are innovative neuromorphic sensors that asynchronously capture the scene
dynamics. Due to the event-triggering mechanism, such cameras record event streams with …

Cited by 1 · Related articles · All 8 versions

[PDF] github.io

Mitigate Catastrophic Remembering via Continual Knowledge Purification for Noisy Lifelong Person Re-Identification

K Xu, H Zhang, Y Li, Y Peng, J Zhou - Proceedings of the 32nd ACM …, 2024 - dl.acm.org

Current Lifelong Person Re-Identification (LReID) methods focus on tackling a clean data
stream with accurate labels. When noisy data with incorrect labels are given, their …

Cited by 1 · Related articles · All 4 versions

[PDF] github.io

Progressive Prototype Evolving for Dual-Forgetting Mitigation in Non-Exemplar Online Continual Learning

Q Li, Y Peng, J Zhou - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org

Online Continual Learning (OCL) aims at learning a model through a sequence of single-pass
data, usually encountering the challenges of catastrophic forgetting both between …

Cited by 1 · Related articles · All 4 versions

[PDF] github.io

InsVP: Efficient Instance Visual Prompting from Image Itself

Z Liu, Y Peng, J Zhou - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org

Visual prompting is an efficient methodology for finetuning pretrained visual models by
introducing a small number of learnable parameters while keeping the backbone frozen …

Cited by 1 · Related articles · All 4 versions

Object-Aware NIR-to-Visible Translation

Y Gao, L Gu, Q Liu, Y Fu - European Conference on Computer Vision, 2024 - Springer

While near-infrared (NIR) imaging is essential for assisted driving and safety monitoring
systems, its monochromatic nature hinders its broader application, which prompts the …

Privacy-enhanced prototype-based federated cross-modal hashing for cross-modal retrieval

R Zuo, C Zheng, F Li, L Zhu, Z Zhang - ACM Transactions on Multimedia …, 2024 - dl.acm.org

Cross-modal hashing is widely used for efficient similarity searches, improving data
processing efficiency, and reducing storage costs. Existing cross-modal hashing methods …

Cited by 1 · Related articles
