Learning consistent feature representation for cross-modal multimedia retrieval

Y Peng, X Huang, Y Zhao - … on circuits and systems for video …, 2017 - ieeexplore.ieee.org

Multimedia retrieval plays an indispensable role in big data utilization. Past efforts mainly
focused on single-media retrieval. However, the requirements of users are highly flexible …

Cited by 342 - Related articles - All 4 versions

[PDF] bournemouth.ac.uk

Comparative analysis on cross-modal information retrieval: A review

P Kaur, HS Pannu, AK Malhi - Computer Science Review, 2021 - Elsevier

Human beings experience life through a spectrum of modes such as vision, taste, hearing,
smell, and touch. These multiple modes are integrated for information processing in our …

Cited by 80 - Related articles - All 3 versions

[PDF] springer.com

Rescaling egocentric vision: Collection, pipeline and challenges for EPIC-KITCHENS-100

D Damen, H Doughty, GM Farinella, A Furnari… - International Journal of …, 2022 - Springer

This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-
KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …

Cited by 419 - Related articles - All 13 versions

Learning discriminative binary codes for large-scale cross-modal retrieval

X Xu, F Shen, Y Yang, HT Shen… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org

Hashing based methods have attracted considerable attention for efficient cross-modal
retrieval on large-scale multimedia data. The core problem of cross-modal hashing is how to …

Cited by 418 - Related articles - All 8 versions

Learning a joint affinity graph for multiview subspace clustering

C Tang, X Zhu, X Liu, M Li, P Wang… - IEEE Transactions …, 2018 - ieeexplore.ieee.org

With the ability to exploit the internal structure of data, graph-based models have received a
lot of attention and have achieved great success in multiview subspace clustering for …

Cited by 257 - Related articles

Exploiting subspace relation in semantic labels for cross-modal hashing

HT Shen, L Liu, Y Yang, X Xu, Z Huang… - … on Knowledge and …, 2020 - ieeexplore.ieee.org

Hashing methods have been extensively applied to efficient multimedia data indexing and
retrieval on account of the explosion of multimedia data. Cross-modal hashing usually …

Cited by 185 - Related articles - All 3 versions

[PDF] arxiv.org

CM-GANs: Cross-modal generative adversarial networks for common representation learning

Y Peng, J Qi - ACM Transactions on Multimedia Computing …, 2019 - dl.acm.org

It is known that the inconsistent distributions and representations of different modalities, such
as image and text, cause the heterogeneity gap, which makes it very challenging to correlate …

Cited by 295 - Related articles - All 4 versions

Deep multi-view subspace clustering with unified and discriminative learning

Q Wang, J Cheng, Q Gao, G Zhao… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Deep multi-view subspace clustering has achieved promising performance compared with
other multi-view clustering. However, existing deep multi-view subspace clustering only …

Cited by 118 - Related articles - All 2 versions

[PDF] ict.ac.cn

Know more say less: Image captioning based on scene graphs

X Li, S Jiang - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org

Automatically describing the content of an image has been attracting considerable research
attention in the multimedia field. To represent the content of an image, many approaches …

Cited by 180 - Related articles - All 5 versions

[PDF] arxiv.org

Unsupervised person re-identification by deep asymmetric metric embedding

HX Yu, A Wu, WS Zheng - IEEE transactions on pattern …, 2018 - ieeexplore.ieee.org

Person re-identification (Re-ID) aims to match identities across non-overlapping camera
views. Researchers have proposed many supervised Re-ID models which require quantities …

Cited by 178 - Related articles - All 7 versions
