Human beings experience life through a spectrum of modes such as vision, taste, hearing, smell, and touch. These multiple modes are integrated for information processing in our …
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …
X Xu, F Shen, Y Yang, HT Shen… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Hashing based methods have attracted considerable attention for efficient cross-modal retrieval on large-scale multimedia data. The core problem of cross-modal hashing is how to …
C Tang, X Zhu, X Liu, M Li, P Wang… - IEEE Transactions …, 2018 - ieeexplore.ieee.org
With the ability to exploit the internal structure of data, graph-based models have received a lot of attention and have achieved great success in multiview subspace clustering for …
Hashing methods have been extensively applied to efficient multimedia data indexing and retrieval on account of the explosion of multimedia data. Cross-modal hashing usually …
Y Peng, J Qi - ACM Transactions on Multimedia Computing …, 2019 - dl.acm.org
It is known that the inconsistent distributions and representations of different modalities, such as image and text, cause the heterogeneity gap, which makes it very challenging to correlate …
Deep multi-view subspace clustering has achieved promising performance compared with other multi-view clustering. However, existing deep multi-view subspace clustering only …
X Li, S Jiang - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org
Automatically describing the content of an image has been attracting considerable research attention in the multimedia field. To represent the content of an image, many approaches …
HX Yu, A Wu, WS Zheng - IEEE Transactions on Pattern …, 2018 - ieeexplore.ieee.org
Person re-identification (Re-ID) aims to match identities across non-overlapping camera views. Researchers have proposed many supervised Re-ID models which require quantities …