Beyond supervised learning for pervasive healthcare

X Gu, F Deligianni, J Han, X Liu, W Chen… - IEEE Reviews in …, 2023 - ieeexplore.ieee.org
The integration of machine/deep learning and sensing technologies is transforming
healthcare and medical practice. However, inherent limitations in healthcare data, namely …

Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives

K Grauman, A Westbury, L Torresani… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Ego-Exo4D, a diverse, large-scale, multimodal, multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …

Transformer-based attention networks for continuous pixel-wise prediction

G Yang, H Tang, M Ding, N Sebe… - Proceedings of the …, 2021 - openaccess.thecvf.com
While convolutional neural networks have had a tremendous impact on various computer
vision tasks, they generally demonstrate limitations in explicitly modeling long-range …

AttentionGAN: Unpaired image-to-image translation using attention-guided generative adversarial networks

H Tang, H Liu, D Xu, PHS Torr… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
State-of-the-art methods in image-to-image translation are capable of learning a
mapping from a source domain to a target domain with unpaired image data. Though the …

An outlook into the future of egocentric vision

C Plizzari, G Goletto, A Furnari, S Bansal… - International Journal of …, 2024 - Springer
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …

Multi-channel attention selection GANs for guided image-to-image translation

H Tang, PHS Torr, N Sebe - IEEE Transactions on Pattern …, 2022 - ieeexplore.ieee.org
We propose a novel model named Multi-Channel Attention Selection Generative Adversarial
Network (SelectionGAN) for guided image-to-image translation, where we translate an input …

Multi-modal perception attention network with self-supervised learning for audio-visual speaker tracking

Y Li, H Liu, H Tang - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org
Multi-modal fusion has proven to be an effective method for improving the accuracy and
robustness of speaker tracking, especially in complex scenarios. However, how to combine …

Put myself in your shoes: Lifting the egocentric perspective from exocentric videos

M Luo, Z Xue, A Dimakis, K Grauman - arXiv preprint arXiv:2403.06351, 2024 - arxiv.org
We investigate exocentric-to-egocentric cross-view translation, which aims to generate a first-
person (egocentric) view of an actor based on a video recording that captures the actor from …

Cascaded cross MLP-Mixer GANs for cross-view image translation

B Ren, H Tang, N Sebe - arXiv preprint arXiv:2110.10183, 2021 - arxiv.org
Previous cross-view image translation methods that directly adopt a simple encoder-decoder
or U-Net structure struggle to generate an image well at the target view, especially for …

Progressively unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation

HY Lee, YH Li, TH Lee, MS Aslam - Sensors, 2023 - mdpi.com
Unsupervised image-to-image translation has received considerable attention due to the
recent remarkable advancements in generative adversarial networks (GANs). In image-to …