Diffusion model-based contrastive learning for human activity recognition

C Xiao, Y Han, W Yang, Y Hou, F Shi… - IEEE Internet of Things …, 2024 - ieeexplore.ieee.org
WiFi channel state information (CSI)-based activity recognition has sparked numerous
studies due to its widespread availability and privacy protection. However, when applied in …

Towards zero-shot amplifier modeling: One-to-many amplifier modeling via tone embedding control

YH Chen, YT Yeh, YC Cheng, JT Wu, YH Ho… - arXiv preprint arXiv …, 2024 - arxiv.org
Replicating analog device circuits through neural audio effect modeling has garnered
increasing interest in recent years. Existing work has predominantly focused on a one-to …

A generative framework for designing interactions to overcome the gaps between humans and imperfect AIs instead of improving the accuracy of the AIs

H Yakura - Extended Abstracts of the 2023 CHI Conference on …, 2023 - dl.acm.org
My research focuses on improving human-machine collaboration in the context of machine
learning, particularly by recognizing the limitations and potential for errors in machine …

Toward leveraging pre-trained self-supervised frontends for automatic singing voice understanding tasks: Three case studies

Y Yamamoto - 2023 Asia Pacific Signal and Information …, 2023 - ieeexplore.ieee.org
Automatic singing voice understanding tasks, such as singer identification, singing voice
transcription, and singing technique classification, benefit from data-driven approaches that …

Singer Identity Representation Learning using Self-Supervised Techniques

B Torres, S Lattner, G Richard - arXiv preprint arXiv:2401.05064, 2024 - arxiv.org
Significant strides have been made in creating voice identity representations using speech
data. However, the same level of progress has not been achieved for singing voices. To …

Person Identification Using Bronchial Breath Sounds Recorded by Mobile Devices

VT Tran, YL Lin, WH Tsai - IEEE Access, 2023 - ieeexplore.ieee.org
This study examines the use of breath sounds intrusively recorded by mobile devices for
person identification (PID), which is referred to as mobile-sensed BreathPID. A custom …

From Real to Cloned Singer Identification

D Desblancs, G Meseguer-Brocal, R Hennequin… - arXiv preprint arXiv …, 2024 - arxiv.org
Cloned voices of popular singers sound increasingly realistic and have gained popularity
over the past few years. They, however, pose a threat to the industry due to personality rights …

Multi-Source Contrastive Learning from Musical Audio

C Garoufis, A Zlatintsi, P Maragos - arXiv preprint arXiv:2302.07077, 2023 - arxiv.org
Contrastive learning constitutes an emerging branch of self-supervised learning that
leverages large amounts of unlabeled data, by learning a latent space, where pairs of …

Identification of Non-Speaking and Minimal-Speaking Individuals Using Nonverbal Vocalizations

VT Tran, WH Tsai - IEEE Access, 2024 - ieeexplore.ieee.org
Speech remains a prevalent mode of communication powering various intelligent functions
in human-computer interaction applications, notably in Speaker/Person Identification (PID) …

An Experimental Comparison of Multi-View Self-Supervised Methods for Music Tagging

G Meseguer-Brocal, D Desblancs… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Self-supervised learning has emerged as a powerful way to pre-train generalizable machine
learning models on large amounts of unlabeled data. It is particularly compelling in the …