S Yang, AT Liu, H Lee - arXiv preprint arXiv:2006.03265, 2020 - arxiv.org
Self-supervised Audio Transformers (SAT) have enabled great success in many downstream speech applications such as ASR, but how they work has not yet been widely explored. In this …
É Székely, GE Henter… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
This paper considers utilising breaths to create improved spontaneous-speech corpora for conversational text-to-speech from found audio recordings such as dialogue podcasts …
GP Yang, H Tang - 2024 IEEE Spoken Language Technology …, 2024 - ieeexplore.ieee.org
Despite recent advances in self-supervised representations, unsupervised phonetic segmentation remains challenging. Most approaches focus on improving phonetic …
The objective of this paper is to develop an unsupervised method for segmentation of speech signals into phoneme-like units. The proposed algorithm is based on the …
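Several of the entries above concern unsupervised segmentation of speech into phoneme-like units. As a point of reference for the task (not the algorithm of any paper listed here), below is a minimal, generic sketch that places boundaries at peaks of frame-to-frame feature dissimilarity; the synthetic feature matrix, the cosine-distance measure, and the thresholds are all placeholder assumptions.

```python
# Generic illustration of unsupervised phoneme-like boundary detection by
# peak-picking on frame-to-frame dissimilarity. Not taken from any of the
# cited papers; features and thresholds are placeholder assumptions.
import numpy as np


def detect_boundaries(features: np.ndarray, min_gap: int = 3) -> list[int]:
    """Return frame indices where the feature trajectory changes sharply.

    features: (num_frames, feature_dim) array, e.g. MFCCs or SSL embeddings.
    min_gap: minimum number of frames allowed between two boundaries.
    """
    # Cosine dissimilarity between consecutive frames.
    a, b = features[:-1], features[1:]
    sim = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-8
    )
    dissim = 1.0 - sim

    # Keep local maxima that exceed mean + one standard deviation.
    threshold = dissim.mean() + dissim.std()
    boundaries: list[int] = []
    for t in range(1, len(dissim) - 1):
        is_peak = dissim[t] > dissim[t - 1] and dissim[t] >= dissim[t + 1]
        if is_peak and dissim[t] > threshold:
            if not boundaries or t - boundaries[-1] >= min_gap:
                boundaries.append(t)
    return boundaries


if __name__ == "__main__":
    # Toy example: three artificial "segments" with different feature means,
    # so boundaries should be reported near frames 40 and 80.
    rng = np.random.default_rng(0)
    segs = [rng.normal(loc=m, size=(40, 13)) for m in (3.0, -3.0, 8.0)]
    feats = np.concatenate(segs, axis=0)
    print(detect_boundaries(feats))
```

Real systems differ mainly in the representation (self-supervised embeddings rather than hand-crafted features) and in how boundaries are scored and refined, but the peak-picking skeleton above captures the basic unsupervised setup.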
S Bhati, S Nayak, SRM Kodukula - Circuits, Systems, and Signal …, 2020 - Springer
This paper presents a new approach for unsupervised segmentation and labeling of acoustically homogeneous segments from speech signals. The virtual labels, thus …
KK Ravi, SR Krothapalli - Circuits, Systems, and Signal Processing, 2022 - Springer
This paper proposes a new method that detects repeated keyword/phrase patterns in speech utterances by performing pattern discovery at the phoneme level. Prior to this …
Voice-enabled interfaces for human-machine interaction have made significant progress in recent years. Most of the success can be attributed to deep neural networks trained on …
With recent developments in Brain-Computer Interfaces (BCI), research in this field is branching into the arts, and especially into music. In our study, a system is developed that analyses …
M Gavrikov, R Sinetsky - 2022 - repo.bibliothek.uni-halle.de
An algorithm for large-scale adaptation of prototype functions representing image classes is proposed. The algorithm identifies the parameters of nonlinear scale distortions contained in …