A comparative study on Transformer vs RNN in speech applications S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ... 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 821 | 2019 |
ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit T Hayashi, R Yamamoto, K Inoue, T Yoshimura, S Watanabe, T Toda, ... ICASSP 2020-2020 IEEE international conference on acoustics, speech and …, 2020 | 226 | 2020 |
ESPnet2-TTS: Extending the edge of TTS research T Hayashi, R Yamamoto, T Yoshimura, P Wu, J Shi, T Saeki, Y Ju, ... arXiv preprint arXiv:2110.07840, 2021 | 52 | 2021 |
End-to-end automatic speech recognition integrated with CTC-based voice activity detection T Yoshimura, T Hayashi, K Takeda, S Watanabe ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 48 | 2020 |
A hierarchical predictor of synthetic speech naturalness using neural networks T Yoshimura, GE Henter, O Watts, M Wester, J Yamagishi, K Tokuda Interspeech 2016, 342-346, 2016 | 34 | 2016 |
Conformer-based ID-aware autoencoder for unsupervised anomalous sound detection T Hayashi, T Yoshimura, Y Adachi DCASE2020 Challenge, Tech. Rep., 2020 | 26 | 2020 |
Statistical voice conversion based on WaveNet J Niwa, T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 23 | 2018 |
Mel-cepstrum-based quantization noise shaping applied to neural-network-based speech waveform synthesis T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (7), 1177 …, 2018 | 16 | 2018 |
Anomalous sound detection with ensemble of autoencoder and binary classification approaches I Kuroyanagi, T Hayashi, Y Adachi, T Yoshimura, K Takeda, T Toda DCASE2021 Challenge, 2021 | 14 | 2021 |
Speaker-dependent WaveNet-based delay-free ADPCM speech coding T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 12 | 2019 |
An Ensemble Approach to Anomalous Sound Detection Based on Conformer-Based Autoencoder and Binary Classifier Incorporated with Metric Learning I Kuroyanagi, T Hayashi, Y Adachi, T Yoshimura, K Takeda, T Toda DCASE, 110-114, 2021 | 11 | 2021 |
Cross-lingual speaker adaptation based on factor analysis using bilingual speech data for HMM-based speech synthesis T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda SSW, 297-302, 2013 | 9 | 2013 |
Articulatory text-to-speech synthesis using the digital waveguide mesh driven by a deep neural network AJ Gully, T Yoshimura, DT Murphy, K Hashimoto, Y Nankaku, K Tokuda Interspeech 2017, 234-238, 2017 | 7 | 2017 |
Neural sequence-to-sequence speech synthesis using a hidden semi-Markov model based structured attention mechanism Y Nankaku, K Sumiya, T Yoshimura, S Takaki, K Hashimoto, K Oura, ... arXiv preprint arXiv:2108.13985, 2021 | 6 | 2021 |
SPTK4: An Open-Source Software Toolkit for Speech Signal Processing T Yoshimura, T Fujimoto, K Oura, K Tokuda 12th Speech Synthesis Workshop (SSW) 2023, 2023 | 5 | 2023 |
Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System T Yoshimura, S Takaki, K Nakamura, K Oura, Y Hono, K Hashimoto, ... ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 5 | 2023 |
Speech synthesis using WaveNet vocoder based on periodic/aperiodic decomposition T Fujimoto, T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda 2018 Asia-Pacific Signal and Information Processing Association Annual …, 2018 | 5 | 2018 |
Discriminative feature extraction based on sequential variational autoencoder for speaker recognition T Yoshimura, N Koike, K Hashimoto, K Oura, Y Nankaku, K Tokuda 2018 Asia-Pacific Signal and Information Processing Association Annual …, 2018 | 5 | 2018 |
Spontaneous Speech Summarization: Transformers All The Way Through T Hayashi, T Yoshimura, M Inuzuka, I Kuroyanagi, O Segawa 2021 29th European Signal Processing Conference (EUSIPCO), 456-460, 2021 | 4 | 2021 |
WaveNet-based zero-delay lossless speech coding T Yoshimura, K Hashimoto, K Oura, Y Nankaku, K Tokuda 2018 IEEE Spoken Language Technology Workshop (SLT), 153-158, 2018 | 4 | 2018 |