Google's Next-Generation Real-Time Unit-Selection Synthesizer Using Sequence-to-Sequence...

A Saeed, D Grangier… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose
representation of audio. Our approach is based on contrastive learning: it learns a …

Cited by 306 - Related articles - All 7 versions

[PDF] github.io

Multi-label LSTM autoencoder for non-intrusive appliance load monitoring

S Verma, S Singh, A Majumdar - Electric Power Systems Research, 2021 - Elsevier

This work follows the multi-label classification based paradigm for non-intrusive load
monitoring (NILM). Power consumption signals used for NILM are inherently time varying …

Cited by 48 - Related articles - All 5 versions

[PDF] ejournal.org.cn

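[PDF] A survey of speech forgery and detection techniques

Ren Yanzhen, Liu Chenyu, Liu Wuyang, Wang Lina - Journal of Signal Processing (信号处理), 2021 - signal.ejournal.org.cn

Speech carries both human language and speaker identity information. Speech forgery techniques can precisely imitate a target speaker's voice in order to
deceive human or machine listeners. At present, deepfake technology is bringing to global politics, the economy, and social stability …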

Cited by 7 - Related articles - All 2 versions

[PDF] arxiv.org

Uncovering latent style factors for expressive speech synthesis

Y Wang, RJ Skerry-Ryan, Y Xiao, D Stanton… - arXiv preprint arXiv …, 2017 - arxiv.org

Prosodic modeling is a core problem in speech synthesis. The key challenge is producing
desirable prosody from textual input containing only phonetic information. In this preliminary …

Cited by 89 - Related articles - All 4 versions

[PDF] sorbonne-universite.fr

Sequence-to-sequence modelling of f0 for speech emotion conversion

C Robinson, N Obin, A Roebel - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org

Voice interfaces are becoming wildly popular and driving demand for more advanced
speech synthesis and voice transformation systems. Current text-to-speech methods …

Cited by 57 - Related articles - All 9 versions

[PDF] arxiv.org

pAElla: Edge AI-Based Real-Time Malware Detection in Data Centers

A Libri, A Bartolini, L Benini - IEEE Internet of Things Journal, 2020 - ieeexplore.ieee.org

The increasing use of Internet-of-Things (IoT) devices for monitoring a wide spectrum of
applications, along with the challenges of “big data” streaming support they often require for …

Cited by 39 - Related articles - All 5 versions

An investigation of noise shaping with perceptual weighting for WaveNet-based speech generation

K Tachibana, T Toda, Y Shiga… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

We propose a noise shaping method to improve the sound quality of speech signals
generated by WaveNet, which is a convolutional neural network (CNN) that predicts a …

Cited by 51 - Related articles - All 5 versions

[PDF] arxiv.org

In other news: A bi-style text-to-speech model for synthesizing newscaster voice with limited data

N Prateek, M Łajszczak, R Barra-Chicote… - arXiv preprint arXiv …, 2019 - arxiv.org

Neural text-to-speech synthesis (NTTS) models have shown significant progress in
generating high-quality speech, however they require a large quantity of training data. This …

Cited by 34 - Related articles - All 9 versions

[PDF] hal.science

Deep metric learning for visual servoing: when pose and image meet in latent space

S Felton, E Fromont, E Marchand - 2023 IEEE International …, 2023 - ieeexplore.ieee.org

We propose a new visual servoing method that controls a robot's motion in a latent space.
We aim to extract the best properties of two previously proposed servoing methods: we seek …

Cited by 9 - Related articles - All 5 versions

[PDF] github.io

Control Channel Isolation in SDN Virtualization: A Machine Learning Approach

Y Yoo, G Yang, C Shin, J Lee… - 2023 IEEE/ACM 23rd …, 2023 - ieeexplore.ieee.org

Performance isolation is an essential property that network virtualization must provide for
clouds. This study addresses the performance isolation of the control plane in virtualized …

Cited by 8 - Related articles - All 5 versions
