Contrastive learning of general-purpose audio representations

A Saeed, D Grangier… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
We introduce COLA, a self-supervised pre-training approach for learning a general-purpose
representation of audio. Our approach is based on contrastive learning: it learns a …

Multi-label LSTM autoencoder for non-intrusive appliance load monitoring

S Verma, S Singh, A Majumdar - Electric Power Systems Research, 2021 - Elsevier
This work follows the multi-label classification based paradigm for non-intrusive load
monitoring (NILM). Power consumption signals used for NILM are inherently time varying …

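A survey of speech forgery and detection techniques

任延珍, 刘晨雨, 刘武洋, 王丽娜 - Journal of Signal Processing (信号处理), 2021 - signal.ejournal.org.cn
Speech carries both human language and speaker identity information. Speech forgery techniques can
accurately imitate a target speaker's voice in order to deceive human or machine hearing. At present,
for global political, economic, and social stability, deepfakes (Deepfake) are bringing …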

Uncovering latent style factors for expressive speech synthesis

Y Wang, RJ Skerry-Ryan, Y Xiao, D Stanton… - arXiv preprint arXiv …, 2017 - arxiv.org
Prosodic modeling is a core problem in speech synthesis. The key challenge is producing
desirable prosody from textual input containing only phonetic information. In this preliminary …

Sequence-to-sequence modelling of f0 for speech emotion conversion

C Robinson, N Obin, A Roebel - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Voice interfaces are becoming wildly popular and driving demand for more advanced
speech synthesis and voice transformation systems. Current text-to-speech methods …

pAElla: Edge AI-Based Real-Time Malware Detection in Data Centers

A Libri, A Bartolini, L Benini - IEEE Internet of Things Journal, 2020 - ieeexplore.ieee.org
The increasing use of Internet-of-Things (IoT) devices for monitoring a wide spectrum of
applications, along with the challenges of “big data” streaming support they often require for …

An investigation of noise shaping with perceptual weighting for WaveNet-based speech generation

K Tachibana, T Toda, Y Shiga… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
We propose a noise shaping method to improve the sound quality of speech signals
generated by WaveNet, which is a convolutional neural network (CNN) that predicts a …

In other news: A bi-style text-to-speech model for synthesizing newscaster voice with limited data

N Prateek, M Łajszczak, R Barra-Chicote… - arXiv preprint arXiv …, 2019 - arxiv.org
Neural text-to-speech synthesis (NTTS) models have shown significant progress in
generating high-quality speech, however they require a large quantity of training data. This …

Deep metric learning for visual servoing: when pose and image meet in latent space

S Felton, E Fromont, E Marchand - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
We propose a new visual servoing method that controls a robot's motion in a latent space.
We aim to extract the best properties of two previously proposed servoing methods: we seek …

Control Channel Isolation in SDN Virtualization: A Machine Learning Approach

Y Yoo, G Yang, C Shin, J Lee… - 2023 IEEE/ACM 23rd …, 2023 - ieeexplore.ieee.org
Performance isolation is an essential property that network virtualization must provide for
clouds. This study addresses the performance isolation of the control plane in virtualized …