An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

Dress is a fundamental component of person perception

N Hester, E Hehman - Personality and Social Psychology …, 2023 - journals.sagepub.com
Academic Abstract: Clothing, hairstyle, makeup, and accessories influence first impressions.
However, target dress is notably absent from current theories and models of person …

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

GAN memory with no forgetting

Y Cong, M Zhao, J Li, S Wang… - Advances in Neural …, 2020 - proceedings.neurips.cc
As a fundamental issue in lifelong learning, catastrophic forgetting is directly caused by
inaccessible historical data; accordingly, if the data (information) were memorized perfectly …

Learning attribute-driven disentangled representations for interactive fashion retrieval

Y Hou, E Vig, M Donoser… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Interactive retrieval for online fashion shopping provides the ability of changing image
retrieval results according to the user feedback. One common problem in interactive retrieval …

On leveraging pretrained GANs for generation with limited data

M Zhao, Y Cong, L Carin - International Conference on …, 2020 - proceedings.mlr.press
Recent work has shown generative adversarial networks (GANs) can generate highly
realistic images, that are often indistinguishable (by humans) from real images. Most images …

Artificial intelligence in the fashion industry: consumer responses to generative adversarial network (GAN) technology

K Sohn, CE Sung, G Koo, O Kwon - International Journal of Retail & …, 2020 - emerald.com
Purpose: This study examines consumers' evaluations of product consumption values,
purchase intentions and willingness to pay for fashion products designed using generative …

A review on Single Image Super Resolution techniques using generative adversarial network

K Singla, R Pandey, U Ghanekar - Optik, 2022 - Elsevier
Abstract: Single Image Super Resolution (SISR) is a process to obtain a high pixel density
and refined details from a low resolution (LR) image to get upscaled and sharper high …

Transforming spectrum and prosody for emotional voice conversion with non-parallel training data

K Zhou, B Sisman, H Li - arXiv preprint arXiv:2002.00198, 2020 - arxiv.org
Emotional voice conversion aims to convert the spectrum and prosody to change the
emotional patterns of speech, while preserving the speaker identity and linguistic content …

Talking human face generation: A survey

M Toshpulatov, W Lee, S Lee - Expert Systems with Applications, 2023 - Elsevier
Talking human face generation aims at synthesizing a natural human face that talks in
correspondence to the given text or audio series. Implementing the recently developed …