modeling of spectral sequences. However, the converted speech still contains traces of
artificial sounds. To alleviate this, it is necessary to statistically model a source sequence as
well as a spectral sequence. In this paper, we introduce STRAIGHT mixed excitation into a
voice conversion framework based on a Gaussian mixture model (GMM) of the joint
probability density of source and target features. We convert both spectral and source …