VectorFusion: Text-to-SVG by abstracting pixel-based diffusion models

A Jain, A Xie, P Abbeel - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Diffusion models have shown impressive results in text-to-image synthesis. Using massive
datasets of captioned images, diffusion models learn to generate raster images of highly …

The challenge of realistic music generation: modelling raw audio at scale

S Dieleman, A Van Den Oord… - Advances in neural …, 2018 - proceedings.neurips.cc
Realistic music generation is a challenging task. When building generative models of music
that are learnt from data, typically high-level representations such as scores or MIDI are …

Parallel iterative edit models for local sequence transduction

A Awasthi, S Sarawagi, R Goyal, S Ghosh… - arXiv preprint arXiv …, 2019 - arxiv.org
We present a Parallel Iterative Edit (PIE) model for the problem of local sequence
transduction arising in tasks like Grammatical error correction (GEC). Recent approaches …

Insertion-based decoding with automatically inferred generation order

J Gu, Q Liu, K Cho - Transactions of the Association for Computational …, 2019 - direct.mit.edu
Conventional neural autoregressive decoding commonly assumes a fixed left-to-right
generation order, which may be sub-optimal. In this work, we propose a novel decoding …

Real-time voice cloning

C Jemine - 2019 - matheo.uliege.be
Recent advances in deep learning have shown impressive results in the domain of
text-to-speech. To this end, a deep neural network is usually trained using a corpus of several …

Neural canonical transformation with symplectic flows

SH Li, CX Dong, L Zhang, L Wang - Physical Review X, 2020 - APS
Canonical transformation plays a fundamental role in simplifying and solving classical
Hamiltonian systems. Intriguingly, it has a natural correspondence to normalizing flows with …

Seq-U-Net: A one-dimensional causal U-Net for efficient sequence modelling

D Stoller, M Tian, S Ewert, S Dixon - arXiv preprint arXiv:1911.06393, 2019 - arxiv.org
Convolutional neural networks (CNNs) with dilated filters such as the Wavenet or the
Temporal Convolutional Network (TCN) have shown good results in a variety of sequence …

Complex-valued neural networks for machine learning on non-stationary physical data

JS Dramsch, M Lüthje, AN Christensen - Computers & Geosciences, 2021 - Elsevier
Deep learning has become an area of interest in most scientific areas, including physical
sciences. Modern networks apply real-valued transformations on the data. Particularly …

Fast and Flexible Neural Audio Synthesis

L Hantrakul, JH Engel, A Roberts, C Gu - ISMIR, 2019 - archives.ismir.net
Autoregressive neural networks, such as WaveNet, have opened up new avenues for
expressive audio synthesis. High-quality speech synthesis utilizes detailed linguistic …

Anytime sampling for autoregressive models via ordered autoencoding

Y Xu, Y Song, S Garg, L Gong, R Shu, A Grover… - arXiv preprint arXiv …, 2021 - arxiv.org
Autoregressive models are widely used for tasks such as image and audio generation. The
sampling process of these models, however, does not allow interruptions and cannot adapt …