Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised learning (SSL) targets discovering general representations from large-scale data. This …
We introduce MusicLM, a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff". MusicLM casts the process of …
Self-supervised learning (SSL) has proven vital for advancing research in natural language processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on …
We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts. Two types of diffusion models, a generator …
S Deshmukh, B Elizalde, R Singh… - Advances in Neural …, 2023 - proceedings.neurips.cc
In the domain of audio processing, Transfer Learning has facilitated the rise of Self-Supervised Learning and Zero-Shot Learning techniques. These approaches have led to …
We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a …
A Saeed, D Grangier… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio. Our approach is based on contrastive learning: it learns a …
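The contrastive idea in the COLA snippet can be sketched minimally: embeddings of two segments from the same clip are pulled together while all other pairings in the batch act as negatives, scored by a multi-class cross-entropy over pairwise similarities. The function name, toy embeddings, and cosine similarity here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def contrastive_loss(anchors, positives):
    """Cross-entropy over pairwise similarities: anchor i's correct
    match is positive i (segment from the same clip); all other
    positives in the batch serve as negatives. Illustrative sketch."""
    # L2-normalize so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T                                 # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Correct pairings sit on the diagonal
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
# Positives drawn near their anchors should yield lower loss
aligned = contrastive_loss(emb, emb + 0.01 * rng.normal(size=(8, 16)))
shuffled = contrastive_loss(emb, rng.normal(size=(8, 16)))
```

A trained encoder would make `aligned`-style similarity structure emerge from raw audio; here the embeddings are synthetic just to show the loss behaves as intended.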
Y Gong, YA Chung, J Glass - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Audio tagging is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing model performance, which …
D Niizumi, D Takeuchi, Y Ohishi… - … Joint Conference on …, 2021 - ieeexplore.ieee.org
Inspired by the recent progress in self-supervised learning for computer vision that generates supervision using data augmentations, we explore a new general-purpose audio …
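The augmentation-generated supervision mentioned in this last snippet can be sketched generically: one clip is turned into two "views" via random crop and additive noise, and an SSL objective would then ask a model to match the views. The crop/noise recipe below is a hedged, generic example, not the specific pipeline of the cited work.

```python
import numpy as np

def two_views(wave, crop_len, rng):
    """Produce two augmented views of one clip (random crop + noise).
    A generic augmentation sketch for SSL, not any paper's exact recipe."""
    views = []
    for _ in range(2):
        start = rng.integers(0, len(wave) - crop_len + 1)
        crop = wave[start:start + crop_len]
        views.append(crop + 0.01 * rng.normal(size=crop_len))
    return views

rng = np.random.default_rng(1)
wave = np.sin(np.linspace(0, 40 * np.pi, 16000))  # 1-second toy "audio"
v1, v2 = two_views(wave, crop_len=8000, rng=rng)
```

The two views share underlying content but differ in position and noise, which is exactly the kind of pairing a contrastive or bootstrap SSL objective consumes.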