Fader networks: Manipulating images by sliding attributes G Lample, N Zeghidour, N Usunier, A Bordes, L Denoyer, MA Ranzato Advances in Neural Information Processing Systems, 2017 | 605 | 2017 |
SoundStream: An end-to-end neural audio codec N Zeghidour, A Luebs, A Omran, J Skoglund, M Tagliasacchi IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 495-507, 2021 | 420 | 2021 |
MusicLM: Generating music from text A Agostinelli, TI Denk, Z Borsos, J Engel, M Verzetti, A Caillon, Q Huang, ... arXiv preprint arXiv:2301.11325, 2023 | 405 | 2023 |
AudioLM: a language modeling approach to audio generation Z Borsos, R Marinier, D Vincent, E Kharitonov, O Pietquin, M Sharifi, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2523-2533, 2023 | 359 | 2023 |
Wavesplit: End-to-end speech separation by speaker clustering N Zeghidour, D Grangier IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 2840-2849, 2021 | 263 | 2021 |
Contrastive learning of general-purpose audio representations A Saeed, D Grangier, N Zeghidour ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 259 | 2021 |
LEAF: A Learnable Frontend for Audio Classification N Zeghidour, O Teboul, FC Quitry, M Tagliasacchi ICLR 2021, 2021 | 151 | 2021 |
Learning Filterbanks from Raw Speech for Phone Recognition N Zeghidour, N Usunier, I Kokkinos, T Schatz, G Synnaeve, E Dupoux ICASSP 2018, 2017 | 136 | 2017 |
Fully convolutional speech recognition N Zeghidour, Q Xu, V Liptchinsky, N Usunier, G Synnaeve, R Collobert arXiv preprint arXiv:1812.06864, 2018 | 112 | 2018 |
End-to-end speech recognition from the raw waveform N Zeghidour, N Usunier, G Synnaeve, R Collobert, E Dupoux Interspeech 2018, 2018 | 111 | 2018 |
Speak, read and prompt: High-fidelity text-to-speech with minimal supervision E Kharitonov, D Vincent, Z Borsos, R Marinier, S Girgin, O Pietquin, ... Transactions of the Association for Computational Linguistics 11, 1703-1718, 2023 | 108 | 2023 |
AudioPaLM: A large language model that can speak and listen PK Rubenstein, C Asawaroengchai, DD Nguyen, A Bapna, Z Borsos, ... arXiv preprint arXiv:2306.12925, 2023 | 101 | 2023 |
SING: Symbol-to-instrument neural generator A Défossez, N Zeghidour, N Usunier, L Bottou, F Bach Advances in Neural Information Processing Systems, 2018 | 79 | 2018 |
Joint learning of speaker and phonetic similarities with siamese networks N Zeghidour, G Synnaeve, N Usunier, E Dupoux INTERSPEECH, 1295-1299, 2016 | 65 | 2016 |
General-purpose, long-context autoregressive modeling with Perceiver AR C Hawthorne, A Jaegle, C Cangea, S Borgeaud, C Nash, M Malinowski, ... International Conference on Machine Learning, 8535-8558, 2022 | 61 | 2022 |
SoundStorm: Efficient parallel audio generation Z Borsos, M Sharifi, D Vincent, E Kharitonov, N Zeghidour, M Tagliasacchi arXiv preprint arXiv:2305.09636, 2023 | 56 | 2023 |
Learning strides in convolutional neural networks R Riad, O Teboul, D Grangier, N Zeghidour arXiv preprint arXiv:2202.01653, 2022 | 49 | 2022 |
A Deep Scattering Spectrum - Deep Siamese network Pipeline For Unsupervised Acoustic Modeling N Zeghidour, G Synnaeve, M Versteegh, E Dupoux 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 48 | 2016 |
Multi-instrument music synthesis with spectrogram diffusion C Hawthorne, I Simon, A Roberts, N Zeghidour, J Gardner, E Manilow, ... arXiv preprint arXiv:2206.05408, 2022 | 44 | 2022 |
To reverse the gradient or not: An empirical comparison of adversarial and multi-task learning in speech recognition Y Adi, N Zeghidour, R Collobert, N Usunier, V Liptchinsky, G Synnaeve ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 43 | 2019 |