How2: A Large-scale Dataset for Multimodal Language Understanding R Sanabria, O Caglayan, S Palaskar, D Elliott, L Barrault, L Specia, ... NIPS 2018 Workshop, 2018 | 272 | 2018 |
Sequence-based Multi-lingual Low Resource Speech Recognition S Dalmia, R Sanabria, F Metze, AW Black IEEE ICASSP 2018, 2018 | 108 | 2018 |
The IWSLT 2019 evaluation campaign J Niehues, R Cattoni, S Stüker, M Negri, M Turchi, E Salesky, R Sanabria, ... EMNLP (IWSLT), 2019 | 98* | 2019 |
Hierarchical Multi Task Learning With CTC R Sanabria, F Metze IEEE SLT 2018, 2018 | 67 | 2018 |
Comparison of decoding strategies for CTC acoustic models T Zenkel, R Sanabria, F Metze, J Niehues, M Sperber, S Stüker, A Waibel INTERSPEECH 2017, 2017 | 56 | 2017 |
End-to-End Multimodal Speech Recognition S Palaskar, R Sanabria, F Metze IEEE ICASSP 2018, 2018 | 47 | 2018 |
CMU Sinbad's submission for the DSTC7 AVSD challenge R Sanabria, S Palaskar, F Metze AAAI 2019 (DSTC), 2019 | 43 | 2019 |
Subword and Crossword Units for CTC Acoustic Models T Zenkel, R Sanabria, F Metze, A Waibel INTERSPEECH 2018, 2017 | 38 | 2017 |
Multimodal Grounding for Sequence-to-sequence Speech Recognition O Caglayan, R Sanabria, S Palaskar, L Barrault, F Metze IEEE ICASSP 2019, 2019 | 31 | 2019 |
Talk, Don't Write: A Study of Direct Speech-Based Image Retrieval R Sanabria, A Waters, J Baldridge INTERSPEECH 2021, 2021 | 23 | 2021 |
Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models R Sanabria, H Tang, S Goldwater IEEE ICASSP 2023, 2022 | 19 | 2022 |
Looking Enhances Listening: Recovering Missing Speech Using Images T Srinivasan, R Sanabria, F Metze IEEE ICASSP 2020, 2020 | 15 | 2020 |
Multimodal Speech Recognition with Unstructured Audio Masking T Srinivasan, R Sanabria, F Metze, D Elliott EMNLP 2020 (Workshop), 2020 | 13 | 2020 |
The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR R Sanabria, N Bogoychev, N Markl, A Carmantini, O Klejch, P Bell IEEE ICASSP 2023, 2023 | 12 | 2023 |
Measuring the impact of individual domain factors in self-supervised pre-training R Sanabria, WN Hsu, A Baevski, M Auli IEEE ICASSP 2023 (SASB), 2022 | 11 | 2022 |
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions T Srinivasan, R Sanabria, F Metze ICML 2019 (Workshop), 2019 | 11 | 2019 |
Fine-Grained Grounding for Multimodal Speech Recognition T Srinivasan, R Sanabria, F Metze, D Elliott Findings of EMNLP 2020, 2020 | 9 | 2020 |
Robust end-to-end deep audiovisual speech recognition R Sanabria, F Metze, F De La Torre arXiv preprint arXiv:1611.06986, 2016 | 9 | 2016 |
Transfer learning for multimodal dialog S Palaskar, R Sanabria, F Metze Computer Speech & Language, 2020 | 8 | 2020 |
On the Difficulty of Segmenting Words with Attention R Sanabria, H Tang, S Goldwater EMNLP 2021 (Insights from Negative Results in NLP), 2021 | 6 | 2021 |