What does BERT learn about the structure of language? G Jawahar, B Sagot, D Seddah 57th Annual Meeting of the Association for Computational Linguistics (ACL …, 2019 | 1409 | 2019 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1365 | 2023 |
CamemBERT: a Tasty French Language Model L Martin, B Muller, PJ Ortiz Suárez, Y Dupont, L Romary, ... Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020 | 1123 | 2020 |
Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures PJ Ortiz Suárez, B Sagot, L Romary Challenges in the Management of Large Corpora (CMLC-7) 2019, 9, 2019 | 437* | 2019 |
The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French B Sagot LREC 2010, 2010 | 303* | 2010 |
Building a free French wordnet from multilingual resources B Sagot, D Fišer Ontolex 2008, 2008 | 250 | 2008 |
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages PJ Ortiz Suárez, L Romary, B Sagot Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020 | 222* | 2020 |
Coupling an annotated corpus and a morphosyntactic lexicon for state-of-the-art POS tagging with less human effort P Denis, B Sagot PACLIC 2009, 2009 | 171 | 2009 |
Controllable sentence simplification L Martin, B Sagot, E de la Clergerie, A Bordes arXiv preprint arXiv:1910.02677, 2019 | 169 | 2019 |
MUSS: Multilingual unsupervised sentence simplification by mining paraphrases L Martin, A Fan, E de la Clergerie, A Bordes, B Sagot arXiv preprint arXiv:2005.00352, 2020 | 150* | 2020 |
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations F Alva-Manchego, L Martin, A Bordes, C Scarton, B Sagot, L Specia Proceedings of the 58th Annual Meeting of the Association for Computational …, 2020 | 139 | 2020 |
Universal Dependencies 2.5 D Zeman, J Nivre, et al. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied …, 2020 | 128 | 2020 |
Towards a cleaner document-oriented multilingual crawled corpus J Abadji, PJ Ortiz Suárez, L Romary, B Sagot arXiv preprint arXiv:2201.06642, 2022 | 125 | 2022 |
When being unseen from mBERT is just the beginning: Handling new languages with multilingual language models B Muller, A Anastasopoulos, B Sagot, D Seddah arXiv preprint arXiv:2010.12858, 2020 | 125 | 2020 |
The Lefff 2 syntactic lexicon for French: architecture, acquisition, use B Sagot, L Clément, E de la Clergerie, P Boullier LREC 2006, 2006 | 112 | 2006 |
Quality at a glance: An audit of web-crawled multilingual datasets J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... Transactions of the Association for Computational Linguistics 10, 50-72, 2022 | 110 | 2022 |
Influence of pre-annotation on POS-tagged corpus development K Fort, B Sagot The fourth ACL linguistic annotation workshop, 56-63, 2010 | 105 | 2010 |
Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging P Denis, B Sagot Language resources and evaluation 46 (4), 721-736, 2012 | 95 | 2012 |
Morphology based automatic acquisition of large-coverage lexica L Clément, B Lang, B Sagot LREC 2004, 2004 | 84 | 2004 |
The CoMeRe corpus for French: structuring and annotating heterogeneous CMC genres T Chanier, C Poudat, B Sagot, G Antoniadis, CR Wigham, L Hriba, ... Journal for language technology and computational linguistics 29 (2), 1-30, 2014 | 79 | 2014 |