Title | Authors | Venue | Cited by | Year |
NLP evaluation in trouble: On the need to measure LLM data contamination for each benchmark | O Sainz, JA Campos, I García-Ferrero, J Etxaniz, OL de Lacalle, E Agirre | arXiv preprint arXiv:2310.18018, 2023 | 77 | 2023 |
Do Multilingual Language Models Think Better in English? | J Etxaniz, G Azkune, A Soroa, OL de Lacalle, M Artetxe | arXiv preprint arXiv:2308.01223, 2023 | 18 | 2023 |
Lessons from the Trenches on Reproducible Evaluation of Language Models | S Biderman, H Schoelkopf, L Sutawika, L Gao, J Tow, B Abbasi, AF Aji, ... | arXiv preprint arXiv:2405.14782, 2024 | 8 | 2024 |
Latxa: An open language model and evaluation suite for Basque | J Etxaniz, O Sainz, N Perez, I Aldabe, G Rigau, E Agirre, A Ormazabal, ... | arXiv preprint arXiv:2403.20266, 2024 | 5 | 2024 |
BertaQA: How Much Do Language Models Know About Local Culture? | J Etxaniz, G Azkune, A Soroa, OL de Lacalle, M Artetxe | arXiv preprint arXiv:2406.07302, 2024 | | 2024 |
XNLIeu: a dataset for cross-lingual NLI in Basque | M Heredia, J Etxaniz, M Zulaika, X Saralegi, J Barnes, A Soroa | arXiv preprint arXiv:2404.06996, 2024 | | 2024 |
IKER-GAITU: research on language technology for Basque and other low-resource languages | E Agirre, I Aldabe, X Arregi, M Artetxe, U Atutxa, E Azurmendi, ... | | | 2024 |
Grounding Language Models for Compositional and Spatial Reasoning | J Etxaniz Aragoneses | | | 2023 |
ProMeta: softwarearen garapenerako prozesuen definizio eta ezarpenerako sistema metaereduetan oinarrituta | J Etxaniz Aragoneses | ADDI, 2021 | | 2021 |