Findings of the 2021 Conference on Machine Translation (WMT21)

F Akhbardeh, A Arkhangorodsky, M Biesialska… - Proceedings of the sixth …, 2021 - cris.fbk.eu
This paper presents the results of the news translation task, the multilingual low-resource
translation for Indo-European languages, the triangular translation task, and the automatic …

Language technology programme for Icelandic 2019-2023

AB Nikulásdóttir, J Guðnason, AK Ingason… - arXiv preprint arXiv …, 2020 - arxiv.org
In this paper, we describe a new national language technology programme for Icelandic.
The programme, which spans a period of five years, aims at making Icelandic usable in …

EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models

S Ji, Z Li, I Paul, J Paavola, P Lin, P Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce EMMA-500, a large-scale multilingual language model continue-
trained on texts across 546 languages designed for enhanced multilingual performance …

Training and adapting multilingual NMT for less-resourced and morphologically rich languages

M Rikters, M Pinnis, R Krišlauks - Proceedings of the eleventh …, 2018 - aclanthology.org
In this paper, we present results of employing multilingual and multi-way neural machine
translation approaches for morphologically rich languages, such as Estonian and Russian …

Multi-hypothesis machine translation evaluation

M Fomicheva, L Specia… - Proceedings of the 58th …, 2020 - eprints.whiterose.ac.uk
Reliably evaluating Machine Translation (MT) through automated metrics is a long-standing
problem. One of the main challenges is the fact that multiple outputs can be equally valid …

Compiling and filtering ParIce: an English-Icelandic parallel corpus

S Barkarson, S Steingrímsson - … of the 22nd Nordic Conference on …, 2019 - aclanthology.org
We present ParIce, a new English-Icelandic parallel corpus. This is the first parallel corpus
built for the purposes of language technology development and research for Icelandic …

Addressing the data gap: building a parallel corpus for Kashmiri language

SMU Qumar, M Azim, SMK Quadri - International Journal of Information …, 2024 - Springer
This paper marks a significant step forward in language technology for low-resource
languages by developing the first parallel corpus for the Kashmiri language, which …

Experimenting with different machine translation models in medium-resource settings

HP Jónsson, HB Símonarson, V Snæbjarnarson… - … Conference on Text …, 2020 - Springer
State-of-the-art machine translation (MT) systems rely on the availability of large parallel
corpora, containing millions of sentence pairs. For the Icelandic language, the parallel …

Learning multilingual and multimodal representations with language-specific encoders and decoders for machine translation

C Escolano Peinado - 2022 - upcommons.upc.edu
This thesis aims to study different language-specific approaches for Multilingual Machine
Translation without parameter sharing and their properties compared to the current state-of …

Customized neural machine translation systems for the Swiss legal domain

R Martínez-Domínguez, M Rikters… - Proceedings of the …, 2020 - aclanthology.org
This paper describes Tilde's work on the development of a Neural Machine Translation
(NMT) platform for Hieronymus, a Switzerland-based boutique legal and financial translation …