Findings of the 2019 conference on machine translation (WMT19)

L Barrault, O Bojar, MR Costa-Jussa, C Federmann… - 2019 - zora.uzh.ch
This paper presents the results of the premier shared task organized alongside the
Conference on Machine Translation (WMT) 2019. Participants were asked to build machine …

Survey of low-resource machine translation

B Haddow, R Bawden, AVM Barone, J Helcl… - Computational …, 2022 - direct.mit.edu
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …

Findings of the 2021 conference on machine translation (WMT21)

F Akhbardeh, A Arkhangorodsky, M Biesialska… - Proceedings of the sixth …, 2021 - cris.fbk.eu
This paper presents the results of the news translation task, the multilingual low-resource
translation for Indo-European languages, the triangular translation task, and the automatic …

An efficient transformer decoder with compressed sub-layers

Y Li, Y Lin, T Xiao, J Zhu - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
The large attention-based encoder-decoder network (Transformer) has recently become
prevalent due to its effectiveness, but the high computational complexity of its decoder raises …
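
As context for why the decoder is the bottleneck (a standard back-of-the-envelope estimate, not a claim taken from the paper's snippet): autoregressive inference runs the decoder once per output token while the encoder runs only once, so for output length n, source length m, and model width d, the decoder's total cost is roughly

```latex
C_{\text{dec}} \;\approx\;
\underbrace{O(n^2 d)}_{\text{self-attention}}
\;+\;
\underbrace{O(nmd)}_{\text{cross-attention}}
\;+\;
\underbrace{O(nd^2)}_{\text{feed-forward}}
```

versus a single O(m^2 d + m d^2) pass for the encoder.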

ODE transformer: An ordinary differential equation-inspired model for sequence generation

B Li, Q Du, T Zhou, Y Jing, S Zhou, X Zeng… - arXiv preprint arXiv …, 2022 - arxiv.org
Residual networks are an Euler discretization of solutions to Ordinary Differential Equations
(ODE). This paper explores a deeper relationship between Transformer and numerical ODE …
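
For readers who have not seen the correspondence the snippet alludes to, it is a textbook identity rather than anything specific to this paper: a residual block computes one explicit Euler step with unit step size,

```latex
y_{l+1} \;=\; y_l + F(y_l)
\quad\Longleftrightarrow\quad
y(t + \Delta t) \;=\; y(t) + \Delta t \, f\bigl(y(t)\bigr)
\quad \text{with } \Delta t = 1,\; f \equiv F .
```

Stacking L such layers thus traces an Euler discretization of dy/dt = f(y) over L unit steps; higher-order solvers such as Runge-Kutta then suggest richer block designs, which is the direction the abstract points toward.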

Prompting neural machine translation with translation memories

A Reheman, T Zhou, Y Luo, D Yang, T Xiao… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Improving machine translation (MT) systems with translation memories (TMs) is of great
interest to practitioners in the MT community. However, previous approaches require either a …
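
A common first step when using TMs, whatever the specific prompting approach this paper proposes, is fuzzy retrieval: find stored source sentences similar to the input and reuse their translations as hints. The sketch below uses Python's standard-library difflib as a stand-in similarity measure; real systems typically use word-level fuzzy-match scores or learned retrieval, and the 0.6 threshold is an arbitrary illustrative choice.

```python
import difflib

def retrieve_tm_matches(source, tm_pairs, k=1, threshold=0.6):
    """Return up to k (tm_source, tm_target) pairs whose source side is
    similar to `source`. Similarity is a character-level ratio in [0, 1];
    an illustrative baseline only, not the paper's method."""
    scored = []
    for tm_src, tm_tgt in tm_pairs:
        sim = difflib.SequenceMatcher(None, source, tm_src).ratio()
        if sim >= threshold:
            scored.append((sim, tm_src, tm_tgt))
    scored.sort(key=lambda x: x[0], reverse=True)
    return [(tm_src, tm_tgt) for _, tm_src, tm_tgt in scored[:k]]
```

Retrieved pairs can then be concatenated with the test input as an extended prompt so the decoder can copy from the matched translation.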

Weight distillation: Transferring the knowledge in neural network parameters

Y Lin, Y Li, Z Wang, B Li, Q Du, T Xiao, J Zhu - arXiv preprint arXiv …, 2020 - arxiv.org
Knowledge distillation has proven effective for model acceleration and
compression. It allows a small network to learn to generalize in the same way as a large …
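
The snippet's first sentence refers to classic prediction-level knowledge distillation (Hinton et al., 2015). The sketch below shows that baseline loss in PyTorch for orientation only; weight distillation, the paper's actual contribution, instead transfers knowledge through the teacher's parameters, which is not shown here.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Classic knowledge distillation: cross-entropy on gold labels plus
    a KL term pulling the student's temperature-softened distribution
    toward the teacher's (Hinton et al., 2015)."""
    ce = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the unsoftened case
    return alpha * ce + (1.0 - alpha) * kl
```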

Neural machine translation for the indigenous languages of the Americas: An introduction

M Mager, R Bhatnagar, G Neubig, NT Vu… - arXiv preprint arXiv …, 2023 - arxiv.org
Neural models have drastically advanced the state of the art for machine translation (MT)
between high-resource languages. Traditionally, these models rely on large amounts of …

eTranslation's submissions to the WMT 2020 news translation task

C Oravecz, K Bontcheva, L Tihanyi… - Proceedings of the …, 2020 - aclanthology.org
The paper describes the submissions of the eTranslation team to the WMT 2020 news
translation shared task. Leveraging the experience from the team's participation last year, we …

HintedBT: Augmenting back-translation with quality and transliteration hints

S Ramnath, M Johnson, A Gupta… - arXiv preprint arXiv …, 2021 - arxiv.org
Back-translation (BT) of target monolingual corpora is a widely used data augmentation
strategy for neural machine translation (NMT), especially for low-resource language pairs …
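
For orientation, vanilla back-translation is simple: translate target-language monolingual text into the source language with a reverse model, then treat the (synthetic source, real target) pairs as extra training data. The sketch below assumes a hypothetical reverse_model object with a translate method; HintedBT's contribution, attaching quality and transliteration hints to these pairs, is not shown.

```python
def back_translate(target_monolingual, reverse_model):
    """Vanilla back-translation. `reverse_model` is a placeholder for any
    target->source MT system (its `translate` method is assumed here)."""
    synthetic_pairs = []
    for tgt_sentence in target_monolingual:
        synthetic_src = reverse_model.translate(tgt_sentence)  # target -> source
        synthetic_pairs.append((synthetic_src, tgt_sentence))  # train source -> target
    return synthetic_pairs
```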