Domain adaptation and multi-domain adaptation for neural machine translation: A survey

D Saunders - Journal of Artificial Intelligence Research, 2022 - jair.org
The development of deep learning techniques has allowed Neural Machine Translation
(NMT) models to become extremely powerful, given sufficient training data and training time …

Learning to generalize to more: Continuous semantic augmentation for neural machine translation

X Wei, H Yu, Y Hu, R Weng, W Luo, J Xie… - arXiv preprint arXiv …, 2022 - arxiv.org
The principal task in supervised neural machine translation (NMT) is to learn to generate
target sentences conditioned on the source inputs from a set of parallel sentence pairs, and …
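
The augmentation idea can be illustrated with a small sketch: sample a conditioning vector from the region between two sentence-level embeddings of a parallel pair. This is a minimal illustration, assuming sentence embeddings are available; the function name, the convex-interpolation sampler, and the noise term are hypothetical simplifications, not the paper's exact method.

    import torch

    def sample_augmented_vector(src_emb: torch.Tensor,
                                tgt_emb: torch.Tensor,
                                noise_std: float = 0.1) -> torch.Tensor:
        """Sample one vector per pair from the region spanned by the
        embeddings of a source sentence and its reference translation
        (a hypothetical stand-in for the paper's semantic region)."""
        ratio = torch.rand(src_emb.size(0), 1, device=src_emb.device)
        mixed = ratio * src_emb + (1.0 - ratio) * tgt_emb  # convex mix
        return mixed + noise_std * torch.randn_like(mixed)  # local noise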

Shallow-to-deep training for neural machine translation

B Li, Z Wang, H Liu, Y Jiang, Q Du, T Xiao… - arXiv preprint arXiv …, 2020 - arxiv.org
Deep encoders have been proven to be effective in improving neural machine translation
(NMT) systems, but training an extremely deep encoder is time consuming. Moreover, why …
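
The shallow-to-deep idea can be sketched as progressive stacking: train a shallow encoder, then grow it by copying the trained layers and resume training. A minimal sketch, assuming each element of the ModuleList is one encoder layer; the doubling schedule and the helper name are illustrative assumptions, not the paper's exact recipe.

    import copy
    import torch.nn as nn

    def deepen_encoder(layers: nn.ModuleList) -> nn.ModuleList:
        """Double encoder depth by appending copies of trained layers,
        so the deeper model starts from a warm initialization."""
        grown = [copy.deepcopy(layer) for layer in layers]
        return nn.ModuleList(list(layers) + grown)

    # usage: train 6 layers, deepen to 12, resume training, repeat as needed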

Learning light-weight translation models from deep transformer

B Li, Z Wang, H Liu, Q Du, T Xiao, C Zhang… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Recently, deep models have shown tremendous improvements in neural machine
translation (NMT). However, systems of this kind are computationally expensive and memory …
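
One way to obtain such light-weight models is knowledge distillation from the deep teacher. Below is a minimal sketch of a generic logit-level distillation loss; the paper's group-wise approach adds more structure, so this is only the basic ingredient, not the proposed method itself.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
        """KL divergence between temperature-softened teacher and
        student output distributions (standard Hinton-style KD)."""
        s = F.log_softmax(student_logits / T, dim=-1)
        t = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(s, t, reduction="batchmean") * (T * T)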

ODE transformer: An ordinary differential equation-inspired model for sequence generation

B Li, Q Du, T Zhou, Y Jing, S Zhou, X Zeng… - arXiv preprint arXiv …, 2022 - arxiv.org
Residual networks are an Euler discretization of solutions to Ordinary Differential Equations
(ODE). This paper explores a deeper relationship between Transformer and numerical ODE …
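
The Euler view can be stated in one line: a residual layer update is an explicit Euler step with unit step size, and higher-order solvers suggest refined updates. The second-order (Heun) step below illustrates the kind of update this connection motivates; it is the textbook scheme, not necessarily the paper's exact formulation.

    % residual update as an explicit Euler step (h = 1) for dx/dt = F(x)
    x_{l+1} = x_l + F(x_l)
      \quad\Longleftrightarrow\quad
    x(t+h) = x(t) + h\,F(x(t)), \quad h = 1

    % a second-order (Heun / RK2) step as a refined layer update
    x_{l+1} = x_l + \tfrac{1}{2}\bigl(F(x_l) + F\bigl(x_l + F(x_l)\bigr)\bigr)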

Deep Transformer modeling via grouping skip connection for neural machine translation

Y Li, J Li, M Zhang - Knowledge-Based Systems, 2021 - Elsevier
Most deep neural machine translation (NMT) models follow a bottom-up
feedforward fashion, in which representations in low layers construct or modulate high …
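
A grouped skip connection can be sketched as follows: partition the stacked layers into groups and add each group's input to its output, so higher groups receive a direct path to lower-level representations. The group size and placement here are illustrative assumptions, not the paper's exact configuration.

    import torch.nn as nn

    class GroupedEncoder(nn.Module):
        """Stacked layers with one skip connection spanning each group."""
        def __init__(self, layers: nn.ModuleList, group_size: int = 3):
            super().__init__()
            self.layers = layers
            self.group_size = group_size

        def forward(self, x):
            for start in range(0, len(self.layers), self.group_size):
                group_input = x
                for layer in self.layers[start:start + self.group_size]:
                    x = layer(x)
                x = x + group_input  # skip over the whole group
            return x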

Gtrans: Grouping and fusing transformer layers for neural machine translation

J Yang, Y Yin, L Yang, S Ma, H Huang… - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022 - ieeexplore.ieee.org
The Transformer architecture, built by stacking encoder and decoder layers, has achieved
significant progress in neural machine translation. However, vanilla …
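
The grouping-and-fusing idea can be illustrated by combining the outputs of layer groups with learned weights into a single representation. This is a deliberately simplified sketch; the class name and the scalar softmax weighting are assumptions, not the paper's exact fusion mechanism.

    import torch
    import torch.nn as nn

    class GroupFusion(nn.Module):
        """Fuse per-group encoder outputs with learned scalar weights."""
        def __init__(self, num_groups: int):
            super().__init__()
            self.weights = nn.Parameter(torch.zeros(num_groups))

        def forward(self, group_outputs):  # list of [batch, len, dim]
            w = torch.softmax(self.weights, dim=0)
            stacked = torch.stack(group_outputs, dim=0)  # [groups, b, l, d]
            return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)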

Towards enhancing faithfulness for neural machine translation

R Weng, H Yu, X Wei, W Luo - Proceedings of the 2020 …, 2020 - aclanthology.org
Neural machine translation (NMT) has achieved great success due to its ability to generate
high-quality sentences. Compared with human translations, one of the drawbacks of current …

PromptST: Abstract Prompt Learning for End-to-End Speech Translation

T Yu, L Ding, X Liu, K Chen, M Zhang… - Proceedings of the …, 2023 - aclanthology.org
An end-to-end speech-to-text (S2T) translation model is usually initialized from a pre-trained
speech recognition encoder and a pre-trained text-to-text (T2T) translation decoder …
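
Prompt learning in this setting can be sketched as a small set of learnable embeddings prepended to the speech encoder's states before the pre-trained text decoder attends to them. The placement, prompt length, and class name are illustrative assumptions, not the paper's exact design.

    import torch
    import torch.nn as nn

    class PromptPrefix(nn.Module):
        """Prepend learnable prompt vectors to encoder states."""
        def __init__(self, num_prompts: int, dim: int):
            super().__init__()
            self.prompts = nn.Parameter(0.02 * torch.randn(num_prompts, dim))

        def forward(self, encoder_states):  # [batch, len, dim]
            batch = encoder_states.size(0)
            prefix = self.prompts.unsqueeze(0).expand(batch, -1, -1)
            return torch.cat([prefix, encoder_states], dim=1)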

Deep transformers with latent depth

X Li, A Cooper Stickland, Y Tang… - Advances in Neural Information Processing Systems, 2020 - proceedings.neurips.cc
The Transformer model has achieved state-of-the-art performance in many sequence
modeling tasks. However, how to leverage model capacity with large or variable depths is …
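
Latent depth can be sketched as a learned soft gate per layer: each layer's contribution is weighted by a probability derived from a trainable logit, so the effective depth is learned rather than fixed. The sigmoid relaxation below is a simplification; the paper trains layer-selection distributions with more care than this.

    import torch
    import torch.nn as nn

    class LatentDepthStack(nn.Module):
        """Execute each layer with a learned soft selection probability."""
        def __init__(self, layers: nn.ModuleList):
            super().__init__()
            self.layers = layers
            self.logits = nn.Parameter(torch.zeros(len(layers)))

        def forward(self, x):
            probs = torch.sigmoid(self.logits)
            for layer, p in zip(self.layers, probs):
                x = (1.0 - p) * x + p * layer(x)  # soft skip-or-execute
            return x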