Linguistically driven multi-task pre-training for low-resource neural machine translation

Z Mao, C Chu, S Kurohashi - Transactions on Asian and Low-Resource …, 2022 - dl.acm.org
In the present study, we propose novel sequence-to-sequence pre-training objectives for
low-resource neural machine translation (NMT): Japanese-specific sequence-to-sequence (JASS) …

EMS: Efficient and Effective Massively Multilingual Sentence Embedding Learning

Z Mao, C Chu, S Kurohashi - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
Massively multilingual sentence representation models, e.g., LASER, SBERT-distill, and
LaBSE, help significantly improve cross-lingual downstream tasks. However, the use of a …

Language rehabilitation of people with BROCA aphasia using deep neural machine translation

K Smaïli, D Langlois, P Pribil - Fifth International Conference on …, 2022 - hal.science
More than 13 million people suffer a stroke each year. Aphasia is known as a language
disorder usually caused by a stroke that damages a specific area of the brain that controls …

Softmax tempering for training neural machine translation models

R Dabre, A Fujita - arXiv preprint arXiv:2009.09372, 2020 - arxiv.org
Neural machine translation (NMT) models are typically trained using a softmax
cross-entropy loss where the softmax distribution is compared against smoothed gold labels. In …
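The snippet above describes a softmax cross-entropy loss against smoothed gold labels, with tempering applied to the softmax. A minimal sketch of that combination is below; the function name, the temperature/smoothing values, and the exact placement of the temperature division are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def tempered_smoothed_ce(logits, gold, temperature=1.0, smoothing=0.1):
    """Cross-entropy between a temperature-scaled softmax and smoothed gold labels.

    `temperature` and `smoothing` defaults are illustrative, not from the paper.
    """
    z = logits / temperature                # tempering: scale logits before softmax
    z = z - z.max()                         # subtract max for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    vocab = logits.shape[0]
    target = np.full(vocab, smoothing / (vocab - 1))  # spread mass over non-gold tokens
    target[gold] = 1.0 - smoothing                    # keep most mass on the gold token
    return float(-(target * log_probs).sum())
```

A higher temperature flattens the predicted distribution, so the loss on a confidently correct prediction rises, which is the knob the tempering idea exposes during training.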

Investigating softmax tempering for training neural machine translation models

R Dabre, A Fujita - … of Machine Translation Summit XVIII: Research …, 2021 - aclanthology.org
Neural machine translation (NMT) models are typically trained using a softmax
cross-entropy loss where the softmax distribution is compared against the gold labels. In low …

[HTML][HTML] Data augmentation for Chinese-Japanese neural machine translation based on simplified-traditional Chinese character conversion

张津一, 高忠辉, 郭聪 - Artificial Intelligence and Robotics Research, 2023 - hanspub.org
This paper proposes a data augmentation method for neural machine translation (NMT)
based on simplified-traditional Chinese character conversion, which replaces source-side characters with target-side characters via a conversion table, thereby incorporating simplified-traditional …
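The conversion-table substitution described in the snippet above can be sketched as a simple character-level mapping pass. The three-entry table here is a toy placeholder; a real system would use a full simplified-to-traditional mapping (e.g., of the kind OpenCC provides), and the function name is an assumption.

```python
# Toy conversion table mapping simplified source-side characters to
# traditional/target-side forms; NOT the table used in the paper.
CONVERSION_TABLE = {"学": "學", "国": "國", "语": "語"}

def convert_source(sentence, table=CONVERSION_TABLE):
    """Replace mapped characters; unmapped characters pass through unchanged."""
    return "".join(table.get(ch, ch) for ch in sentence)
```

Applying this to the source side of a parallel corpus yields additional synthetic training pairs that share target-side character forms.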

Combining sequence distillation and transfer learning for efficient low-resource neural machine translation models

R Dabre, A Fujita - Proceedings of the Fifth Conference on …, 2020 - aclanthology.org
In neural machine translation (NMT), sequence distillation (SD) through creation of distilled
corpora leads to efficient (compact and fast) models. However, its effectiveness in extremely …
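Sequence distillation as described above builds a distilled corpus from a teacher model's outputs. A minimal sketch of that corpus-creation step, where `teacher_translate` stands in for a trained teacher NMT model (an assumption, not the paper's API):

```python
def build_distilled_corpus(sources, teacher_translate):
    """Pair each source sentence with the teacher's translation instead of the
    human reference; a compact student model is then trained on these pairs."""
    return [(src, teacher_translate(src)) for src in sources]
```

The student trains on simpler, more deterministic teacher outputs, which is what makes the resulting models compact and fast.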

[PDF][PDF] Breaking Language Barriers: Enhancing Multilingual Representation for Sentence Alignment and Translation

Z Mao - 2024 - repository.kulib.kyoto-u.ac.jp
In a diverse linguistic landscape where over 7,100 languages are spoken, vast swathes of
digital content remain isolated within language silos, creating significant barriers to global …

[PDF][PDF] Self-supervised dynamic programming encoding for neural machine translation

H Song, R Dabre, C Chu, S Kurohashi, E Sumita - 2022 - anlp.jp
Neural machine translation (NMT) [1] is known to give state-of-the-art translations for a
variety of language pairs. Sub-word segmentation [2, 3] is one of the key reasons behind it …
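The sub-word segmentation mentioned above can be illustrated with a greedy longest-match segmenter in the WordPiece style; real systems such as BPE or SentencePiece learn their units from data, whereas here `vocab` is a hand-given set and unknown characters fall back to single-character pieces (all assumptions for illustration).

```python
def subword_segment(word, vocab):
    """Greedy longest-match sub-word segmentation (WordPiece-style sketch)."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            piece = word[i:j]
            if piece in vocab or j == i + 1:  # single chars always succeed
                pieces.append(piece)
                i = j
                break
    return pieces
```

Segmenting rare words into frequent sub-word units is what lets an NMT model cover an open vocabulary with a fixed-size embedding table.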