Authors
Jesse Michael Han, Igor Babuschkin, Harrison Edwards, Arvind Neelakantan, Tao Xu, Stanislas Polu, Alex Ray, Pranav Shyam, Aditya Ramesh, Alec Radford, Ilya Sutskever
Publication date
2021/10/11
Journal
arXiv preprint arXiv:2110.05448
Description
We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models. Our method consists of three steps: few-shot amplification, distillation, and backtranslation. We first use the zero-shot translation ability of large pre-trained language models to generate translations for a small set of unlabeled sentences. We then amplify these zero-shot translations by using them as few-shot demonstrations for sampling a larger synthetic dataset. This dataset is distilled by discarding the few-shot demonstrations and then fine-tuning. During backtranslation, we repeatedly generate translations for a set of inputs and then fine-tune a single language model on both directions of the translation task at once, ensuring cycle-consistency by swapping the roles of gold monotext and generated translations when fine-tuning. By using our method to leverage GPT-3's zero-shot translation capability, we achieve a new state-of-the-art in unsupervised translation on the WMT14 English-French benchmark, attaining a BLEU score of 42.1.
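The cycle-consistent backtranslation step is the core of the method: generated translations are always used as the *source* side and gold monolingual text as the *target* side, so the model only ever learns to produce real text. Below is a minimal structural sketch of one such round in Python. The helpers `sample_translations` and `finetune` are hypothetical placeholders, not the authors' implementation (the paper's actual training stack fine-tunes GPT-3).

```python
from typing import List, Tuple

# Hypothetical stand-ins for "translate with the current model" and
# "fine-tune on (source, target) pairs"; placeholders only.
def sample_translations(model, sentences: List[str], direction: str) -> List[str]:
    return [f"<{direction}> {s}" for s in sentences]  # placeholder

def finetune(model, pairs: List[Tuple[str, str]]) -> None:
    pass  # placeholder: gradient updates on the translation pairs

def backtranslation_round(model, en_monotext: List[str], fr_monotext: List[str]):
    """One round of the bidirectional backtranslation described above.

    Cycle-consistency comes from the role swap: synthetic translations
    serve as sources and the gold monotext as targets, and a single
    model is fine-tuned on both directions at once.
    """
    # Translate gold monotext in both directions with the current model.
    fr_synth = sample_translations(model, en_monotext, "en->fr")
    en_synth = sample_translations(model, fr_monotext, "fr->en")

    # Swap roles: synthetic text is the source, gold text the target.
    pairs = list(zip(fr_synth, en_monotext)) + list(zip(en_synth, fr_monotext))
    finetune(model, pairs)
    return model
```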
Total citations
[Citations-per-year chart, 2020–2024]
Scholar articles
JM Han, I Babuschkin, H Edwards, A Neelakantan… - arXiv preprint arXiv:2110.05448, 2021