A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

Accelerating transformer inference for translation via parallel decoding

A Santilli, S Severino, E Postolache, V Maiorca… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT).
The community proposed specific network architectures and learning-based methods to …

A survey of non-autoregressive neural machine translation

F Li, J Chen, X Zhang - Electronics, 2023 - mdpi.com
Non-autoregressive neural machine translation (NAMT) has received increasing attention
recently by virtue of its promising acceleration paradigm for fast decoding. However, these …

Dinoiser: Diffused conditional sequence learning by manipulating noises

J Ye, Z Zheng, Y Bao, L Qian, M Wang - arXiv preprint arXiv:2302.10025, 2023 - arxiv.org
While diffusion models have achieved great success in generating continuous signals such
as images and audio, it remains elusive for diffusion models in learning discrete sequence …

Non-monotonic latent alignments for CTC-based non-autoregressive machine translation

C Shao, Y Feng - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Non-autoregressive translation (NAT) models are typically trained with the cross-entropy
loss, which forces the model outputs to be aligned verbatim with the target sentence and will …

CTC alignments improve autoregressive translation

B Yan, S Dalmia, Y Higuchi, G Neubig, F Metze… - arXiv preprint arXiv …, 2022 - arxiv.org
Connectionist Temporal Classification (CTC) is a widely used approach for automatic
speech recognition (ASR) that performs conditionally independent monotonic alignment …

Teacher forcing recovers reward functions for text generation

Y Hao, Y Liu, L Mou - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Reinforcement learning (RL) has been widely used in text generation to alleviate the
exposure bias issue or to utilize non-parallel datasets. The reward function plays an …

On the learning of non-autoregressive transformers

F Huang, T Tao, H Zhou, L Li… - … Conference on Machine …, 2022 - proceedings.mlr.press
Non-autoregressive Transformer (NAT) is a family of text generation models that aims to
reduce decoding latency by predicting whole sentences in parallel. However, such …

On the opportunities of green computing: A survey

Y Zhou, X Lin, X Zhang, M Wang, G Jiang, H Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) has achieved significant advancements in technology and research
over several decades of development, and is widely used in many areas including …

One reference is not enough: Diverse distillation with reference selection for non-autoregressive translation

C Shao, X Wu, Y Feng - arXiv preprint arXiv:2205.14333, 2022 - arxiv.org
Non-autoregressive neural machine translation (NAT) suffers from the multi-modality
problem: the source sentence may have multiple correct translations, but the loss function is …