Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT). The community has proposed specialized network architectures and learning-based methods to …
F Li, J Chen, X Zhang - Electronics, 2023 - mdpi.com
Non-autoregressive neural machine translation (NAMT) has received increasing attention recently by virtue of its promising acceleration paradigm for fast decoding. However, these …
J Ye, Z Zheng, Y Bao, L Qian, M Wang - arXiv preprint arXiv:2302.10025, 2023 - arxiv.org
While diffusion models have achieved great success in generating continuous signals such as images and audio, it remains elusive for diffusion models to learn discrete sequence …
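One common way to adapt diffusion to discrete sequences is an absorbing-state (masking) forward process. The sketch below is purely illustrative and not taken from the paper above; the `[MASK]` symbol and function names are assumptions:

```python
import random

MASK = "[MASK]"  # assumed absorbing state; the symbol name is illustrative

def forward_noise(tokens, t, T, rng):
    """Absorbing-state forward process: independently replace each
    token with MASK with probability t / T (t = current timestep).
    At t=0 the sequence is untouched; at t=T it is fully masked."""
    return [MASK if rng.random() < t / T else tok for tok in tokens]

rng = random.Random(0)
x0 = "the cat sat on the mat".split()
# A partially corrupted sample at the halfway timestep:
print(forward_noise(x0, t=3, T=6, rng=rng))
```

A learned reverse model would then be trained to predict the original tokens at the masked positions, step by step from t=T back to t=0.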
C Shao, Y Feng - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Non-autoregressive translation (NAT) models are typically trained with the cross-entropy loss, which forces the model outputs to be aligned verbatim with the target sentence and will …
Connectionist Temporal Classification (CTC) is a widely used approach for automatic speech recognition (ASR) that performs conditionally independent monotonic alignment …
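CTC's many-to-one mapping from frame-level alignments to output sequences can be illustrated with its collapsing rule (merge consecutive repeats, then drop blanks). A minimal sketch, assuming `-` as the blank symbol:

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol

def ctc_collapse(alignment):
    """Map a frame-level alignment to an output sequence:
    first merge consecutive repeated tokens, then remove blanks."""
    merged = [tok for tok, _ in groupby(alignment)]
    return [tok for tok in merged if tok != BLANK]

# Several monotonic alignments collapse to the same output sequence.
print(ctc_collapse(list("cc-aa-t")))  # ['c', 'a', 't']
print(ctc_collapse(list("c-at-")))    # ['c', 'a', 't']
```

The CTC loss marginalizes over all alignments that collapse to the target, which is why it tolerates length mismatch between input frames and output tokens.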
Y Hao, Y Liu, L Mou - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Reinforcement learning (RL) has been widely used in text generation to alleviate the exposure bias issue or to utilize non-parallel datasets. The reward function plays an …
Non-autoregressive Transformer (NAT) is a family of text generation models that aim to reduce decoding latency by predicting whole sentences in parallel. However, such …
Y Zhou, X Lin, X Zhang, M Wang, G Jiang, H Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence (AI) has achieved significant advances in technology and research over several decades of development, and is widely used in many areas including …
C Shao, X Wu, Y Feng - arXiv preprint arXiv:2205.14333, 2022 - arxiv.org
Non-autoregressive neural machine translation (NAT) suffers from the multi-modality problem: the source sentence may have multiple correct translations, but the loss function is …
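The multi-modality issue can be made concrete with a toy sketch (illustrative only, not the paper's method): a position-wise, verbatim comparison — a stand-in for token-level cross-entropy — scores a perfectly valid reordering of the reference as almost entirely wrong:

```python
def positionwise_mismatch(hyp, ref):
    """Fraction of positions where the hypothesis token differs from the
    reference token -- a proxy for the verbatim per-position alignment
    that token-level cross-entropy enforces."""
    return sum(h != r for h, r in zip(hyp, ref)) / len(ref)

ref  = "thank you very much".split()
hyp1 = "thank you very much".split()  # the reference itself
hyp2 = "very much thank you".split()  # an equally valid reordering

print(positionwise_mismatch(hyp1, ref))  # 0.0
print(positionwise_mismatch(hyp2, ref))  # 1.0 -> heavily penalized
```

Because the source sentence admits several correct targets, a loss that demands one fixed token per position pushes the model toward averaging over them, which is the failure mode these NAT papers try to address.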