Training machine learning models in a meaningful order, from the easy samples to the hard ones, using curriculum learning can provide performance improvements over the standard …
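The snippet describes the core idea of curriculum learning: present easy samples before hard ones. A minimal Python sketch, assuming a hypothetical length-based difficulty proxy and a staged "competence" schedule (neither taken from the cited work):

```python
# Minimal curriculum-learning sketch (hypothetical scoring and schedule,
# not from the cited paper). Samples are sorted easy-to-hard by a
# difficulty proxy, and the training pool grows each stage.
import random

def difficulty(sample):
    # Assumed proxy: longer sentences are harder. Real curricula may
    # use model loss, word rarity, or other scores instead.
    return len(sample.split())

def curriculum_batches(samples, stages=4, batch_size=2):
    ordered = sorted(samples, key=difficulty)  # easy -> hard
    for stage in range(1, stages + 1):
        # Expose a growing prefix of the sorted data ("competence" schedule).
        pool = ordered[: max(batch_size, len(ordered) * stage // stages)]
        random.shuffle(pool)  # still shuffle within the current pool
        for i in range(0, len(pool), batch_size):
            yield stage, pool[i : i + batch_size]

samples = ["a b", "a b c d e", "a", "a b c", "a b c d e f g", "a b c d"]
for stage, batch in curriculum_batches(samples):
    print(stage, batch)
```

The only change from standard training is the sampling order; the model and loss are untouched, which is why curriculum methods add no extra computational cost.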
Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both machine learning and …
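To make the speed-up intuition concrete, here is a toy contrast between autoregressive decoding (one sequential model call per token) and non-autoregressive decoding (one call for all positions). `predict_next` and `predict_all` are stand-in functions, not any surveyed model:

```python
# Toy contrast between autoregressive and non-autoregressive decoding.
# The point is the number of *sequential* model calls, which dominates
# inference latency.

def predict_next(prefix):
    return f"tok{len(prefix)}"                 # dummy next-token prediction

def predict_all(length):
    return [f"tok{i}" for i in range(length)]  # all positions at once

def decode_autoregressive(length):
    prefix = []
    for _ in range(length):                    # `length` sequential calls
        prefix.append(predict_next(prefix))
    return prefix

def decode_non_autoregressive(length):
    return predict_all(length)                 # a single parallel call

print(decode_autoregressive(5))
print(decode_non_autoregressive(5))
```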
Knowledge distillation (KD) is the preliminary step for training non-autoregressive translation (NAT) models, which eases the training of NAT models at the cost of losing …
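The KD step referred to here is usually sequence-level distillation: an autoregressive teacher re-translates the source side, and the NAT model trains on the teacher's outputs instead of the original references. A sketch under that common setup, with `teacher_translate` as a placeholder:

```python
# Sketch of sequence-level knowledge distillation as commonly used for
# NAT training. `teacher_translate` stands in for beam-search decoding
# with a trained autoregressive teacher.

def teacher_translate(src):
    return src.upper()  # placeholder for the teacher's translation

def distill_corpus(parallel_corpus):
    distilled = []
    for src, _ref in parallel_corpus:  # the original reference is discarded
        distilled.append((src, teacher_translate(src)))
    return distilled

corpus = [("ein haus", "a house"), ("ein hund", "a dog")]
print(distill_corpus(corpus))
```

The teacher's outputs are simpler and more deterministic than human references, which eases NAT training but, as the snippet notes, comes at a cost.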
Recent work on non-autoregressive neural machine translation (NAT) aims to improve efficiency through parallel decoding without sacrificing quality. However, existing NAT …
F Li, J Chen, X Zhang - Electronics, 2023 - mdpi.com
Non-autoregressive neural machine translation (NAMT) has received increasing attention recently by virtue of its promising acceleration paradigm for fast decoding. However, these …
Knowledge distillation (KD) is commonly used to construct synthetic data for training non- autoregressive translation (NAT) models. However, there exists a discrepancy on low …
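One way to see the discrepancy the snippet alludes to is to compare token frequencies in the raw references against the distilled targets; distilled text tends to drop rare words. A toy sketch with illustrative corpora and an assumed rarity cutoff of count == 1:

```python
# Sketch of measuring the low-frequency-word discrepancy between raw
# references and KD-distilled targets. Corpora and cutoff are toys.
from collections import Counter

def token_freqs(sentences):
    return Counter(tok for s in sentences for tok in s.split())

raw = ["the cat sat", "an obstreperous cat ran", "the dog sat"]
distilled = ["the cat sat", "the cat ran", "the dog sat"]

raw_f, kd_f = token_freqs(raw), token_freqs(distilled)
rare = {t for t, c in raw_f.items() if c == 1}  # assumed cutoff: count == 1
kept = {t for t in rare if kd_f[t] > 0}
print(f"rare types kept after distillation: {len(kept)}/{len(rare)}")
```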
Y Leng, X Tan, L Zhu, J Xu, R Luo… - Advances in …, 2021 - proceedings.neurips.cc
Error correction techniques have been used to refine the output sentences from automatic speech recognition (ASR) models and achieve a lower word error rate (WER) than original …
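WER is word-level edit distance normalized by reference length; a self-contained sketch to make the "lower WER than the original" claim concrete (the hypotheses below are toy strings, not model outputs):

```python
# Minimal WER computation via Levenshtein distance over words.

def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edits to turn r[:i] into h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / max(len(r), 1)

ref = "the cat sat on the mat"
asr_out = "the cat sad on mat"        # raw ASR hypothesis (toy)
corrected = "the cat sat on the mat"  # after an error-correction pass (toy)
print(wer(ref, asr_out), wer(ref, corrected))
```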
J Guo, Z Zhang, L Xu, HR Wei… - Advances in Neural …, 2020 - proceedings.neurips.cc
While large-scale pre-trained language models such as BERT have achieved great success on various natural language understanding tasks, how to efficiently and effectively …
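The snippet cuts off before naming the method, but a common recipe for reusing a large pre-trained encoder efficiently is to freeze its weights and train only small bottleneck adapters. A PyTorch sketch of that general idea (dimensions and placement are assumptions, not this paper's exact design):

```python
# Sketch of the adapter idea: keep a large pre-trained layer frozen and
# train only a small residual bottleneck module on top of it.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual bottleneck

frozen_layer = nn.Linear(768, 768)  # stand-in for a pre-trained BERT layer
for p in frozen_layer.parameters():
    p.requires_grad = False         # pre-trained weights stay fixed

adapter = Adapter()                 # only these weights receive gradients
x = torch.randn(2, 10, 768)
print(adapter(frozen_layer(x)).shape)
```

Because only the adapter parameters are trained, the pre-trained model can be shared across tasks at a small per-task storage cost.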
J Guo, L Xu, E Chen - Proceedings of the 58th Annual Meeting of …, 2020 - aclanthology.org
The masked language model has received remarkable attention due to its effectiveness on various natural language processing tasks. However, few works have adopted this …
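The masked-language-model signal the snippet refers to replaces a fraction of input tokens with a [MASK] symbol and trains the model to recover them. A data-side sketch, assuming the common 15% mask rate and a toy whitespace tokenizer:

```python
# Sketch of masked-language-model input construction: some tokens are
# replaced by [MASK] and recorded as prediction targets.
import random

def mask_tokens(sentence, mask_rate=0.15, mask_token="[MASK]"):
    tokens = sentence.split()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if random.random() < mask_rate:
            targets[i] = tok          # positions the model must predict
            masked.append(mask_token)
        else:
            masked.append(tok)
    return masked, targets

random.seed(0)
print(mask_tokens("the quick brown fox jumps over the lazy dog"))
```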