Progress in machine translation

H Wang, H Wu, Z He, L Huang, KW Church - Engineering, 2022 - Elsevier
After more than 70 years of evolution, great achievements have been made in machine
translation. Especially in recent years, translation quality has been greatly improved with the …

Curriculum learning: A survey

P Soviany, RT Ionescu, P Rota, N Sebe - International Journal of …, 2022 - Springer
Training machine learning models in a meaningful order, from the easy samples to the hard
ones, using curriculum learning can provide performance improvements over the standard …
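The easy-to-hard ordering the snippet describes can be sketched in a few lines; the length-based difficulty proxy and the linear pacing function below are illustrative assumptions, not the survey's specific recipe.

```python
# A minimal curriculum sketch: score samples with a difficulty proxy,
# sort easy -> hard, and grow the exposed pool with a linear pacing
# function. Length-as-difficulty and linear pacing are assumptions.

def difficulty(sentence: str) -> int:
    # Real curricula may use model loss, word rarity, or noise level.
    return len(sentence.split())

def curriculum_batches(samples, epochs, batch_size=2):
    ordered = sorted(samples, key=difficulty)  # easy first
    for epoch in range(1, epochs + 1):
        # Linear pacing: expose a growing prefix of the sorted data.
        exposed = max(batch_size, len(ordered) * epoch // epochs)
        pool = ordered[:exposed]
        for i in range(0, len(pool), batch_size):
            yield epoch, pool[i:i + batch_size]

corpus = ["a b", "a b c d e", "a", "a b c", "a b c d e f g h"]
for epoch, batch in curriculum_batches(corpus, epochs=3):
    print(epoch, batch)
```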

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …
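The inference speedup comes from replacing the token-by-token decoding loop with a single parallel prediction over all target positions. A minimal sketch, assuming a toy position-wise model (`LOGITS` is a stand-in, not a real NMT decoder):

```python
import numpy as np

rng = np.random.default_rng(0)
LEN, VOCAB = 5, 8
# Toy stand-in for a decoder: fixed per-position logits. A real NAT model
# computes these in one forward pass conditioned on the source sentence.
LOGITS = rng.normal(size=(LEN, VOCAB))

def autoregressive_decode():
    tokens = []
    for t in range(LEN):              # LEN sequential steps
        tokens.append(int(LOGITS[t].argmax()))
    return tokens

def non_autoregressive_decode():
    # One parallel argmax over all positions: the source of the speedup.
    return LOGITS.argmax(axis=-1).tolist()

assert autoregressive_decode() == non_autoregressive_decode()
print(non_autoregressive_decode())
```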

Redistributing low-frequency words: Making the most of monolingual data in non-autoregressive translation

L Ding, L Wang, S Shi, D Tao, Z Tu - … of the 60th Annual Meeting of …, 2022 - aclanthology.org
Knowledge distillation (KD) is the preliminary step for training non-autoregressive
translation (NAT) models, which eases the training of NAT models at the cost of losing …
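The KD step the abstract refers to is usually sequence-level distillation: the target side of the parallel data is replaced by an autoregressive teacher's output before the NAT model is trained. A minimal sketch, with `teacher_translate` as a hypothetical stub:

```python
# `teacher_translate` is a hypothetical stand-in for a trained
# autoregressive teacher (e.g., its beam-search output).
def teacher_translate(src: str) -> str:
    return src.upper()  # stub only

def distill(parallel_corpus):
    # Replace each reference with the teacher's translation. This
    # simplifies the target distribution for the NAT student, at the
    # cost the snippet mentions: properties of the raw references
    # (e.g., low-frequency words) can be lost.
    return [(src, teacher_translate(src)) for src, _ref in parallel_corpus]

corpus = [("guten tag", "good day"), ("danke", "thanks")]
print(distill(corpus))  # the NAT model would train on these pairs
```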

Glancing transformer for non-autoregressive neural machine translation

L Qian, H Zhou, Y Bao, M Wang, L Qiu… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent work on non-autoregressive neural machine translation (NAT) aims at improving
efficiency through parallel decoding without sacrificing quality. However, existing NAT …
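Glancing training, as the abstract outlines it, reveals a fraction of gold target tokens proportional to how badly the first parallel pass did, then re-predicts the rest. A rough sketch (the 0.5 ratio and uniform position sampling are assumptions):

```python
import random

def glancing_targets(prediction, reference, ratio=0.5, seed=0):
    random.seed(seed)
    n_wrong = sum(p != r for p, r in zip(prediction, reference))
    n_glance = int(ratio * n_wrong)  # reveal more when more is wrong
    glanced = set(random.sample(range(len(reference)), n_glance))
    # Glanced positions are fed the gold token; the rest stay masked
    # and are re-predicted in a second parallel pass.
    return [r if i in glanced else "[MASK]" for i, r in enumerate(reference)]

pred = ["the", "cat", "cat", "on", "mat"]
gold = ["the", "cat", "sat", "on", "the"]
print(glancing_targets(pred, gold))
```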

A survey of non-autoregressive neural machine translation

F Li, J Chen, X Zhang - Electronics, 2023 - mdpi.com
Non-autoregressive neural machine translation (NAMT) has received increasing attention
recently by virtue of its promising acceleration paradigm for fast decoding. However, these …

Rejuvenating low-frequency words: Making the most of parallel data in non-autoregressive translation

L Ding, L Wang, X Liu, DF Wong, D Tao… - arXiv preprint arXiv …, 2021 - arxiv.org
Knowledge distillation (KD) is commonly used to construct synthetic data for training non-
autoregressive translation (NAT) models. However, there exists a discrepancy on low …
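The low-frequency discrepancy can be made concrete: words that are rare in the raw targets often disappear from the distilled targets entirely. A small diagnostic sketch (not the paper's rejuvenation method):

```python
from collections import Counter

def lost_low_freq_words(raw_targets, distilled_targets, max_count=1):
    # Words appearing at most `max_count` times in the raw references.
    raw_counts = Counter(w for s in raw_targets for w in s.split())
    low_freq = {w for w, c in raw_counts.items() if c <= max_count}
    # Which of them survive sequence-level KD?
    kept = {w for s in distilled_targets for w in s.split()}
    return sorted(low_freq - kept)

raw = ["the quick brown fox", "the lazy dog"]
distilled = ["the fast brown fox", "the lazy dog"]
print(lost_low_freq_words(raw, distilled))  # -> ['quick']
```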

FastCorrect: Fast error correction with edit alignment for automatic speech recognition

Y Leng, X Tan, L Zhu, J Xu, R Luo… - Advances in …, 2021 - proceedings.neurips.cc
Error correction techniques have been used to refine the output sentences from automatic
speech recognition (ASR) models and achieve a lower word error rate (WER) than original …
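The edit alignment in the title is, at its core, a Levenshtein alignment between the ASR hypothesis and its reference, from which per-token edit operations are derived. A hedged sketch of that preprocessing step (the full model additionally predicts edit counts, omitted here):

```python
def edit_alignment(hyp, ref):
    # Standard edit-distance DP table over token lists.
    m, n = len(hyp), len(ref)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # delete hyp token
                           dp[i][j - 1] + 1,        # insert ref token
                           dp[i - 1][j - 1] + cost) # keep / substitute
    # Backtrace to recover one optimal sequence of edit operations.
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (hyp[i - 1] != ref[j - 1]):
            ops.append(("keep" if hyp[i - 1] == ref[j - 1] else "sub",
                        hyp[i - 1], ref[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("del", hyp[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, ref[j - 1]))
            j -= 1
    return ops[::-1]

print(edit_alignment("a big cat".split(), "a cat sat".split()))
```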

Incorporating BERT into parallel sequence decoding with adapters

J Guo, Z Zhang, L Xu, HR Wei… - Advances in Neural …, 2020 - proceedings.neurips.cc
While large scale pre-trained language models such as BERT have achieved great success
on various natural language understanding tasks, how to efficiently and effectively …
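The adapter approach in the snippet trains only small bottleneck modules while the pre-trained model stays frozen. A minimal PyTorch sketch (the `Adapter` class and its dimensions are illustrative, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    # Bottleneck: project down, nonlinearity, project back up.
    def __init__(self, d_model=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, hidden):
        # Residual connection keeps the frozen layer's output intact
        # when the adapter is near-zero initialized.
        return hidden + self.up(torch.relu(self.down(hidden)))

frozen_layer = nn.Linear(768, 768).requires_grad_(False)  # stand-in for BERT
adapter = Adapter()  # only these parameters would be trained
x = torch.randn(2, 10, 768)
y = adapter(frozen_layer(x))
print(y.shape, sum(p.numel() for p in adapter.parameters()))
```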

Jointly masked sequence-to-sequence model for non-autoregressive neural machine translation

J Guo, L Xu, E Chen - Proceedings of the 58th Annual Meeting of …, 2020 - aclanthology.org
The masked language model has received remarkable attention due to its effectiveness on
various natural language processing tasks. However, few works have adopted this …
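A masked sequence-to-sequence objective trains the decoder to recover randomly masked target tokens in parallel. A toy sketch of the data preparation (uniform masking is an assumption; the paper's joint source/target masking scheme is more involved):

```python
import random

def mask_target(target_tokens, mask_ratio=0.4, seed=0):
    random.seed(seed)
    n_mask = max(1, int(mask_ratio * len(target_tokens)))
    masked_idx = set(random.sample(range(len(target_tokens)), n_mask))
    inputs = [t if i not in masked_idx else "[MASK]"
              for i, t in enumerate(target_tokens)]
    labels = [t if i in masked_idx else "-"  # loss only on masked slots
              for i, t in enumerate(target_tokens)]
    return inputs, labels

print(mask_target("the cat sat on the mat".split()))
```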