A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …

Neural machine translation: A review of methods, resources, and tools

Z Tan, S Wang, Z Yang, G Chen, X Huang, M Sun… - AI Open, 2020 - Elsevier
Machine translation (MT) is an important sub-field of natural language processing
that aims to translate natural languages using computers. In recent years, end-to-end neural …

Token-level self-evolution training for sequence-to-sequence learning

K Peng, L Ding, Q Zhong, Y Ouyang… - Proceedings of the …, 2023 - aclanthology.org
Adaptive training approaches, widely used in sequence-to-sequence models, commonly
reweigh the losses of different target tokens based on priors, e.g., word frequency. However …

Wait-info policy: Balancing source and target at information level for simultaneous machine translation

S Zhang, S Guo, Y Feng - arXiv preprint arXiv:2210.11220, 2022 - arxiv.org
Simultaneous machine translation (SiMT) outputs the translation while receiving the source
inputs, and hence needs to balance the received source information and translated target …

CLIO: Role-interactive multi-event head attention network for document-level event extraction

Y Ren, Y Cao, F Fang, P Guo, Z Lin… - Proceedings of the 29th …, 2022 - aclanthology.org
Transforming the large amounts of unstructured text on the Internet into structured event
knowledge is a critical, yet unsolved goal of NLP, especially when addressing document …

Improving neural machine translation with latent features feedback

Y Li, J Li, M Zhang - Neurocomputing, 2021 - Elsevier
Most state-of-the-art neural machine translation (NMT) models progressively encode feature
representation in a bottom-up feed-forward fashion. This traditional encoding mechanism …

The great misalignment problem in human evaluation of NLP methods

M Hämäläinen, K Alnajjar - arXiv preprint arXiv:2104.05361, 2021 - arxiv.org
We outline the Great Misalignment Problem in natural language processing research:
simply put, the problem definition is not in line with the method proposed and the …

Grammatically derived factual relation augmented neural machine translation

F Li, J Zhu, H Yan, Z Zhang - Applied Sciences, 2022 - mdpi.com
Featured Application: This paper introduces factual relation information into Transformer-
based neural machine translation to improve translation quality. Abstract: Transformer-based …

Machine Translation of Electrical Terminology Constraints

Z Wang, Y Chen, J Zhang - Information, 2023 - mdpi.com
In practical applications, the accuracy of domain terminology translation is an important
criterion for the performance evaluation of domain machine translation models. Aiming at the …

Revisiting knowledge distillation for autoregressive language models

Q Zhong, L Ding, L Shen, J Liu, B Du, D Tao - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge distillation (KD) is a common approach to compress a teacher model to reduce
its inference cost and memory footprint, by training a smaller student model. However, in the …