Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning

P Chen, J Wang, H Lin, D Zhao, Z Yang - Bioinformatics, 2023 - academic.oup.com
Motivation: Few-shot learning that can effectively perform named entity recognition in low-resource
scenarios has raised growing attention, but it has not been widely studied yet in the …

Multimodal sentiment recognition with multi-task learning

S Zhang, C Yin, Z Yin - IEEE Transactions on Emerging Topics …, 2022 - ieeexplore.ieee.org
Sentiment recognition in social networks aims at recognizing the underlying affective states of
user-generated content. The research center of sentiment recognition is moving from pure …

A syntax-guided multi-task learning approach for Turducken-style code generation

G Yang, Y Zhou, X Chen, X Zhang, Y Xu, T Han… - Empirical Software …, 2023 - Springer
Due to the development of pre-trained language models, automated code generation
techniques have shown great promise in recent years. However, the generated code will not …

Machine translation of 16th century letters from Latin to German

L Fischer, P Scheurer, R Schwitter… - Proceedings of the …, 2022 - aclanthology.org
This paper outlines our work in collecting training data for and developing a Latin–German
Neural Machine Translation (NMT) system, for translating 16th century letters. While Latin …

An Efficient Method for Generating Synthetic Data for Low-Resource Machine Translation: An empirical study of Chinese, Japanese to Vietnamese Neural Machine …

TV Ngo, PT Nguyen, VV Nguyen, TL Ha… - Applied Artificial …, 2022 - Taylor & Francis
Data sparsity is one of the challenges for low-resource language pairs in Neural Machine
Translation (NMT). Previous works have presented different approaches for data …

Data augmentation for machine translation via dependency subtree swapping

A Nagy, DP Lakatos, B Barta, P Nanys, J Ács - arXiv preprint arXiv …, 2023 - arxiv.org
We present a generic framework for data augmentation via dependency subtree swapping
that is applicable to machine translation. We extract corresponding subtrees from the …

Grammar-based Data Augmentation for Low-Resource Languages: The Case of Guarani-Spanish Neural Machine Translation

A Lucas, A Baladón, V Pardiñas… - Proceedings of the …, 2024 - aclanthology.org
One of the main problems low-resource languages face in NLP can be pictured as a vicious
circle: data is needed to build and test tools, but the available text is scarce and there are not …

TreeSwap: Data Augmentation for Machine Translation via Dependency Subtree Swapping

A Nagy, D Lakatos, B Barta, J Ács - arXiv preprint arXiv:2311.02355, 2023 - arxiv.org
Data augmentation methods for neural machine translation are particularly useful when only a
limited amount of training data is available, which is often the case when dealing with low …

Revitalizing Bahnaric Language through Neural Machine Translation: Challenges, Strategies, and Promising Outcomes

HNK Vo, DD Le, TMD Phan, TS Nguyen… - Proceedings of the …, 2024 - ojs.aaai.org
The Bahnar, a minority ethnic group in Vietnam with ancient roots, hold a language of deep
cultural and historical significance. The government is prioritizing the preservation and …

Non-fluent synthetic target-language data improve neural machine translation

VM Sánchez-Cartagena, M Esplà-Gomis… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
When the amount of parallel sentences available to train a neural machine translation is
scarce, a common practice is to generate new synthetic training samples from them. A …