Y Wen, B Shayegh, C Huang, Y Cao, L Mou - arXiv preprint arXiv …, 2024 - arxiv.org
The ability of zero-shot translation emerges when we train a multilingual model with certain translation directions; the model can then directly translate in unseen directions …
S Steingrímsson - Proceedings of the Eighth Conference on …, 2023 - aclanthology.org
This paper describes the AST submission to the WMT23 Shared Task on Parallel Data Curation. We experiment with two approaches for curating data from the provided web …
Lexical simplification (LS) methods based on pretrained language models have made remarkable progress, generating potential substitutes for a complex word through analysis …
When parallel corpora are preprocessed for machine translation (MT) training, a part of the parallel data is commonly discarded and deemed non-parallel due to odd-length ratio …
For machine translation (MT) systems to produce accurate and fluent translations, reliable parallel corpora are key. Errors, due to misalignments or inadequate filtering during …
This thesis describes an unsupervised approach to determine the translation direction for parallel texts. Traditional methods in this field rely on large amounts of homogeneous …
Jannis Vamvas - My PhD Thesis Is Out! (A Summary). April 05, 2023. Jannis Vamvas, NLP Researcher, University of Zurich. After quite a few months …