Automatic Selection of Parallel Data for Machine Translation

Authors

Despoina Mouratidis, Katia Lida Kermanidis

Publication date

2018/5/25

Conference

IFIP International Conference on Artificial Intelligence Applications and Innovations

Pages

146-156

Publisher

Springer, Cham

Description

Nowadays machine translation is widely used, but the required data for training, tuning and testing a machine translation engine is often not sufficient or not useful. The automatic selection of data that are qualitatively appropriate for building translation models can help improve translation accuracy. In this paper, we used a large parallel corpus of educational video lecture subtitles as well as text posted by students and lecturers on the course fora. The text is quite challenging to translate due to the scientific domains involved and its informal genre. We applied a random forest classification schema on the output of three machine translation models (one based on statistical machine translation and two on neural machine translation) in order to automatically identify the best output. The unorthodox language phenomena observed as well as the rich-in-terminology scientific domains addressed in the educational …

Total citations

Cited by 7

2019: 2, 2020: 2, 2021: 3
