Using English baits to catch Serbian multi-word terminology

C Krstev, B Šandrih, R Stanković… - Proceedings of the …, 2018 - aclanthology.org
Proceedings of the Eleventh International Conference on Language …, 2018
Abstract
In this paper we present the first results in bilingual terminology extraction. Our hypothesis is that if domain terminology exists for a source language, together with a domain corpus aligned between the source and a target language, then it is possible to extract the terminology for the target language. Our approach relies on several resources and tools: aligned domain texts, domain terminology for the source language, a terminology extractor for the target language, and a tool for word and chunk alignment. In this first experiment the source language is English, the target language is Serbian, and the domain is Library and Information Science, for which a bilingual terminological dictionary exists. Our term extractor is based on e-dictionaries and shallow parsing, and for word alignment we use GIZA++. At the end of the procedure we added a supervised binary classifier that decides whether an extracted term is a valid domain term. The classifier was evaluated in a 5-fold cross-validation setting on a slightly unbalanced dataset, achieving an average F-score of 89%. In the experiment our system extracted 846 different Serbian domain phrases, 515 of which were not present in the existing domain terminology.
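The "bait" idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes word alignment has already been computed and parsed into 0-based (source index, target index) pairs, and that the target-language extractor has already produced a set of candidate phrases. The function names `find_spans`, `project_term`, and `catch_terms`, and the toy sentence pair, are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Set, Tuple

Links = Set[Tuple[int, int]]  # assumed format: 0-based (source_idx, target_idx)


def find_spans(tokens: List[str], term: List[str]) -> List[int]:
    """Return every start index where `term` occurs contiguously in `tokens`."""
    n = len(term)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == term]


def project_term(src: List[str], tgt: List[str], links: Links,
                 term: List[str]) -> List[str]:
    """Project each occurrence of a source term onto the target sentence.

    The projection is the smallest contiguous target span that covers
    every target token linked to a token of the source term.
    """
    projected = []
    for start in find_spans(src, term):
        src_idx = set(range(start, start + len(term)))
        tgt_idx = sorted({t for s, t in links if s in src_idx})
        if not tgt_idx:
            continue  # the term has no aligned target tokens
        projected.append(" ".join(tgt[tgt_idx[0]:tgt_idx[-1] + 1]))
    return projected


def catch_terms(src: List[str], tgt: List[str], links: Links,
                source_terms: Set[str],
                target_candidates: Set[str]) -> Dict[str, Set[str]]:
    """Keep projections that the target-side extractor also proposed."""
    caught: Dict[str, Set[str]] = {}
    for term in source_terms:
        for phrase in project_term(src, tgt, links, term.split()):
            if phrase in target_candidates:
                caught.setdefault(term, set()).add(phrase)
    return caught


# Toy, invented data: one aligned sentence pair with hand-written links.
src = ["the", "library", "catalogue", "is", "online"]
tgt = ["bibliotečki", "katalog", "je", "onlajn"]
links = {(1, 0), (2, 1), (3, 2), (4, 3)}

pairs = catch_terms(src, tgt, links,
                    source_terms={"library catalogue"},
                    target_candidates={"bibliotečki katalog"})
print(pairs)
```

In this sketch the intersection with `target_candidates` stands in for the role of the target-language term extractor; the supervised classifier described in the abstract would then filter the surviving pairs.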