Authors
Gürkan Şahin, Banu Diri, Tuğba Yıldız
Publication date
2016/5/16
Conference
2016 24th Signal Processing and Communication Application Conference (SIU)
Pages
1037-1040
Publisher
IEEE
Description
Extracting semantic relation pairs with high accuracy from different sources (dictionary definitions, corpora, etc.) is one of the most popular topics in natural language processing (NLP). This study proposes a hybrid method for extracting Turkish part-whole pairs from a corpus, combining corpus statistics, WordNet similarities, and Word2Vec word-vector similarities. First, initial part-whole seed pairs are prepared, and these seeds are used to extract part-whole patterns from the corpus. A reliability score is calculated for each pattern, and the reliable patterns are selected to produce new pairs from the corpus. Various reliability scores are also used for the new pairs. To measure the method's success, 19 target whole words are selected; the average precision is 83% for the first 10 pairs, 74% for the first 20 pairs, and 68% for the first 30 pairs.
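The evaluation described above (precision over the first 10, 20, and 30 pairs, averaged across target whole words) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy English pairs, and the choice to divide by the number of pairs actually returned when fewer than k are available are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (part, whole); hypothetical representation


def precision_at_k(ranked: List[Pair], correct: Set[Pair], k: int) -> float:
    """Fraction of the first k ranked pairs that are judged correct."""
    top = ranked[:k]
    if not top:
        return 0.0
    # Assumption: divide by the number of pairs actually returned (<= k).
    return sum(1 for pair in top if pair in correct) / len(top)


def mean_precision_at_k(
    ranked_by_whole: Dict[str, List[Pair]], correct: Set[Pair], k: int
) -> float:
    """Average precision@k over all target whole words."""
    scores = [precision_at_k(r, correct, k) for r in ranked_by_whole.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only (not from the paper).
ranked_by_whole = {
    "car": [("wheel", "car"), ("engine", "car"), ("road", "car"), ("door", "car")],
    "house": [("roof", "house"), ("street", "house")],
}
correct = {
    ("wheel", "car"), ("engine", "car"), ("door", "car"), ("roof", "house"),
}

print(mean_precision_at_k(ranked_by_whole, correct, k=2))
```

In this sketch each target whole word contributes equally to the average, regardless of how many candidate pairs were extracted for it.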
Scholar articles
G Şahin, B Diri, T Yıldız - 2016 24th Signal Processing and Communication …, 2016