Lexical paraphrasing and pseudo relevance feedback for biomedical document retrieval

M Wasim, MN Asim, MU Ghani, ZU Rehman… - Multimedia Tools and …, 2019 - Springer
Multimedia Tools and Applications, 2019, Springer
Abstract
Term mismatch is a serious problem affecting the performance of information retrieval systems. The problem is more severe in the biomedical domain, where many term variations, abbreviations and synonyms exist. We present query paraphrasing and various term selection combination techniques to overcome this problem. To perform paraphrasing, we use the nouns in a query to generate synonyms from the Metathesaurus. The newly synthesized paraphrases are ranked using statistical information derived from the corpus, and relevant documents are retrieved based on the top n selected paraphrases. We compare the results with state-of-the-art retrieval techniques based on pseudo relevance feedback. In a quest to enhance the results of the pseudo relevance feedback approach, we introduce two term selection combination techniques, namely Borda Count and Intersection. Surprisingly, the combination techniques performed worse than the single term selection techniques. Within the pseudo relevance feedback approach, the best algorithms are IG, Rocchio and KLD, which perform 33%, 30% and 20% better than the other techniques, respectively. However, the paraphrasing technique performs 20% better than the pseudo relevance feedback approach.
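The abstract names Borda Count and Intersection as ways to combine the outputs of several term selection methods but does not say how they are computed. The following is a minimal Python sketch of the textbook versions of both, not the authors' implementation. The function names, the scoring rule (a term at position p in a list of n terms earns n - p points), the tie-breaking rule and the example term lists are all illustrative assumptions.

```python
from collections import defaultdict


def borda_count(rankings, top_k):
    """Merge ranked term lists with a standard Borda count (assumed scoring)."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, term in enumerate(ranking):
            # The top term of a list of n terms earns n points, the last earns 1.
            scores[term] += n - position
    # Sort by descending score; break ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda term: (-scores[term], term))
    return ordered[:top_k]


def intersection(rankings, top_k):
    """Keep only terms selected by every method, in the order of the first list."""
    common = set(rankings[0]).intersection(*rankings[1:])
    return [term for term in rankings[0] if term in common][:top_k]


# Hypothetical expansion terms, as if ranked by three term selection methods.
ig_terms = ["insulin", "glucose", "diabetes", "therapy"]
kld_terms = ["glucose", "insulin", "mellitus", "diabetes"]
rocchio_terms = ["diabetes", "glucose", "insulin", "patient"]

rankings = [ig_terms, kld_terms, rocchio_terms]
borda_terms = borda_count(rankings, top_k=3)
shared_terms = intersection(rankings, top_k=3)
print(borda_terms)
print(shared_terms)
```

In this sketch, Borda Count rewards terms that rank highly across lists, while Intersection discards any term that at least one method did not select.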