This paper explores and evaluates the effect of different stopword removal and stemming techniques in Urdu IR. The issues are examined from four viewpoints. Is there any …
H Ayral, S Yavuz - … on innovations in intelligent systems and …, 2011 - ieeexplore.ieee.org
We propose an automated method for generating domain specific stop words to improve classification of natural language content. Also we implemented a Bayesian natural …
M Sadeghi, J Vegas - Journal of information science, 2014 - journals.sagepub.com
Stop word identification is one of the most important tasks for many text processing applications such as information retrieval. Stop words occur too frequently in documents in a …
N Tohidi, C Dadkhah, RB Rustamov - The 16th International …, 2020 - researchgate.net
In recent years, Question Answering Systems (QASs) have been known as one of the most significant tools to access information. QASs are search engines that can return a short and …
We explore and evaluate the effect of stopwords in retrieval performance of different Indian languages such as Marathi, Bengali, Gujarati and Sanskrit. The issue was investigated from …
M Dehghani, M Manthouri - 2021 11th International Conference …, 2021 - ieeexplore.ieee.org
A stopword is a word that adds little semantic information to the text despite its very high frequency. Stopwords include prepositions, conjunctions, and pronouns. One of …
SS Sahu, S Pal - ACM Transactions on Asian and Low-Resource …, 2023 - dl.acm.org
We explore and evaluate the effect of different stopword lists (non-corpus-based and corpus-based) in the information retrieval (IR) tasks with different Indian languages such as Bengali …
TV Asubiaro - International Journal of Computer and Information …, 2013 - researchgate.net
This research employed an entropy-based algorithm to identify stopword candidates for Yoruba language texts. Two sets of a corpus of 756,039 Yoruba words were used; the diacritized and …
With the advent of new information resources, search engines have encountered a new challenge, since they have been obliged to store a large amount of text material. This is …