A survey on Urdu and Urdu like language stemmers and stemming techniques

A Jabbar, S Iqbal, MUG Khan, S Hussain - Artificial Intelligence Review, 2018 - Springer
Stemming is one of the basic steps in natural language processing applications such as
information retrieval, parts of speech tagging, syntactic parsing and machine translation, etc …

Pattern based comprehensive Urdu stemmer and short text classification

M Ali, S Khalid, MH Aslam - IEEE Access, 2017 - ieeexplore.ieee.org
Urdu language is used by approximately 200 million people for spoken and written
communications. The bulk of unstructured Urdu textual data is available in the world. We can …

An unsupervised approach to develop stemmer

MS Husain - International Journal on Natural Language Computing …, 2012 - academia.edu
This paper presents an unsupervised approach for the development of a stemmer (For the
case of Urdu & Marathi language). Especially, during last few years, a wide range of …

Morphologically annotated Amharic text corpora

T Yeshambel, J Mothe, Y Assabie - … of the 44th International ACM SIGIR …, 2021 - dl.acm.org
In information retrieval (IR), documents that match the query are retrieved. Search engines
usually conflate word variants into a common stem when indexing documents because …

Template Based Affix Stemmer for a Morphologically Rich Language

S Khan, W Anwar, U Bajwa, X Wang - International Arab Journal of …, 2015 - ccis2k.org
Word stemming is one of the most significant factors that affect the performance of a Natural
Language Processing (NLP) application such as Information Retrieval (IR) system, part of …

Improving text classification performance using PCA and recall-precision criteria

M Zahedi, AG Sorkhi - Arabian Journal for Science and Engineering, 2013 - Springer
Persian text is usually associated with a wide range of important or useless features. This is
the main reason why feature extraction process is one of the difficult tasks in the field of …

Challenges in Urdu stemming (a progress report)

K Riaz - BCS IRSG Symposium: Future Directions in Information …, 2007 - scienceopen.com
This paper explains the challenges pertaining to Urdu stemming and presents a rule-based
prototype with a few rules implemented for Urdu to motivate the intricacies. It shows that …

Concept search in Urdu

K Riaz - Proceedings of the 2nd PhD workshop on Information …, 2008 - dl.acm.org
This paper describes a thesis proposal to do concept search in non English and non
European languages. Urdu is chosen as an example language because of its unique …

Evaluation of PerStem: a simple and efficient stemming algorithm for Persian

AH Jadidinejad, F Mahmoudi, J Dehdari - Workshop of the Cross …, 2009 - Springer
Persian is a challenging language in the field of NLP. Right-to-left orthography, complex
morphology, complicated grammatical rules, and different forms of letters make it an …

Challenges in developing a rule based Urdu stemmer

SA Khan, W Anwar, UI Bajwa - … of the 2nd Workshop on South …, 2011 - aclanthology.org
Urdu language raises several challenges to Natural Language Processing (NLP) largely
due to its rich morphology. In this language, morphological processing becomes particularly …