Challenges and issues in developing an annotated corpus and HMM POS tagger for Khasi

MJ Tham - Proceedings of the 15th International Conference on …, 2018 - aclanthology.org
An attempt has been made to annotate a Khasi corpus with Part-of-Speech (POS) tags,
using the Bureau of Indian Standards (BIS) POS tagset prepared by the POS Tag …

A decision tree based word sense disambiguation system in Manipuri language

RL Singh, K Ghosh, K Nongmeikapam… - Advanced …, 2014 - search.proquest.com
This article manifests a primary attempt on building a word sense disambiguation system in
Manipuri language. The article discusses related attempts made in the Manipuri language …

Part-of-speech tagging for Mizo language using conditional random field

MVL Nunsanga, P Pakray, C Lallawmsanga… - Computación y …, 2021 - scielo.org.mx
Part of speech (POS) tagging assigns a class or tag to each token in a sentence. The tag
allocated to a word is mainly its part of speech or any other class of interest. Several …

Developing part-of-speech tagger for a resource poor language: Sindhi

R Motlani, H Lalwani, M Shrivastava… - Proceedings of the 7th …, 2015 - ltc.amu.edu.pl
Sindhi is an Indo-Aryan language spoken by more than 58 million speakers around
the world. It is currently a resource poor language which is harmed by the literature being …

State-of-the-art automatic machine transliteration systems for Indic scripts: a comparative report

BSS Lakshmi, BR Shambhavi - International Journal of …, 2024 - inderscienceonline.com
Due to the proliferation of social media and smart phones, the number of internet users has
increased on a significant scale. As a result of this globalisation, the internet and its users …

Graph Based Imbalanced Multi-Text Classification: A Study on Low Resource Language

L Jimmy, B Arun - 2023 - researchsquare.com
A pre-labeled dataset is required for any machine learning or deep learning tasks in Natural
Language Processing (NLP). Certain languages lack adequate resources, hence impeding …

Embeddings-Based Parallel Corpus Creation for English-Manipuri

G Moirangthem, L Nongbri, N Johny Singh… - … on Communication and …, 2022 - Springer
Lack of parallel corpus is one of the main roadblocks for neural machine translation in Low-
Resource languages. Collecting parallel sentences corpus from web are noisy and requires …

Better sequence labeling for low resource languages using attention and transfer learning

RK Mundotiya - 2022 - idr-lib.iitbhu.ac.in
Since the popular spread of the Internet, natural language content in the form of
suggestions, opinions, news, and tweets, apart from formal documents, has proliferated at …

Analysis and Comparative Study of POS Tagging Techniques for National (Urdu) Language and other Regional Languages of Pakistan

RA Rajper, S Rajper, A Maitlo… - SINDH UNIVERSITY …, 2021 - researchgate.net
Defining algorithms and techniques to enable computers to understand human language is
the Natural Language Processing (NLP), which is an integral part of speech recognition …

Joint Word Segmentation and Part-of-Speech Tagging for Myanmar Language

DL Cing, KM Soe - 2020 - meral.edu.mm
A lot of research is currently ongoing in word segmentation and POS tagging
developed differently with various methods. Separate word segmenters and POS taggers …