Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

Supervised learning methods for bangla web document categorization

AK Mandal, R Sen - arXiv preprint arXiv:1410.2045, 2014 - arxiv.org
This paper explores the use of machine learning approaches, or more specifically, four
supervised learning methods, namely Decision Tree (C4.5), K-Nearest Neighbour (KNN) …

Bangla news classification using naive Bayes classifier

AN Chy, MH Seddiqui, S Das - 16th Int'l Conf. Computer and …, 2014 - ieeexplore.ieee.org
The Web is gigantic and constantly being updated. Bangla news on the web has grown rapidly in the
information age, where each news site has its own layout and categorization …

A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods

A Zhang, J Feng, RD Ward, P Wan, Q Gao, J Wu… - PLoS …, 2012 - journals.plos.org
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts.
The initial, and widely accepted, proposal was to use the protein-coding cytochrome c …

Word completion and sequence prediction in Bangla language using trie and a hybrid approach of sequential LSTM and N-gram

S Sarker, ME Islam, JR Saurav… - 2020 2nd International …, 2020 - ieeexplore.ieee.org
Autocompletion and sequence prediction are the basis of any assistance system. When we
set out to type something, it is very comforting to get a suggestion of the next word or even …

Semi-supervised priors for microblog language identification

S Carter, M Tsagkias, W Weerkamp - Dutch-Belgian information …, 2011 - dare.uva.nl
Offering access to information in microblog posts requires successful language
identification. Language identification on sparse and noisy data can be challenging. In this …

Language identification in texts

T Jauhiainen - 2019 - helda.helsinki.fi
This work investigates the task of identifying the language of digitally encoded text.
Automatic methods for language identification have been developed since the 1960s …

Features and Methods

T Jauhiainen, M Zampieri, T Baldwin… - … Language Identification in …, 2024 - Springer
In addition to features and methods used in LI, this chapter introduces the notation devised
by Jauhiainen et al. that is used throughout this book to describe LI methods. For easier …

Develop a neural model to score bigram of words using bag-of-words model for sentiment analysis

A Balamurali, B Ananthanarayanan - Neural Networks for Natural …, 2020 - igi-global.com
A Bag-of-Words model is widely used to extract features from text, which are given as input
to a machine learning algorithm such as an MLP or another neural network. The dataset considered is movie …

Neural network model for semantic analysis of Sanskrit text

S Selot, N Tripathi, AS Zadgaonkar - International Journal of Natural …, 2018 - igi-global.com
Semantic analysis is the process of extracting the meaning of a sentence in a given
language. From the perspective of computer processing, the challenge lies in making computer …