Data harmonization for heterogeneous datasets: a systematic literature review

G Kumar, S Basri, AA Imam, SA Khowaja, LF Capretz… - Applied Sciences, 2021 - mdpi.com
As data size increases drastically, its variety also increases. Investigating such
heterogeneous data is one of the most challenging tasks in information management and …

A review of challenges in machine learning based automated hate speech detection

A Velankar, H Patil, R Joshi - arXiv preprint arXiv:2209.05294, 2022 - arxiv.org
The spread of hate speech on social media space is currently a serious issue. The
undemanding access to the enormous amount of information being generated on these …

PICT@DravidianLangTech-ACL2022: Neural machine translation on Dravidian languages

A Vyawahare, R Tangsali, A Mandke, O Litake… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents a summary of the findings that we obtained based on the shared task on
machine translation of Dravidian languages. We stood first in three of the five sub-tasks …

Building a Part of Speech tagger for the Tamil Language

K Sarveswaran, G Dias - 2021 International Conference on …, 2021 - ieeexplore.ieee.org
Identifying the lexical category or Part of Speech (POS) of words is critical for developing
Natural Language Processing (NLP) systems. Existing Tamil POS taggers are either not …

H-AES: towards automated essay scoring for Hindi

S Singh, A Pupneja, S Mital, C Shah… - Proceedings of the …, 2023 - ojs.aaai.org
The use of Natural Language Processing (NLP) for Automated Essay Scoring (AES)
has been well explored in the English language, with benchmark models exhibiting …

A deep neural network-based approach for fake news detection in regional language

P Katariya, V Gupta, R Arora, A Kumar… - International Journal of …, 2022 - emerald.com
Purpose: The current natural language processing algorithms are still lacking in judgment
criteria, and these approaches often require deep knowledge of political or social contexts …

Poorvi@DravidianLangTech: Sentiment analysis on code-mixed Tulu and Tamil corpus

P Shetty - Proceedings of the Third Workshop on Speech and …, 2023 - aclanthology.org
Sentiment analysis in code-mixed languages poses significant challenges, particularly for
highly under-resourced languages such as Tulu and Tamil. Existing corpora, primarily …

A Bibliometric Perspective of Regional Languages on Select Scholarly Articles.

G Yumnam, CI Singh - DESIDOC Journal of Library & Information …, 2024 - researchgate.net
Regional languages are spoken within a specific geographical area or by a particular ethnic
group and may have official recognition or be used informally. Research on regional …

A Comprehensive Survey on Handwritten Gujarati Character and Its Modifier Recognition Methods

PD Doshi, PA Vanjara - … for Competitive Strategies (ICTCS 2020) ICT …, 2021 - Springer
In India, handwritten character recognition is becoming a necessity region-wise due to the new
education policy 2020. Various technologies are applied to solve the problem in this area …

Sanskrit Knowledge-based Systems: Annotation and Computational Tools

H Terdalkar - arXiv preprint arXiv:2406.18276, 2024 - arxiv.org
We address the challenges and opportunities in the development of knowledge systems for
Sanskrit, with a focus on question answering. By proposing a framework for the automated …