Experimental evaluation of deep learning models for Marathi text classification

L3Cube-HindBERT and DevBERT: Pre-trained BERT transformer models for Devanagari based Hindi and Marathi languages

R Joshi - arXiv preprint arXiv:2211.11418, 2022 - arxiv.org

The monolingual Hindi BERT models currently available on the model hub do not perform
better than the multi-lingual models on downstream tasks. We present L3Cube-HindBERT, a …

Cited by 43 · Related articles · All 3 versions

L3Cube-MahaCorpus and MahaBERT: Marathi monolingual corpus, Marathi BERT language models, and resources

R Joshi - arXiv preprint arXiv:2202.01159, 2022 - arxiv.org

We present L3Cube-MahaCorpus a Marathi monolingual data set scraped from different
internet sources. We expand the existing Marathi monolingual corpus with 24.8 M sentences …

Cited by 47 · Related articles · All 6 versions

Mono vs multilingual BERT for hate speech detection and text classification: A case study in Marathi

A Velankar, H Patil, R Joshi - IAPR Workshop on Artificial Neural Networks …, 2022 - Springer

Transformers are the most eminent architectures used for a vast range of Natural Language
Processing tasks. These models are pre-trained over a large text corpus and are meant to …

Cited by 37 · Related articles · All 8 versions

Hate and offensive speech detection in Hindi and Marathi

A Velankar, H Patil, A Gore, S Salunke… - arXiv preprint arXiv …, 2021 - arxiv.org

Sentiment analysis is the most basic NLP task to determine the polarity of text data. There
has been a significant amount of work in the area of multilingual text as well. Still hate and …

Cited by 45 · Related articles · All 5 versions

L3Cube-MahaHate: A tweet-based Marathi hate speech detection dataset and BERT models

A Velankar, H Patil, A Gore, S Salunke… - arXiv preprint arXiv …, 2022 - arxiv.org

Social media platforms are used by a large number of people prominently to express their
thoughts and opinions. However, these platforms have contributed to a substantial amount …

Cited by 35 · Related articles · All 7 versions

An unsupervised annotation of Arabic texts using multi-label topic modeling and genetic algorithm

HA Almuzaini, AM Azmi - Expert Systems with Applications, 2022 - Elsevier

Every day the world produces an enormous amount of textual data. This unstructured text is
of little use unless it is labeled using a combination of categories, keywords, tags. Humans …

Cited by 18 · Related articles · All 4 versions

L3Cube-MahaNLP: Marathi natural language processing datasets, models, and library

R Joshi - arXiv preprint arXiv:2205.14728, 2022 - arxiv.org

Despite being the third most popular language in India, the Marathi language lacks useful
NLP resources. Moreover, popular NLP libraries do not have support for the Marathi …

Cited by 21 · Related articles · All 3 versions

Comparative study of long document classification

V Wagh, S Khandve, I Joshi, A Wani… - TENCON 2021-2021 …, 2021 - ieeexplore.ieee.org

The amount of information stored in the form of documents on the internet has been
increasing rapidly. Thus it has become a necessity to organize and maintain these …

Cited by 31 · Related articles · All 6 versions

A survey on NLP resources, tools, and techniques for Marathi language processing

P Lahoti, N Mittal, G Singh - ACM Transactions on Asian and Low …, 2022 - dl.acm.org

Natural Language Processing (NLP) has been in practice for the past couple of decades,
and extensive work has been done for the Western languages, particularly the English …

Cited by 15 · Related articles · All 2 versions

L3Cube-MahaNER: A Marathi named entity recognition dataset and BERT models

O Litake, MR Sabane, PS Patil… - Proceedings of the …, 2022 - aclanthology.org

Named Entity Recognition (NER) is a basic NLP task and finds major applications
in conversational and search systems. It helps us identify key entities in a sentence used for …

Cited by 14 · Related articles