Leveraging language identification to enhance code-mixed text classification

G Takawane, A Phaltankar, V Patwardhan… - arXiv preprint arXiv …, 2023 - arxiv.org
The usage of more than one language in the same text is referred to as Code Mixed. It is
evident that there is a growing degree of adaption of the use of code-mixed data, especially …

Language augmentation approach for code-mixed text classification

G Takawane, A Phaltankar, V Patwardhan… - Natural Language …, 2023 - Elsevier
The usage of more than one language in the same text is referred to as Code Mixed. It is
evident that there is a growing degree of adaption of the use of code-mixed data, especially …

Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi

MN Raihan, D Goswami, A Mahmud - arXiv preprint arXiv:2309.10272, 2023 - arxiv.org
One of the most popular downstream tasks in the field of Natural Language Processing is
text classification. Text classification tasks have become more daunting when the texts are …

L3Cube-HingCorpus and HingBERT: A code mixed Hindi-English dataset and BERT language models

R Nayak, R Joshi - arXiv preprint arXiv:2204.08398, 2022 - arxiv.org
Code-switching occurs when more than one language is mixed in a given sentence or a
conversation. This phenomenon is more prominent on social media platforms and its …

Comparative study of pre-trained BERT models for code-mixed Hindi-English data

A Patil, V Patwardhan, A Phaltankar… - 2023 IEEE 8th …, 2023 - ieeexplore.ieee.org
The term "Code Mixed" refers to the use of more than one language in the same text. This
phenomenon is predominantly observed on social media platforms, with an increasing …

A Comprehensive Understanding of Code-mixed Language Semantics using Hierarchical Transformer

A Sengupta, T Suresh, MS Akhtar… - arXiv preprint arXiv …, 2022 - arxiv.org
Being a popular mode of text-based communication in multilingual communities, code-
mixing in online social media has become an important subject to study. Learning the …

Translate and Classify: Improving Sequence Level Classification for English-Hindi Code-Mixed Data

D Gautam, K Gupta, M Shrivastava - Proceedings of the Fifth …, 2021 - aclanthology.org
Code-mixing is a common phenomenon in multilingual societies around the world and is
especially common in social media texts. Traditional NLP systems, usually trained on …

My Boli: Code-mixed Marathi-English corpora, pretrained language models and evaluation benchmarks

T Chavan, O Gokhale, A Kane, S Patankar… - arXiv preprint arXiv …, 2023 - arxiv.org
The research on code-mixed data is limited due to the unavailability of dedicated code-
mixed datasets and pre-trained language models. In this work, we focus on the low-resource …

A Comprehensive Understanding of Code-Mixed Language Semantics Using Hierarchical Transformer

T Suresh, A Sengupta, MS Akhtar… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Being a popular mode of text-based communication in multilingual communities, code
mixing in online social media has become an important subject to study. Learning the …

Adapter-based fine-tuning of pre-trained multilingual language models for code-mixed and code-switched text classification

H Rathnayake, J Sumanapala, R Rukshani… - … and Information Systems, 2022 - Springer
Code-mixing and code-switching are frequent features in online conversations.
Classification of such text is challenging if one of the languages is low-resourced. Fine …