SemEval-2020 Task 9: Overview of sentiment analysis of code-mixed tweets

P Patwa, G Aguilar, S Kar, S Pandey, S Pykl… - arXiv preprint arXiv …, 2020 - arxiv.org
In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of
Code-Mixed Tweets (SentiMix 2020). We also release and describe our Hinglish (Hindi …

Named entity recognition on code-switched data: Overview of the CALCS 2018 shared task

G Aguilar, F AlGhamdi, V Soto, M Diab… - arXiv preprint arXiv …, 2019 - arxiv.org
In the third shared task of the Computational Approaches to Linguistic Code-Switching
(CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social …

Corpus creation for sentiment analysis in code-mixed Tamil-English text

BR Chakravarthi, V Muralidaran… - arXiv preprint arXiv …, 2020 - arxiv.org
Understanding the sentiment of a comment from a video or an image is an essential task in
many applications. Sentiment analysis of a text can be useful for various decision-making …

Demographic dialectal variation in social media: A case study of African-American English

SL Blodgett, L Green, B O'Connor - arXiv preprint arXiv:1608.08868, 2016 - arxiv.org
Though dialectal language is increasingly abundant on social media, few resources exist for
developing NLP tools to handle such language. We conduct a case study of dialectal …

Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

A survey of current datasets for code-switching research

N Jose, BR Chakravarthi… - 2020 6th …, 2020 - ieeexplore.ieee.org
Code switching is a prevalent phenomenon in the multilingual community and social media
interaction. In the past ten years, we have witnessed an explosion of code switched data in …

Computational sociolinguistics: A survey

D Nguyen, AS Doğruöz, CP Rosé… - Computational …, 2016 - direct.mit.edu
Language is a social phenomenon and variation is inherent to its social nature.
Recently, there has been a surge of interest within the computational linguistics (CL) …

A sentiment analysis dataset for code-mixed Malayalam-English

BR Chakravarthi, N Jose, S Suryawanshi… - arXiv preprint arXiv …, 2020 - arxiv.org
There is an increasing demand for sentiment analysis of text from social media which are
mostly code-mixed. Systems trained on monolingual data fail for code-mixed data due to the …

GLUECoS: An evaluation benchmark for code-switched NLP

S Khanuja, S Dandapat, A Srinivasan… - arXiv preprint arXiv …, 2020 - arxiv.org
Code-switching is the use of more than one language in the same conversation or utterance.
Recently, multilingual contextual embedding models, trained on multiple monolingual …

Language modeling for code-mixing: The role of linguistic theory based synthetic data

A Pratapa, G Bhat, M Choudhury… - Proceedings of the …, 2018 - aclanthology.org
Training language models for Code-mixed (CM) language is known to be a difficult problem
because of lack of data compounded by the increased confusability due to the presence of …