SemEval-2020 Task 9: Overview of sentiment analysis of code-mixed tweets

P Patwa, G Aguilar, S Kar, S Pandey, S Pykl… - arXiv preprint arXiv …, 2020 - arxiv.org

In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of
Code-Mixed Tweets (SentiMix 2020). We also release and describe our Hinglish (Hindi …

Cited by 179 · Related articles · All 7 versions

[PDF] arxiv.org

Named entity recognition on code-switched data: Overview of the CALCS 2018 shared task

G Aguilar, F AlGhamdi, V Soto, M Diab… - arXiv preprint arXiv …, 2019 - arxiv.org

In the third shared task of the Computational Approaches to Linguistic Code-Switching
(CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social …

Cited by 91 · Related articles · All 8 versions

[PDF] arxiv.org

Corpus creation for sentiment analysis in code-mixed Tamil-English text

BR Chakravarthi, V Muralidaran… - arXiv preprint arXiv …, 2020 - arxiv.org

Understanding the sentiment of a comment from a video or an image is an essential task in
many applications. Sentiment analysis of a text can be useful for various decision-making …

Cited by 286 · Related articles · All 8 versions

[PDF] arxiv.org

Demographic dialectal variation in social media: A case study of African-American English

SL Blodgett, L Green, B O'Connor - arXiv preprint arXiv:1608.08868, 2016 - arxiv.org

Though dialectal language is increasingly abundant on social media, few resources exist for
developing NLP tools to handle such language. We conduct a case study of dialectal …

Cited by 456 · Related articles · All 6 versions

[PDF] jair.org

Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org

Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

Cited by 253 · Related articles · All 11 versions

[PDF] mccr.ae

A survey of current datasets for code-switching research

N Jose, BR Chakravarthi… - 2020 6th …, 2020 - ieeexplore.ieee.org

Code switching is a prevalent phenomenon in the multilingual community and social media
interaction. In the past ten years, we have witnessed an explosion of code switched data in …

Cited by 161 · Related articles · All 4 versions

[PDF] mit.edu

Computational sociolinguistics: A survey

D Nguyen, AS Doğruöz, CP Rosé… - Computational …, 2016 - direct.mit.edu

Language is a social phenomenon and variation is inherent to its social nature.
Recently, there has been a surge of interest within the computational linguistics (CL) …

Cited by 294 · Related articles · All 19 versions

[PDF] arxiv.org

A sentiment analysis dataset for code-mixed Malayalam-English

BR Chakravarthi, N Jose, S Suryawanshi… - arXiv preprint arXiv …, 2020 - arxiv.org

There is an increasing demand for sentiment analysis of text from social media which are
mostly code-mixed. Systems trained on monolingual data fail for code-mixed data due to the …

Cited by 136 · Related articles · All 7 versions

[PDF] arxiv.org

GLUECoS: An evaluation benchmark for code-switched NLP

S Khanuja, S Dandapat, A Srinivasan… - arXiv preprint arXiv …, 2020 - arxiv.org

Code-switching is the use of more than one language in the same conversation or utterance.
Recently, multilingual contextual embedding models, trained on multiple monolingual …

Cited by 131 · Related articles · All 4 versions

[PDF] aclanthology.org

Language modeling for code-mixing: The role of linguistic theory based synthetic data

A Pratapa, G Bhat, M Choudhury… - Proceedings of the …, 2018 - aclanthology.org

Training language models for Code-mixed (CM) language is known to be a difficult problem
because of lack of data compounded by the increased confusability due to the presence of …

Cited by 169 · Related articles · All 2 versions
