[PDF] from aclanthology.org

Word embeddings for code-mixed language processing

Authors

Adithya Pratapa, Monojit Choudhury, Sunayana Sitaram

Publication date

2018

Conference

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Pages

3067-3072

Description

We compare three existing bilingual word embedding approaches, and a novel approach of training skip-grams on synthetic code-mixed text generated through linguistic models of code-mixing, on two tasks: sentiment analysis and POS tagging for code-mixed text. Our results show that while CVM- and CCA-based embeddings perform as well as the proposed embedding technique on semantic and syntactic tasks respectively, the proposed approach provides the best performance for both tasks overall. Thus, this study demonstrates that existing bilingual embedding techniques are not ideal for code-mixed text processing and that there is a need for learning multilingual word embeddings from code-mixed text.

Total citations

Cited by 69

Citations per year: 2019: 6, 2020: 17, 2021: 13, 2022: 15, 2023: 10, 2024: 8
