GPTEval: A survey on assessments of ChatGPT and GPT-4

R Mao, G Chen, X Zhang, F Guerin… - arXiv preprint arXiv …, 2023 - arxiv.org
The emergence of ChatGPT has generated much speculation in the press about its potential
to disrupt social and economic systems. Its astonishing language ability has aroused strong …

A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics

K He, R Mao, Q Lin, Y Ruan, X Lan, M Feng… - arXiv preprint arXiv …, 2023 - arxiv.org
The utilization of large language models (LLMs) in the Healthcare domain has generated
both excitement and concern due to their ability to effectively respond to free-text queries with …

Aya dataset: An open-access collection for multilingual instruction tuning

S Singh, F Vargus, D Dsouza, BF Karlsson… - arXiv preprint arXiv …, 2024 - arxiv.org
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …

Understanding and mitigating language confusion in LLMs

K Marchisio, WY Ko, A Bérard, T Dehaze… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate a surprising limitation of LLMs: their inability to consistently generate text in a
user's desired language. We create the Language Confusion Benchmark (LCB) to evaluate …

LLMs Are Few-Shot In-Context Low-Resource Language Learners

S Cahyawijaya, H Lovenia, P Fung - arXiv preprint arXiv:2403.16512, 2024 - arxiv.org
In-context learning (ICL) empowers large language models (LLMs) to perform diverse tasks
in underrepresented languages using only short in-context information, offering a crucial …

Multilingualism and mismatching: Spanish language usage in college admissions essays

AJ Alvero, R Pattichis - Poetics, 2024 - Elsevier
In US K-12 education, the Spanish language is subject to practices and policies that limit its
expression, especially among racialized Latinx students. However, higher education claims …

MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries

A Ghosh, A Acharya, P Jha, S Saha… - … on Information Retrieval, 2024 - Springer
In the healthcare domain, summarizing medical questions posed by patients is critical for
improving doctor-patient interactions and medical decision-making. Although medical data …

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

E Fleisig, G Smith, M Bossi, I Rustagi, X Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects
of English (Standard American English, Standard British English, and eight widely spoken …

Marathi-English code-mixed text generation

D Amin, S Govilkar, S Kulkarni, YS Lalit… - arXiv preprint arXiv …, 2023 - arxiv.org
Code-mixing, the blending of linguistic elements from distinct languages to form meaningful
sentences, is common in multilingual settings, yielding hybrid languages like Hinglish and …

Multilingual Identification of English Code-Switching

I Sterner - Proceedings of the Eleventh Workshop on NLP for …, 2024 - aclanthology.org
This work addresses the task of identifying English code-switching in multilingual text. We
train two token-level classifiers on data of high-resource language pairs. The first …