Mauve scores for generative models: Theory and practice

K Pillutla, L Liu, J Thickstun, S Welleck… - Journal of Machine …, 2023 - jmlr.org
Generative artificial intelligence has made significant strides, producing text
indistinguishable from human prose and remarkably photorealistic images. Automatically …

Text augmentation using dataset reconstruction for low-resource classification

A Rahamim, G Uziel, E Goldbraich… - Findings of the …, 2023 - aclanthology.org
In the deployment of real-world text classification models, label scarcity is a common
problem and as the number of classes increases, this problem becomes even more …

Feelings about bodies: Emotions on diet and fitness forums reveal gendered stereotypes and body image concerns

C Sánchez, MD Chu, Z He, R Dorn, S Murray… - arXiv preprint arXiv …, 2024 - arxiv.org
The gendered expectations about ideal body types can lead to body image concerns,
dissatisfaction, and in extreme cases, disordered eating and other psychopathologies …

Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities

MD Chu, Z He, R Dorn, K Lerman - arXiv preprint arXiv:2408.09366, 2024 - arxiv.org
Large language models (LLMs) have shown promise in representing individuals and
communities, offering new ways to study complex social dynamics. However, effectively …

A Model for Quantifying the Degree of Understanding in Cross-domain M2M Semantic Communications

RYT Hou, G Liu, J Fong, H Zhang, SP Jeong - IEEE Access, 2024 - ieeexplore.ieee.org
This paper addresses the problem of semantic communications (SemComs) in intelligent
machine-to-machine (M2M) applications. Although M2M applications may employ other …

Depth F1: Improving Evaluation of Cross-Domain Text Classification by Measuring Semantic Generalizability

P Seegmiller, J Gatto, SM Preum - arXiv preprint arXiv:2406.14695, 2024 - arxiv.org
Recent evaluations of cross-domain text classification models aim to measure the ability of a
model to obtain domain-invariant performance in a target domain given labeled samples in …

Can You Trust Your Metric? Automatic Concatenation-Based Tests for Metric Validity

ON Fandina, L Choshen, E Farchi, G Kour… - arXiv preprint arXiv …, 2024 - arxiv.org
Consider a scenario where a harmfulness detection metric is employed by a system to filter
unsafe responses generated by a Large Language Model. When analyzing individual …

Characterizing how 'distributional' NLP corpora distance metrics are

S Ackerman, G Kour, E Farchi - arXiv preprint arXiv:2310.14829, 2023 - arxiv.org
A corpus of vector-embedded text documents has some empirical distribution. Given two
corpora, we want to calculate a single metric of distance (e.g., Mauve, Frechet Inception) …

A ModelOps-Based Framework for Intelligent Medical Knowledge Extraction

H Ding, P Zou, Z Wang, J Zhao… - … on Medical Artificial …, 2023 - ieeexplore.ieee.org
Extracting medical knowledge from healthcare texts enhances downstream tasks like
medical knowledge graph construction and clinical decision-making. However, the …

Reliable and Interpretable Drift Detection in Streams of Short Texts

E Rabinovich, M Vetzler, S Ackerman… - arXiv preprint arXiv …, 2023 - arxiv.org
Data drift is the change in model input data that is one of the key factors leading to machine
learning model performance degradation over time. Monitoring drift helps detect these …