Predicting text preference via structured comparative reasoning

JN Yan, T Liu, J Chiu, J Shen, Z Qin, Y Yu… - Proceedings of the …, 2024 - aclanthology.org
Comparative reasoning plays a crucial role in predicting text preferences; however, large
language models (LLMs) often demonstrate inconsistencies in their reasoning, leading to …

Discover and Mitigate Multiple Biased Subgroups in Image Classifiers

Z Zhang, M Feng, Z Li, C Xu - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
Machine learning models can perform well on in-distribution data but often fail on
biased subgroups that are underrepresented in the training data, hindering the robustness of …

What could go wrong? Discovering and describing failure modes in computer vision

G Csurka, TL Hayes, D Larlus, R Volpi - arXiv preprint arXiv:2408.04471, 2024 - arxiv.org
Deep learning models are effective, yet brittle. Even carefully trained, their behavior tends to
be hard to predict when confronted with out-of-distribution samples. In this work, our goal is …

VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models

L Dunlap, K Mandal, T Darrell, J Steinhardt… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) often exhibit subtle yet distinctive characteristics in their
outputs that users intuitively recognize but struggle to quantify. These "vibes" -- such as tone …

Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models

M Moayeri, V Balachandran, V Chandrasekaran… - arXiv preprint arXiv …, 2024 - arxiv.org
With models getting stronger, evaluations have grown more complex, testing multiple skills
in one benchmark and even in the same instance at once. However, skill-wise performance …

Concept Bottleneck Models Without Predefined Concepts

S Schrodi, J Schur, M Argus, T Brox - arXiv preprint arXiv:2407.03921, 2024 - arxiv.org
There has been considerable recent interest in interpretable concept-based models such as
Concept Bottleneck Models (CBMs), which first predict human-interpretable concepts and …

Trustworthy Transfer Learning: A Survey

J Wu, J He - arXiv preprint arXiv:2412.14116, 2024 - arxiv.org
Transfer learning aims to transfer knowledge or information from a source domain to a
relevant target domain. In this paper, we understand transfer learning from the perspectives …

Bayesian Concept Bottleneck Models with LLM Priors

J Feng, A Kothari, L Zier, C Singh, YS Tan - arXiv preprint arXiv …, 2024 - arxiv.org
Concept Bottleneck Models (CBMs) have been proposed as a compromise between
white-box and black-box models, aiming to achieve interpretability without sacrificing accuracy …

CAST: Cross-modal Alignment Similarity Test for Vision Language Models

G Dagan, O Loginova, A Batra - arXiv preprint arXiv:2409.11007, 2024 - arxiv.org
Vision Language Models (VLMs) are typically evaluated with Visual Question Answering
(VQA) tasks which assess a model's understanding of scenes. Good VQA performance is …

Explaining Datasets in Words: Statistical Models with Natural Language Parameters

R Zhong, H Wang, D Klein, J Steinhardt - arXiv preprint arXiv:2409.08466, 2024 - arxiv.org
To make sense of massive data, we often fit simplified models and then interpret the
parameters; for example, we cluster the text embeddings and then interpret the mean …