QLoRA: Efficient finetuning of quantized LLMs

T Dettmers, A Pagnoni, A Holtzman… - Advances in Neural …, 2024 - proceedings.neurips.cc
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to
finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

Theory of mind and preference learning at the interface of cognitive science, neuroscience, and AI: A review

C Langley, BI Cirstea, F Cuzzolin… - Frontiers in artificial …, 2022 - frontiersin.org
Theory of Mind (ToM) – the ability of the human mind to attribute mental states to others – is a
key component of human cognition. In order to understand other people's mental states or …

Theory of mind might have spontaneously emerged in large language models

M Kosinski - arXiv preprint arXiv:2302.02083, 2023 - arxiv.org
Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central
to human social interactions, communication, empathy, self-consciousness, and morality …

Recent advances in natural language inference: A survey of benchmarks, resources, and approaches

S Storks, Q Gao, JY Chai - arXiv preprint arXiv:1904.01172, 2019 - arxiv.org
In the NLP community, recent years have seen a surge of research activities that address
machines' ability to perform deep language understanding which goes beyond what is …

Experience grounds language

Y Bisk, A Holtzman, J Thomason, J Andreas… - arXiv preprint arXiv …, 2020 - arxiv.org
Language understanding research is held back by a failure to relate language to the
physical world it describes and to the social interactions it facilitates. Despite the incredible …

Clever Hans or neural theory of mind? Stress testing social reasoning in large language models

N Shapira, M Levy, SH Alavi, X Zhou, Y Choi… - arXiv preprint arXiv …, 2023 - arxiv.org
The escalating debate on AI's capabilities warrants developing reliable metrics to assess
machine "intelligence". Recently, many anecdotal examples were used to suggest that …

Social IQa: Commonsense reasoning about social interactions

M Sap, H Rashkin, D Chen, R LeBras… - arXiv preprint arXiv …, 2019 - arxiv.org
We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about
social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional …

Minding language models' (lack of) theory of mind: A plug-and-play multi-character belief tracker

M Sclar, S Kumar, P West, A Suhr, Y Choi… - arXiv preprint arXiv …, 2023 - arxiv.org
Theory of Mind (ToM) – the ability to reason about the mental states of
other people – is a key element of our social intelligence. Yet, despite …

A fine-grained comparison of pragmatic language understanding in humans and language models

J Hu, S Floyd, O Jouravlev, E Fedorenko… - arXiv preprint arXiv …, 2022 - arxiv.org
Pragmatics and non-literal language understanding are essential to human communication,
and present a long-standing challenge for artificial language models. We perform a fine …