Combating misinformation in the age of LLMs: Opportunities and challenges

C Chen, K Shu - AI Magazine, 2024 - Wiley Online Library
Misinformation such as fake news and rumors is a serious threat to information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …

Building machines that learn and think with people

KM Collins, I Sucholutsky, U Bhatt, K Chandra… - Nature human …, 2024 - nature.com
What do we want from machine intelligence? We envision machines that are not just tools
for thought but partners in thought: reasonable, insightful, knowledgeable, reliable and …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Contrastive learning explains the emergence and function of visual category-selective regions

JS Prince, GA Alvarez, T Konkle - Science Advances, 2024 - science.org
Modular and distributed coding theories of category selectivity along the human ventral
visual stream have long existed in tension. Here, we present a reconciling framework …

A large-scale examination of inductive biases shaping high-level visual representation in brains and machines

C Conwell, JS Prince, KN Kay, GA Alvarez… - Nature …, 2024 - nature.com
The rapid release of high-performing computer vision models offers new potential to study
the impact of different inductive biases on the emergent brain alignment of learned …

Do large language models perform the way people expect? Measuring the human generalization function

K Vafa, A Rambachan, S Mullainathan - arXiv preprint arXiv:2406.01382, 2024 - arxiv.org
What makes large language models (LLMs) impressive is also what makes them hard to
evaluate: their diversity of uses. To evaluate these models, we must understand the …

An explainable AI framework for robust and transparent data-driven wind turbine power curve models

S Letzgus, KR Müller - Energy and AI, 2024 - Elsevier
In recent years, increasingly complex machine learning methods have become state-of-the-art in modelling wind turbine power curves based on operational data. While these methods …

Mechanistic Interpretability for AI Safety – A Review

L Bereska, E Gavves - arXiv preprint arXiv:2404.14082, 2024 - arxiv.org
Understanding AI systems' inner workings is critical for ensuring value alignment and safety.
This review explores mechanistic interpretability: reverse-engineering the computational …

When Does Perceptual Alignment Benefit Vision Representations?

S Sundaram, S Fu, L Muttenthaler, NY Tamir… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans judge perceptual similarity according to diverse visual attributes, including scene
layout, subject location, and camera pose. Existing vision models understand a wide range …

Towards symbolic XAI – explanation through human understandable logical relationships between features

T Schnake, FR Jafari, J Lederer, P Xiong, S Nakajima… - Information …, 2025 - Elsevier
Explainable Artificial Intelligence (XAI) plays a crucial role in fostering transparency
and trust in AI systems. Traditional XAI methods typically provide a single level of abstraction …