What do we want from machine intelligence? We envision machines that are not just tools for thought but partners in thought: reasonable, insightful, knowledgeable, reliable and …
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories …
Modular and distributed coding theories of category selectivity along the human ventral visual stream have long existed in tension. Here, we present a reconciling framework …
The rapid release of high-performing computer vision models offers new potential to study the impact of different inductive biases on the emergent brain alignment of learned …
What makes large language models (LLMs) impressive is also what makes them hard to evaluate: their diversity of uses. To evaluate these models, we must understand the …
In recent years, increasingly complex machine learning methods have become state-of-the- art in modelling wind turbine power curves based on operational data. While these methods …
L Bereska, E Gavves - arXiv preprint arXiv:2404.14082, 2024 - arxiv.org
Understanding AI systems' inner workings is critical for ensuring value alignment and safety. This review explores mechanistic interpretability: reverse-engineering the computational …
Humans judge perceptual similarity according to diverse visual attributes, including scene layout, subject location, and camera pose. Existing vision models understand a wide range …
Abstract Explainable Artificial Intelligence (XAI) plays a crucial role in fostering transparency and trust in AI systems. Traditional XAI methods typically provide a single level of abstraction …