For artificial intelligence to be beneficial to humans, the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss some behavioural issues for …
Explainability of AI systems is critical for users to take informed actions and hold systems accountable. While "opening the opaque box" is important, understanding who opens the …
Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here we developed a human-in-the-loop research pipeline called Democratic AI, in which …
We are witnessing a novel era of creativity where anyone can create digital content via prompt-based learning (known as prompt engineering). This paper delves into prompt …
The philosopher John Rawls proposed the Veil of Ignorance (VoI) as a thought experiment to identify fair principles for governing a society. Here, we apply the VoI to an important …
Numerous parties are calling for "the democratisation of AI", but the phrase is used to refer to a variety of goals, the pursuit of which sometimes conflicts. This paper identifies four kinds of …
RL Johnson, G Pistilli, N Menédez-González… - arXiv preprint arXiv …, 2022 - arxiv.org
The alignment problem in the context of large language models must consider the plurality of human values in our world. Whilst there are many resonant and overlapping values …
Large language models (LLMs) are not amenable to frequent re-training, due to the high training costs arising from their massive scale. However, updates are necessary to endow …
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories …