Machine learning empowering personalized medicine: A comprehensive review of medical image analysis methods

I Galić, M Habijan, H Leventić, K Romić - Electronics, 2023 - mdpi.com
Artificial intelligence (AI) advancements, especially deep learning, have significantly
improved medical image processing and analysis in various tasks such as disease …

Nash learning from human feedback

R Munos, M Valko, D Calandriello, MG Azar… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) has emerged as the main paradigm
for aligning large language models (LLMs) with human preferences. Typically, RLHF …

Socially intelligent machines that learn from humans and help humans learn

H Gweon, J Fan, B Kim - Philosophical Transactions of …, 2023 - royalsocietypublishing.org
A hallmark of human intelligence is the ability to understand and influence other minds.
Humans engage in inferential social learning (ISL) by using commonsense psychology to …

Harms from increasingly agentic algorithmic systems

A Chan, R Salganik, A Markelius, C Pang… - Proceedings of the …, 2023 - dl.acm.org
Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established
many sources and forms of algorithmic harm, in domains as diverse as health care, finance …

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org
Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Avalon's game of thoughts: Battle against deception through recursive contemplation

S Wang, C Liu, Z Zheng, S Qi, S Chen, Q Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent breakthroughs in large language models (LLMs) have brought remarkable success
in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is that the information …

Adversarial policies beat superhuman Go AIs

TT Wang, A Gleave, T Tseng, K Pelrine… - International …, 2023 - proceedings.mlr.press
We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies
against it, achieving a >97% win rate against KataGo running at superhuman settings …

Honesty is the best policy: defining and mitigating AI deception

F Ward, F Toni, F Belardinelli… - Advances in Neural …, 2024 - proceedings.neurips.cc
Deceptive agents are a challenge for the safety, trustworthiness, and cooperation of AI
systems. We focus on the problem that agents might deceive in order to achieve their goals …

Polynomial-time linear-swap regret minimization in imperfect-information sequential games

G Farina, C Pipis - Advances in Neural Information …, 2024 - proceedings.neurips.cc
No-regret learners seek to minimize the difference between the loss they cumulated through
the actions they played, and the loss they would have cumulated in hindsight had they …

Hardness of independent learning and sparse equilibrium computation in Markov games

DJ Foster, N Golowich… - … Conference on Machine …, 2023 - proceedings.mlr.press
We consider the problem of decentralized multi-agent reinforcement learning in Markov
games. A fundamental question is whether there exist algorithms that, when run …