Supervised Fine-Tuning (SFT) on response demonstrations combined with Reinforcement Learning from Human Feedback (RLHF) constitutes a powerful paradigm for aligning LLM …
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function …
Should we care whether AI systems have representations of the world that are similar to those of humans? We provide an information-theoretic analysis that suggests that there …
Recommender systems are the algorithms which select, filter, and personalize content across many of the world's largest platforms and apps. As such, their positive and negative …
Interpretable machine learning has exploded as an area of interest over the last decade, sparked by the rise of increasingly large datasets and deep neural networks …
Increasingly complex and autonomous systems require machine ethics to maximize the benefits and minimize the risks to society arising from the new technology. It is challenging …
Big models, exemplified by Large Language Models (LLMs), are models typically pre- trained on massive data and comprised of enormous parameters, which not only obtain …
The book's core argument is that an artificial intelligence that could equal or exceed human intelligence—sometimes called artificial general intelligence (AGI)—is for mathematical …
Biological and artificial information processing systems form representations of the world that they can use to categorize, reason, plan, navigate, and make decisions. To what extent …