A comprehensive survey of artificial intelligence techniques for talent analytics

C Qin, L Zhang, Y Cheng, R Zha, D Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
In today's competitive and fast-evolving business environment, it is a critical time for
organizations to rethink how to make talent-related decisions in a quantitative manner …

The WMDP benchmark: Measuring and reducing malicious use with unlearning

N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti… - arXiv preprint arXiv …, 2024 - arxiv.org
The White House Executive Order on Artificial Intelligence highlights the risks of large
language models (LLMs) empowering malicious actors in developing biological, cyber, and …

LLM evaluators recognize and favor their own generations

A Panickssery, SR Bowman, S Feng - arXiv preprint arXiv:2404.13076, 2024 - arxiv.org
Self-evaluation using large language models (LLMs) has proven valuable not only in
benchmarking but also in methods like reward modeling, constitutional AI, and self-refinement …

Recommendation with generative models

Y Deldjoo, Z He, J McAuley, A Korikov… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative models are a class of AI models capable of creating new instances of data by
learning and sampling from their statistical distributions. In recent years, these models have …

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?

R Ren, S Basart, A Khoja, A Gatti, L Phan, X Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
As artificial intelligence systems grow more powerful, there has been increasing interest in
"AI safety" research to address emerging and future risks. However, the field of AI safety …

Quiet-STaR: Language models can teach themselves to think before speaking

E Zelikman, G Harik, Y Shao, V Jayasiri… - arXiv preprint arXiv …, 2024 - arxiv.org
When writing and talking, people sometimes pause to think. Although reasoning-focused
works have often framed reasoning as a method of answering questions or completing …

Beyond static AI evaluations: advancing human interaction evaluations for LLM harms and risks

L Ibrahim, S Huang, L Ahmad, M Anderljung - arXiv preprint arXiv …, 2024 - arxiv.org
Model evaluations are central to understanding the safety, risks, and societal impacts of AI
systems. While most real-world AI applications involve human-AI interaction, most current …

BELLS: A framework towards future-proof benchmarks for the evaluation of LLM safeguards

D Dorn, A Variengien, CR Segerie… - arXiv preprint arXiv …, 2024 - arxiv.org
Input-output safeguards are used to detect anomalies in the traces produced by Large
Language Model (LLM) systems. These detectors are at the core of diverse safety-critical …

Accounting for AI and Users Shaping One Another: The Role of Mathematical Models

S Dean, E Dong, M Jagadeesan, L Leqi - arXiv preprint arXiv:2404.12366, 2024 - arxiv.org
As AI systems enter into a growing number of societal domains, these systems increasingly
shape and are shaped by user preferences, opinions, and behaviors. However, the design …

Human expertise in algorithmic prediction

R Alur, M Raghavan, D Shah - The Thirty-eighth Annual …, 2024 - openreview.net
We introduce a novel framework for incorporating human expertise into algorithmic
predictions. Our approach leverages human judgment to distinguish inputs which are …