From r to Q*: Your Language Model is Secretly a Q-Function

R Rafailov, J Hejna, R Park, C Finn - arXiv preprint arXiv:2404.12358, 2024 - arxiv.org
Reinforcement Learning From Human Feedback (RLHF) has been critical to the success
of the latest generation of generative AI models. In response to the complex nature of the …

Controlled decoding from language models

S Mudgal, J Lee, H Ganapathy, YG Li, T Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose controlled decoding (CD), a novel off-policy reinforcement learning method to
control the autoregressive generation from language models towards high reward …

Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models

S Dai, C Xu, S Xu, L Pang, Z Dong, J Xu - arXiv preprint arXiv:2404.11457, 2024 - arxiv.org
With the rapid advancement of large language models (LLMs), information retrieval (IR)
systems, such as search engines and recommender systems, have undergone a significant …

Controllable Text Generation for Large Language Models: A Survey

X Liang, H Wang, Y Wang, S Song, J Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
In Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated
high text generation quality. However, in real-world applications, LLMs must meet …

Bias and fairness in large language models: A survey

IO Gallegos, RA Rossi, J Barrow, MM Tanjim… - Computational …, 2024 - direct.mit.edu
Rapid advancements of large language models (LLMs) have enabled the processing,
understanding, and generation of human-like text, with increasing integration into systems …

Controlled text generation via language model arithmetic

J Dekoninck, M Fischer, L Beurer-Kellner… - arXiv preprint arXiv …, 2023 - arxiv.org
As Large Language Models (LLMs) are deployed more widely, customization with respect to
vocabulary, style and character becomes more important. In this work we introduce model …

Controllable Text Generation in the Instruction-Tuning Era

D Ashok, B Poczos - arXiv preprint arXiv:2405.01490, 2024 - arxiv.org
While most research on controllable text generation has focused on steering base
Language Models, the emerging instruction-tuning and prompting paradigm offers an …

MetaSQL: A generate-then-rank framework for natural language to SQL translation

Y Fan, Z He, T Ren, C Huang, Y Jing, K Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The Natural Language Interface to Databases (NLIDB) empowers non-technical users with
database access through intuitive natural language (NL) interactions. Advanced …

LifeTox: Unveiling Implicit Toxicity in Life Advice

M Kim, J Koo, H Lee, J Park, H Lee, K Jung - arXiv preprint arXiv …, 2023 - arxiv.org
As large language models become increasingly integrated into daily life, detecting implicit
toxicity across diverse contexts is crucial. To this end, we introduce LifeTox, a dataset …

Can LLMs Recognize Toxicity? Structured Toxicity Investigation Framework and Semantic-Based Metric

H Koh, D Kim, M Lee, K Jung - arXiv preprint arXiv:2402.06900, 2024 - arxiv.org
In the pursuit of developing Large Language Models (LLMs) that adhere to societal
standards, it is imperative to discern the existence of toxicity in the generated text. The …