between safety and helpfulness can significantly impact user experience. A model that
prioritizes safety will leave users feeling less engaged and assisted, while one that prioritizes
helpfulness may cause harm. Possible harms include teaching people how to
build a bomb, exposing youth to inappropriate content, and hurting users' mental health. In
this work, we propose to balance safety and helpfulness in diverse use cases by controlling …