Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Threats by artificial intelligence to human health and human existence

F Federspiel, R Mitchell, A Asokan, C Umana… - BMJ global …, 2023 - gh.bmj.com
While artificial intelligence (AI) offers promising solutions in healthcare, it also poses a
number of threats to human health and well-being via social, political, economic and security …

GPT-4 technical report

J Achiam, S Adler, S Agarwal, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org
We report the development of GPT-4, a large-scale, multimodal model which can accept
image and text inputs and produce text outputs. While less capable than humans in many …

Are aligned neural networks adversarially aligned?

N Carlini, M Nasr… - Advances in …, 2024 - proceedings.neurips.cc
Large language models are now tuned to align with the goals of their creators, namely to be
"helpful and harmless." These models should respond helpfully to user questions, but refuse …

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Regulating ChatGPT and other large generative AI models

P Hacker, A Engel, M Mauer - Proceedings of the 2023 ACM Conference …, 2023 - dl.acm.org
Large generative AI models (LGAIMs), such as ChatGPT, GPT-4 or Stable Diffusion, are
rapidly transforming the way we communicate, illustrate, and create. However, AI regulation …

Towards automated circuit discovery for mechanistic interpretability

A Conmy, A Mavor-Parker, A Lynch… - Advances in …, 2023 - proceedings.neurips.cc
Through considerable effort and intuition, several recent works have reverse-engineered
nontrivial behaviors of transformer models. This paper systematizes the mechanistic …

Ethical principles for artificial intelligence in education

A Nguyen, HN Ngo, Y Hong, B Dang… - Education and …, 2023 - Springer
The advancement of artificial intelligence in education (AIED) has the potential to transform
the educational landscape and influence the role of all involved stakeholders. In recent …

Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned

D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai… - arXiv preprint arXiv …, 2022 - arxiv.org
We describe our early efforts to red team language models in order to simultaneously
discover, measure, and attempt to reduce their potentially harmful outputs. We make three …

The stable signature: Rooting watermarks in latent diffusion models

P Fernandez, G Couairon, H Jégou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generative image modeling enables a wide range of applications but raises ethical
concerns about responsible deployment. This paper introduces an active strategy combining …