All versions - Academic Resource Search

Principle-driven self-alignment of language models from scratch with minimal human supervision

Z Sun, Y Shen, Q Zhou, H Zhang… - Advances in …, 2024 - proceedings.neurips.cc

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

Cited by 200 · Related articles

Principle-driven self-alignment of language models from scratch with minimal human supervision

Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - Proceedings of the 37th …, 2023 - dl.acm.org

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

[CITATION][C] Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun - cs.cmu.edu

publications | Zhiqing Sun
For a more up-to-date list, please also check the google scholar page. 2023 1. NeurIPS …

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun, Y Shen, Q Zhou, H Zhang… - Advances in …, 2023 - proceedings.neurips.cc

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - … -seventh Conference on … - openreview.net

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - Annual Conference …, 2023 - research.ibm.com

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Z Sun, Y Shen, Q Zhou, H Zhang, Z Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning
(SFT) with human annotations and reinforcement learning from human feedback (RLHF) to …
