| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Secrets of RLHF in Large Language Models Part I: PPO | R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ... | NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following (best …) | 69* | 2023 |
| Secrets of RLHF in Large Language Models Part II: Reward Modeling | B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ... | arXiv preprint arXiv:2401.06080 | 32* | 2024 |
| LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment | S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ... | arXiv preprint arXiv:2312.09979 | 21* | 2023 |
| Loose Lips Sink Ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback | W Shen, R Zheng, W Zhan, J Zhao, S Dou, T Gui, Q Zhang, X Huang | The 2023 Conference on Empirical Methods in Natural Language Processing | 11 | 2023 |
| Human-Instruction-Free LLM Self-Alignment with Limited Samples | H Guo, Y Yao, W Shen, J Wei, X Zhang, Z Wang, Y Liu | arXiv preprint arXiv:2401.06785 | 6 | 2024 |
| Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation | X Zhang, JF Ton, W Shen, H Wang, Y Liu | arXiv preprint arXiv:2403.05171 | 4 | 2024 |
| Improving Generalization of Alignment with Human Preferences through Group Invariant Learning | R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ... | Twelfth International Conference on Learning Representations (ICLR 2024) | 3 | 2023 |
| Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning | Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ... | arXiv preprint arXiv:2402.05808 | 2 | 2024 |
| StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback | S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan, ... | arXiv preprint arXiv:2402.01391 | 2 | 2024 |
| Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards | W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu | arXiv preprint arXiv:2403.07708 | 1 | 2024 |
| Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback | S Gao, Q Ge, W Shen, S Dou, J Ye, X Wang, R Zheng, Y Zou, Z Chen, ... | arXiv preprint arXiv:2401.11458 | 1 | 2024 |