Zhanhui Zhou
Shanghai AI Lab
Verified email at pjlab.org.cn
Title
Cited by
Year
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Z Zhou, J Liu, C Yang, J Shao, Y Liu, X Yue, W Ouyang, Y Qiao
arXiv preprint arXiv:2310.03708, 2023
21*
2023
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Z Dong, Z Zhou, C Yang, J Shao, Y Qiao
NAACL 2024, 2024
15
2024
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
G Bai, J Liu, X Bu, Y He, J Liu, Z Zhou, Z Lin, W Su, T Ge, B Zheng, ...
ACL 2024, 2024
6
2024
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models
Y Wu, J Liu, X Bu, J Liu, Z Zhou, Y Zhang, C Zhang, Z Bai, H Chen, T Ge, ...
Findings of ACL 2024, 2024
3
2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
Z Zhou, J Liu, Z Dong, J Liu, C Yang, W Ouyang, Y Qiao
ACL 2024, 2024
2
2024
INTENT: Interactive Tensor Transformation Synthesis
Z Zhou, MT Tang, Q Pan, S Tan, X Wang, T Zhang
UIST 2022, 2022
2
2022
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
J Liu, Z Zhou, J Liu, X Bu, C Yang, HS Zhong, W Ouyang
arXiv preprint arXiv:2406.11817, 2024
2024
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Z Zhou, Z Liu, J Liu, Z Dong, C Yang, Y Qiao
arXiv preprint arXiv:2405.19262, 2024
2024