Samrat Phatale
Google DeepMind
Verified email at google.com
Title
Cited by
Year
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
H Lee, S Phatale, H Mansoor, K Lu, T Mesnard, C Bishop, V Carbune, ...
arXiv preprint arXiv:2309.00267, 2023
Cited by: 247 | Year: 2023
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback, 2024
H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, K Lu, C Bishop, E Hall, ...
URL https://openreview.net/forum
Cited by: 5
Prose for a painting
P Kashyap, S Phatale, I Drori
arXiv preprint arXiv:1910.03634, 2019
Cited by: 3 | Year: 2019
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
H Sidahmed, S Phatale, A Hutcheson, Z Lin, Z Chen, Z Yu, J Jin, ...
arXiv preprint arXiv:2403.10704, 2024
Cited by: 2 | Year: 2024
Improve Mathematical Reasoning in Language Models by Automated Process Supervision
L Luo, Y Liu, R Liu, S Phatale, H Lara, Y Li, L Shu, Y Zhu, L Meng, J Sun, ...
arXiv preprint arXiv:2406.06592, 2024
Cited by: 1 | Year: 2024
SAFE: Software-defined authentication framework
AV Kamath, K Kataoka, N Vijayvergiya, GB Reddy, S Phatale
Proceedings of the 12th Asian Internet Engineering Conference, 57-63, 2016
Cited by: 1 | Year: 2016
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, KR Lu, C Bishop, ...
Forty-first International Conference on Machine Learning
Cited by: 1
Conversational Recommendation as Retrieval: A Simple, Strong Baseline
R Gupta, R Aksitov, S Phatale, S Chaudhary, H Lee, A Rastogi
arXiv preprint arXiv:2305.13725, 2023
Year: 2023