Title | Authors | Venue | Cited by | Year |
Dataset Reset Policy Optimization for RLHF | JD Chang, W Shan, O Oertell, K Brantley, D Misra, JD Lee, W Sun | arXiv preprint arXiv:2404.08495, 2024 | 14 | 2024 |
REBEL: Reinforcement Learning via Regressing Relative Rewards | Z Gao, JD Chang, W Zhan, O Oertell, G Swamy, K Brantley, T Joachims, ... | arXiv preprint arXiv:2404.16767, 2024 | 8 | 2024 |
More benefits of being distributional: Second-order bounds for reinforcement learning | K Wang, O Oertell, A Agarwal, N Kallus, W Sun | arXiv preprint arXiv:2402.07198, 2024 | 4 | 2024 |
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation | O Oertell, JD Chang, Y Zhang, K Brantley, W Sun | arXiv preprint arXiv:2404.03673, 2024 | 1 | 2024 |
Overdetermined Eigenvector Approach to Passive Angles-Only Relative Orbit Determination | J Kulik, O Oertell, D Savransky | Journal of Guidance, Control, and Dynamics 47 (5), 986-994, 2024 | | 2024 |
A Kernel Method Approach to Orbital Debris Blast Point Determination | J Kulik, O Oertell, D Savransky | AIAA SCITECH 2024 Forum, 1864, 2024 | | 2024 |