Illustrating reinforcement learning from human feedback (RLHF) N Lambert, L Castricato, L von Werra, A Havrilla Hugging Face Blog 9, 2022 | 102 | 2022 |
ARB: Advanced reasoning benchmark for large language models T Sawada, D Paleka, A Havrilla, P Tadepalli, P Vidas, A Kranias, JJ Nay, ... arXiv preprint arXiv:2307.13692, 2023 | 47 | 2023 |
trlX: A framework for large scale reinforcement learning from human feedback A Havrilla, M Zhuravinskyi, D Phung, A Tiwari, J Tow, S Biderman, ... Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 26 | 2023 |
Teaching large language models to reason with reinforcement learning A Havrilla, Y Du, SC Raparthy, C Nalmpantis, J Dwivedi-Yu, ... arXiv preprint arXiv:2403.04642, 2024 | 25 | 2024 |
Sharp Khinchin-type inequalities for symmetric discrete uniform random variables A Havrilla, T Tkocz Israel Journal of Mathematics 246 (1), 281-297, 2021 | 15 | 2021 |
GLoRe: When, where, and how to improve LLM reasoning via global and local refinements A Havrilla, S Raparthy, C Nalmpantis, J Dwivedi-Yu, M Zhuravinskyi, ... arXiv preprint arXiv:2402.10963, 2024 | 14 | 2024 |
Khinchin-type inequalities via Hadamard’s factorisation A Havrilla, P Nayar, T Tkocz International Mathematics Research Notices 2023 (3), 2429-2445, 2023 | 7 | 2023 |
trlX: A scalable framework for RLHF L Castricato, A Havrilla, S Matiana, DV Phung, A Tiwari, J Tow, ... Zenodo. DOI 10, 2023 | 7 | 2023 |
Robust preference learning for storytelling via contrastive reinforcement learning L Castricato, A Havrilla, S Matiana, M Pieler, A Ye, I Yang, S Frazier, ... arXiv preprint arXiv:2210.07792, 2022 | 7 | 2022 |
trlX: A scalable framework for RLHF, June 2023 L Castricato, A Havrilla, S Matiana, DV Phung, A Tiwari, J Tow, ... URL https://github.com/CarperAI/trlx, 0 | 7 | |
Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought A Havrilla, M Iyer arXiv preprint arXiv:2402.04004, 2024 | 6 | 2024 |
On deep generative models for approximation and estimation of distributions on manifolds B Dahal, A Havrilla, M Chen, T Zhao, W Liao Advances in Neural Information Processing Systems 35, 10615-10628, 2022 | 6 | 2022 |
Deep nonparametric estimation of intrinsic data structures by chart autoencoders: Generalization error and robustness H Liu, A Havrilla, R Lai, W Liao Applied and Computational Harmonic Analysis 68, 101602, 2024 | 4 | 2024 |
Deep nonparametric estimation of intrinsic data structures by chart autoencoders: Generalization error and robustness H Liu, A Havrilla, R Lai, W Liao arXiv preprint arXiv:2303.09863, 2023 | 4 | 2023 |
A study on improving reasoning in language models Y Du, A Havrilla, S Sukhbaatar, P Abbeel, R Raileanu I Can't Believe It's Not Better Workshop: Failure Modes in the Age of …, 2023 | 2 | 2023 |
Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data A Havrilla, W Liao arXiv preprint arXiv:2411.06646, 2024 | | 2024 |
Dual Fourier Unet: scale-robust diffusion model for zero-shot super-resolution image generation A Havrilla, K Rojas, W Liao, M Tao NeurIPS 2023 Workshop on Diffusion Models, 2023 | | 2023 |
DFU: scale-robust diffusion model for zero-shot super-resolution image generation A Havrilla, K Rojas, W Liao, M Tao arXiv preprint arXiv:2401.06144, 2023 | | 2023 |
On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds B Dahal, A Havrilla, M Chen, T Zhao, W Liao arXiv preprint arXiv:2302.13183, 2023 | | 2023 |
synthetic-instruct-gptj-pairwise A Havrilla Hugging Face, 2023 | | 2023 |