Learning to generate better than your LLM

JD Chang, K Brantley, R Ramamurthy, D Misra… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large
Language Models (LLMs) for conditional text generation. In particular, recent LLMs such as …

Learning natural locomotion behaviors for humanoid robots using human bias

C Yang, K Yuan, S Heng, T Komura… - IEEE Robotics and …, 2020 - ieeexplore.ieee.org
This letter presents a new learning framework that leverages the knowledge from imitation
learning, deep reinforcement learning, and control theories to achieve human-style …

Sampling-based exploration for reinforcement learning of dexterous manipulation

G Khandate, S Shang, ET Chang, TL Saidi… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we present a novel method for achieving dexterous manipulation of complex
objects, while simultaneously securing the object without the use of passive support …

ACDER: Augmented curiosity-driven experience replay

B Li, T Lu, J Li, N Lu, Y Cai… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Exploration in environments with sparse feedback remains a challenging research problem
in reinforcement learning (RL). When the RL agent explores the environment randomly, it …

Targeted search control in AlphaZero for effective policy improvement

A Trudeau, M Bowling - arXiv preprint arXiv:2302.12359, 2023 - arxiv.org
AlphaZero is a self-play reinforcement learning algorithm that achieves superhuman play in
chess, shogi, and Go via policy iteration. To be an effective policy improvement operator …

R×R: Rapid eXploration for Reinforcement Learning via Sampling-based Reset Distributions and Imitation Pre-training

G Khandate, TL Saidi, S Shang, ET Chang… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a method for enabling Reinforcement Learning of motor control policies for
complex skills such as dexterous manipulation. We posit that a key difficulty for training such …

Where2Start: Leveraging initial States for Robust and Sample-Efficient Reinforcement Learning

P Parsa, RZ Moayedi, M Bornosi, MM Bejani - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning algorithms that focus on how to compute the gradient and
choose next actions have effectively improved the performance of agents. However, these …

Towards Specialized Reinforcement Learning From Diverse Data

JD Chang - 2024 - search.proquest.com
Reinforcement learning (RL) fundamentally focuses on teaching agents how to make
decisions by interacting with an environment. Unlike supervised learning approaches that …

Real-Time Hybrid Visual Servoing of a Redundant Manipulator via Deep Reinforcement Learning

AJ Williams - 2023 - core.ac.uk
Fixtureless assembly may be necessary in some manufacturing tasks and environments due
to various constraints but poses challenges for automation due to nondeterministic …

Learning to Generate Better than your Large Language Models

JD Chang, K Brantley, R Ramamurthy, D Misra, W Sun - openreview.net
Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large
Language Models (LLMs) for text generation. In particular, recent LLMs such as ChatGPT …