Lifelong Learning for Robot Navigation
Improving Policy Adaptation for Robot Learning
This project investigates a method for improving the adaptation of a robot's control policy using a limited amount of teleoperation data. Specifically, we analyze the effect of our proposed approach under the assumption that the base model π0 is either a foundation model or a suboptimal model trained on the target robot's data.
We evaluate adaptation performance under a fixed human teleoperation time budget, ensuring that the improvement in policy quality is achieved within practical limits.
Key Claim
- Given that the base model π0 is either a foundation model or a suboptimal model trained on the target robot's data, and
- under the constraint that the total teleoperation time is fixed,
- our proposed method improves policy adaptation, bringing the resulting policy π∗ closer to the optimal policy for the target robot.
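To make the setup concrete, the sketch below simulates the simplest baseline for this claim: behavior-cloning fine-tuning of a suboptimal base policy π0 on a fixed budget of teleoperated (state, action) pairs. All names, the linear-policy form, and the synthetic data are illustrative assumptions for this sketch, not the project's actual method or robot data.

```python
import numpy as np

# Hypothetical sketch: fine-tune a suboptimal base policy pi_0 on a fixed
# teleoperation budget via behavior cloning (mean-squared-error regression
# onto expert actions). Linear policies and synthetic data are assumptions.

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 4, 2
TELEOP_BUDGET = 200  # fixed number of teleoperated (state, action) pairs

# "Optimal" target-robot policy (unknown in practice; used only to
# simulate expert teleoperation data here).
W_star = rng.normal(size=(ACTION_DIM, STATE_DIM))

# Base policy pi_0: a perturbed, hence suboptimal, copy of the optimal weights.
W0 = W_star + 0.5 * rng.normal(size=W_star.shape)

# Teleoperation data: states visited on the target robot, expert actions.
states = rng.normal(size=(TELEOP_BUDGET, STATE_DIM))
actions = states @ W_star.T

def bc_finetune(W, X, Y, lr=0.05, epochs=200):
    """Behavior cloning: gradient descent on MSE between pi(s) and expert a."""
    for _ in range(epochs):
        pred = X @ W.T
        grad = 2 * (pred - Y).T @ X / len(X)
        W = W - lr * grad
    return W

W_adapted = bc_finetune(W0.copy(), states, actions)

err_before = np.mean((states @ W0.T - actions) ** 2)
err_after = np.mean((states @ W_adapted.T - actions) ** 2)
print(f"BC error before: {err_before:.4f}, after: {err_after:.4f}")
```

Under this toy model, the fine-tuned policy's imitation error on the teleoperation data drops well below that of π0, which is the kind of improvement the claim asserts under a fixed teleoperation budget.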