I’m actually very interested in imitation learning and am building a system to do exactly this. It can be done much more cheaply now, probably for about $7k total. I’ve replicated ALOHA’s ACT results using a cheap $250 tabletop arm but am very interested in doing something bigger, more complex, and useful. RoBart originally started as a way to build a cheap mobile base for a robot that could be operated and trained at home. It turns out that the hoverboard-based setup isn’t quite optimal, so for fun I wanted to see how far I could push GPT-4 and Claude to autonomously operate it.
Did you find it useful? Did you use any other ML at the same time?
I think about the different aims of YOLO for object recognition vs. LLMs for conversation vs. RL for planning, and how one might integrate them for a better overall system.
Re: LLM for supervising an imitation learning policy, check out the recent paper “Robot Utility Models” :) They even bought a TLD for it: https://robotutilitymodels.com/