I trained a robot to play the game "puckworld" using RL (Q-learning with experience replay, implemented in PyTorch). I had done some RL before, but I hadn't appreciated how hard it is to keep something physical working over the timescales that RL training needs.
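For anyone unfamiliar with the technique, here is a minimal sketch of experience replay and the Q-learning target. It is not the code from this project: it uses plain Python instead of PyTorch to stay self-contained, and the names `ReplayBuffer` and `q_target`, the buffer capacity, and the discount `gamma` are all illustrative assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper, not the project's code)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_target(reward, next_q_values, done, gamma=0.99):
    """Q-learning target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

On a real robot each transition is expensive to collect, so reusing stored transitions many times, as a buffer like this allows, is a large part of why replay is attractive in that setting.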
Please let me know if you have any questions or feedback!