> The challenge here is not IK.
No, state-based RL research like this is essentially an IK problem. Given a goal position in the world frame, you need to find the motor configuration that moves your end effector (EEF) to that goal position.
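To make that problem statement concrete, here is a minimal sketch of numerical IK using a damped least-squares Jacobian update. It assumes a toy 3-joint planar arm, which is far simpler than any real robot; the function names, link lengths, and goal are all hypothetical and chosen only for illustration.

```python
import numpy as np

def forward_kinematics(q, link_lengths):
    """EEF position (x, y) of a planar arm with revolute joints (toy model)."""
    angles = np.cumsum(q)  # absolute angle of each link
    return np.array([
        np.sum(link_lengths * np.cos(angles)),
        np.sum(link_lengths * np.sin(angles)),
    ])

def jacobian(q, link_lengths):
    """2 x n Jacobian of the EEF position with respect to the joint angles."""
    angles = np.cumsum(q)
    J = np.zeros((2, len(q)))
    for i in range(len(q)):
        # Joint i rotates every link from i onward.
        J[0, i] = -np.sum(link_lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(link_lengths[i:] * np.cos(angles[i:]))
    return J

def solve_ik(goal, q0, link_lengths, damping=0.1, tol=1e-6, max_iters=500):
    """Iterate joint angles until the EEF is within tol of the goal."""
    q = np.array(q0, dtype=float)
    for _ in range(max_iters):
        error = goal - forward_kinematics(q, link_lengths)
        dist = np.linalg.norm(error)
        if dist < tol:
            break
        # Limit the commanded Cartesian step so the linearization stays valid.
        step = error * min(1.0, 0.5 / dist)
        J = jacobian(q, link_lengths)
        # Damped least squares: dq = J^T (J J^T + lambda^2 I)^-1 step
        dq = J.T @ np.linalg.solve(J @ J.T + damping**2 * np.eye(2), step)
        q += dq
    return q

# Hypothetical example: three unit-length links, goal well inside the workspace.
links = np.array([1.0, 1.0, 1.0])
goal = np.array([1.5, 1.0])
q_solution = solve_ik(goal, q0=[0.3, 0.3, 0.3], link_lengths=links)
residual = np.linalg.norm(forward_kinematics(q_solution, links) - goal)
```

This only solves for position on a redundant planar chain, with no joint limits, collisions, or orientation targets, so it says nothing about how well such methods hold up on a high-DOF hand.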
> IK to my knowledge is well known in every setting I am aware of.
Really? When I was working in this field, I never actually saw anyone use numerical or analytical IK methods on real robots.
Granted, I was not a robotics engineer (I was a deep learning engineer on the team), but my impression at the time was that no practical IK solution was available for a robot with 8 DOF, let alone a 20-DOF robot like the Shadow Hand.