What I am finding now is that Python works more than fine on the M1, whether under Rosetta or as native code, which is consistent with your experience.
Also, to clear things up a bit: I parsed your "3.6 -> 3.9" as meaning that you had moved from Python 3.6 on Intel to Python 3.9 on Apple silicon, which is why I asked whether there might be speedups or other differences from moving to 3.9.
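
In case it helps when comparing setups, here is a rough sketch of how one might check which interpreter is actually running: the Python version, the architecture it was built for, and whether macOS reports the process as translated by Rosetta. The function name `describe_interpreter` is just made up for this example, and I am assuming the `sysctl.proc_translated` key is available, which should only be the case on Apple silicon Macs; anywhere else the translated flag comes back as `None` (unknown).

```python
import platform
import subprocess
import sys


def describe_interpreter():
    """Return (version, machine, translated) for the running interpreter.

    Hypothetical helper for illustration only.
    translated is True under Rosetta, False for a native process,
    and None when it cannot be determined.
    """
    version = "{}.{}".format(*sys.version_info[:2])
    # Typically 'arm64' for a native build, 'x86_64' for an Intel build.
    machine = platform.machine()

    translated = None
    if sys.platform == "darwin":
        try:
            # Assumption: this key exists only on Apple silicon Macs.
            out = subprocess.run(
                ["sysctl", "-n", "sysctl.proc_translated"],
                capture_output=True,
                text=True,
                check=True,
            ).stdout.strip()
            translated = out == "1"
        except (OSError, subprocess.CalledProcessError):
            translated = None

    return version, machine, translated


if __name__ == "__main__":
    print(describe_interpreter())
```

Running this under each interpreter you are comparing should make it clear whether a timing difference comes from the Python version, from Rosetta, or from both.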