I offered to beat whatever you've done by tweaking the Py3 stdlib, not by writing a plain C implementation.
If for some reason you doubt that this old Python thing was real-world, let me disappoint you. It was built because nothing else could handle those 100K rps back then. And it did the job for five years, until the whole stack was ditched.