Python was always more about breadth than depth. (CPython is full of known inefficiencies, but it has been with us since 1989, and the core dev who has worked most on performance, Victor Stinner, basically thinks the best way forward is to introduce subinterpreters:
https://github.com/vstinner/talks/blob/master/2019-EuroPytho... )
Oh, that PDF is interesting: Python 3.8 adds shared memory for multiprocessing (the multiprocessing.shared_memory module), so processes can share a buffer directly instead of pickling the data through pipe objects.
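For anyone who hasn't tried it, here is a minimal sketch of the 3.8 shared memory API. The helper names write_block and read_block are made up for illustration, and to keep it self-contained the "other process" is just a second handle attached by name in the same process; a real worker would attach the same way using the block's name.

```python
from multiprocessing import shared_memory


def write_block(payload: bytes) -> shared_memory.SharedMemory:
    """Create a named shared memory block and copy payload into it.

    Hypothetical helper: the caller owns the returned handle and must
    close() and unlink() it when done.
    """
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def read_block(name: str, size: int) -> bytes:
    """Attach to an existing block by name and copy out the first size bytes.

    The size is passed explicitly because the OS may round the block up,
    so shm.size can be larger than what was requested.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()  # detach only; the creator still owns the block


if __name__ == "__main__":
    block = write_block(b"hello from shared memory")
    try:
        print(read_block(block.name, 24))
    finally:
        block.close()
        block.unlink()  # free the block once nobody needs it
```

The data is written once and read in place; nothing is pickled or sent through a pipe. Only the block name and size need to be passed to the other side.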
Furthermore, extension modules and interpreter internals have always been able to release the GIL and do their own thing (for example, on a threadpool, or using async/nonblocking I/O). But I have no idea about Gevent. I never liked it. (Just like Twisted/Tornado, it was too much magic for too little benefit.)
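A rough sketch of the GIL point, using time.sleep as a stand-in for any C-level call that releases the GIL while it blocks. The function names and the 0.2 second figure are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(seconds: float) -> float:
    # time.sleep blocks in C and releases the GIL while waiting,
    # so other threads can run in the meantime.
    time.sleep(seconds)
    return seconds


def run_in_threads(n_tasks: int = 4, seconds: float = 0.2):
    """Run n_tasks blocking calls on a threadpool; return (results, elapsed)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(blocking_call, [seconds] * n_tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads()
    print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

If the GIL were held during the sleep, four 0.2 second calls would take about 0.8 seconds in total; because it is released, the waits overlap and the whole batch should finish in roughly the time of one call. Pure Python bytecode gets no such overlap, which is why this only helps for work done in C or for I/O.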