Also, isn't the problem that JSON decoding (or whatever computation) simply block the thread and the other green threads cannot proceed at all, because there are simply no safepoints (yield points) inside these low level functions?
And in all these cases shouldn't the application estimate work (eg. in case of JSON if the string is longer than 100K), and if it's too big just put it on a dedicated heavy compute N:N thread pool?
For Python it's best practice anyway because the GIL, no?