The overhead isn't the setup. It's the memory footprint of a process. A minimal process (just sleeping) will use ~128KB. One process per connection means a server will need well over 1GB memory per 8000 connected users. A python process just sleeping will be almost 3MB, which means you get 333 connections per GB.
Compare this with an efficient green threads implementation. In Erlang you can have 100k sleeping "processes" in just 200MB of memory.