Actually that's what I had been saying for years, until users complained that processes had different views of server states in fast-changing environments, that it was not possible to perform dynamic updates at runtime because all processes had to be synchronized, that stick-tables lived in only one process, that they had to collect their stats per process and sum them externally, that the LB algorithms were uneven with many processes because roundrobin starts from the same server in every process and leastconn doesn't know the other processes' load, that maxconn was unmanageable because it couldn't take connections from other processes into account, that rate counters weren't shared, that maps and ACLs had to be loaded and managed in every process, eating a lot of memory or not being updatable, that memory usage was going through the roof because unused memory from one process was not usable by another one, etc. The list was so long that it became clear at some point that we had to move to threads to resolve all of this at once.

And we're doing really well. I still hate sharing data, so we're extremely careful to share very little. Look at the pools, the thread_info struct and the thread_group struct to see how much is local to the thread or to the group. Even the idle server connections are thread-local but can be stolen if needed (i.e. almost never any sharing unless really needed). So we've kept the old practices of processes while using the facilities offered by threads, and that's what brought us this great scalability.
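To make the roundrobin point concrete, here is a toy simulation. It is only a sketch and not HAProxy code: the function names, the server names and the request counts are all made up for illustration, and the "shared" case just uses a single cursor instead of real threads. It compares isolated workers that each start their rotation at the first server with workers that advance one common cursor.

```python
from collections import Counter
from itertools import cycle


def per_process_roundrobin(servers, n_processes, requests_per_process):
    """Hypothetical model: every process has a private cursor starting at servers[0]."""
    load = Counter({s: 0 for s in servers})
    for _ in range(n_processes):
        cursor = cycle(servers)  # private state, always restarts at the first server
        for _ in range(requests_per_process):
            load[next(cursor)] += 1
    return dict(load)


def shared_roundrobin(servers, n_workers, requests_per_worker):
    """Hypothetical model: all workers advance one common cursor."""
    load = Counter({s: 0 for s in servers})
    cursor = cycle(servers)  # single cursor seen by every worker
    for _ in range(n_workers * requests_per_worker):
        load[next(cursor)] += 1
    return dict(load)


SERVERS = ["s1", "s2", "s3", "s4"]  # made-up server names

# 8 workers handling 5 requests each, i.e. 40 requests over 4 servers.
isolated = per_process_roundrobin(SERVERS, n_processes=8, requests_per_process=5)
shared = shared_roundrobin(SERVERS, n_workers=8, requests_per_worker=5)

print("isolated:", isolated)  # s1 gets 16 requests, the others 8 each
print("shared:  ", shared)    # every server gets 10 requests
```

With isolated cursors, the leftover request of each worker always lands on the first server, so the skew grows with the number of processes. A single shared cursor spreads the same 40 requests evenly.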