The linked article basically goes through a bunch of scenarios examining clumsy ways to chase a slightly obscure requirement ("wake up exactly one thread per event using a single descriptor") just to land in the very last paragraph on the clearly correct solution ("just use EPOLLONESHOT, that's what it's for!").
I mean, there's literally a feature right there in the man page[1] that does exactly what the author wants. They just didn't want to learn about it and viewed their ignorance as a bug in the software.
Meh. This gets tiresome. As the article linked yesterday points out, epoll() is the "API that powers the internet", and has effectively solved the C10k problem for everyone that has it.
But yeah, you need to read the man page.
[1] No, seriously, it's right there in the man page, discussing exactly this scenario and how to avoid it.
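For anyone who hasn't used the flag, here is a minimal sketch of the one-shot behaviour using Python's `select.epoll` wrapper (Linux only). The pipe and the `oneshot_demo` name are just illustration, not anything from the article: the fd is reported once, stays silent until re-armed, and is reported again after `modify()` because the data is still unread.

```python
import os
import select

def oneshot_demo():
    """Return the fd and the results of three polls around one one-shot event."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    try:
        # EPOLLONESHOT: after one event is delivered the fd stays registered
        # but is disabled until it is re-armed with modify().
        ep.register(read_fd, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(write_fd, b"x")

        first = ep.poll(0.1)    # the event is delivered once
        second = ep.poll(0.1)   # nothing: fd is disarmed, data still unread

        # The thread that handled the event re-arms the fd when it is done.
        ep.modify(read_fd, select.EPOLLIN | select.EPOLLONESHOT)
        third = ep.poll(0.1)    # reported again, the byte is still pending
        return read_fd, first, second, third
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)
```

With several threads blocked in `poll()` on the same epoll instance, only one of them gets the event, and no other thread sees that fd until the handler re-arms it.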
The point is, do all of the other configurations for epoll have legitimate use cases justifying the complexity and need for those parameters? The kqueue design scales from single-threaded to multithreaded scenarios without issue and without all of these pitfalls, so why not just adopt that design? Why does the specific issue need to be a solution described in the man page at all?
> As the article linked yesterday points out, epoll() is the "API that powers the internet", and has effectively solved the C10k problem for everyone that has it.
Being able to solve a problem and doing it well are not the same. The latter arguably deserves criticism.
I think of stakeholder meetings and fighting business requirements?
I had an inotify project in progress for creating a better developer experience. Seeing this hit Hacker News tells me it's a fucking political landmine.
Sigh.
Don't worry, I'll have the strength to argue the proper technical solution that still satisfies business needs. That's the important thing!
In a multi-threaded environment someone needs to pay the cost of synchronization if the entire event queue is load-balanced. If you don't want the events to be trivially load-balanced and want the events from one fd to be delivered to a single worker, then it's way better to use SO_REUSEPORT and get it right from the get-go.
Expecting the kernel to solve a problem of user-space's making is asking for trouble - can be done, but the edge cases will sink your project!
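Roughly what the SO_REUSEPORT setup looks like, as a sketch in Python on Linux. The helper names `reuseport_listener` and `make_workers` are made up for illustration. Each worker gets its own listening socket on the same port (and would normally get its own epoll instance too), so the kernel spreads incoming connections across the sockets instead of across threads sharing one descriptor.

```python
import socket

def reuseport_listener(port=0, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Every socket sharing the port must set the option before binding.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock

def make_workers(count):
    """Return `count` listening sockets that all share one port."""
    first = reuseport_listener()            # port 0: kernel picks a free port
    port = first.getsockname()[1]
    return [first] + [reuseport_listener(port) for _ in range(count - 1)]
```

Each worker thread or process then accepts only from its own socket, so no cross-thread coordination is needed on the accept path.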
These are problems the kernel has introduced. I'm not sure you've read the article carefully enough.
Since you are supposed to use liburing, not the kernel interface directly, I guess somebody could add multithreading "support" to it.
Or at least add documentation/examples of the most common/performant options: https://github.com/shuveb/loti/issues/4
AFAIR Windows IOCP handles multithreading by:
- Handling locking at kernel level, the syscall is thread safe
- Making it LIFO, to keep work on the same threads and get decent cache behaviour.
It's as simple as it gets.
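A toy model of why LIFO wake-up helps, in Python. This is not the IOCP API, just a simulation under a loud assumption: each event is handled to completion before the next one arrives, so the worker goes straight back to waiting. The `dispatch` function is hypothetical.

```python
from collections import deque

def dispatch(num_events, num_workers, lifo=True):
    """Count how many events each waiting worker ends up handling."""
    idle = deque(range(num_workers))        # workers currently waiting
    handled = [0] * num_workers
    for _ in range(num_events):
        # LIFO wakes the most recent waiter, FIFO wakes the oldest one.
        worker = idle.pop() if lifo else idle.popleft()
        handled[worker] += 1
        idle.append(worker)                 # worker finishes and waits again
    return handled
```

Under that assumption, `dispatch(6, 4, lifo=True)` puts all six events on one worker, while `lifo=False` rotates through all four. Under light load the LIFO variant keeps one thread's cache warm and leaves the rest asleep.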
Wouldn't that be on purpose? Coordination requires more CPU cycles and so cuts into max performance.
> Why was it not ported to Linux?
We've been asking since epoll got introduced.
But here is some context: https://lwn.net/Articles/431297/
NIH
https://github.com/samsquire/epoll-server
I also have a 1:M:N (1 scheduler thread, M kernel threads and N lightweight green threads) multithreaded userspace scheduler which multiplexes lightweight threads onto kernel threads and can preempt hot loops with minimal overhead. I rely on the fact that you can change the looping variable from another thread if you use a structure. Preemptive interruption is very useful for the illusion of multitasking. That's why I call it a userspace scheduler.
https://GitHub.com/samsquire/preemptible-thread
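A minimal Python sketch of the "change the loop variable from another thread" idea. It is an illustration of the principle, not the code from the repo above, and all names are hypothetical. One assumption to flag: this version overwrites the loop bound rather than the index, so the write cannot be lost to the worker's own `index += 1`.

```python
import threading
import time

class LoopState:
    """Loop counters kept in a shared object so another thread can reach them."""
    def __init__(self, limit):
        self.index = 0
        self.limit = limit

def hot_loop(state):
    # The loop re-reads state.limit on every iteration, so a change made
    # by another thread ends the loop at the next check.
    while state.index < state.limit:
        state.index += 1

def preempt(state):
    # Assumption of this sketch: overwrite the bound, not the index.
    state.limit = 0

def run_and_preempt(delay=0.05):
    """Start a hot loop, preempt it after `delay` seconds, report the result."""
    state = LoopState(limit=10**12)         # would take far too long to finish
    worker = threading.Thread(target=hot_loop, args=(state,))
    worker.start()
    time.sleep(delay)                       # the "scheduler" lets it run briefly
    preempt(state)
    worker.join(timeout=5)
    return worker.is_alive(), state.index
```

The saved `index` is what a scheduler would need to resume the loop later by restoring the original limit.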
I think the epoll-server, which is kind of similar to what libuv does, and the userspace scheduler could be combined into an application server.
I also wrote a multithreaded actor implementation in Java. Threads can communicate with each other at between 60 million and 100 million messages a second. The epoll-server uses a multiconsumer, multiproducer lockless RingBuffer.
https://GitHub.com/samsquire/multicersion-concurrency-contro...
I think the core fundamentals of building a performant application server should be done once and reused for each application.
I also want to split the threading used by recv and send on a socket, so that each socket gets one dedicated receive kernel thread and one dedicated send kernel thread alongside the scheduler (1 scheduler thread, 1 assigned recv thread, 1 assigned send thread per socket). So you can send while you receive and receive while you send. True multiplexing!
We can decouple CPU and IO completely with threading.
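A rough Python sketch of the split recv/send idea, with one dedicated thread per direction on a single socket. The queue-based hand-off and the function names are assumptions for illustration, not code from any of the repos above.

```python
import queue
import socket
import threading

def recv_loop(sock, inbox):
    # Dedicated receive thread: blocks in recv() and hands data to the app.
    while True:
        data = sock.recv(4096)
        if not data:                # peer closed its end
            inbox.put(None)
            return
        inbox.put(data)

def send_loop(sock, outbox):
    # Dedicated send thread: blocks on the queue, never on recv().
    while True:
        data = outbox.get()
        if data is None:            # sentinel: stop sending
            return
        sock.sendall(data)

def start_split_io(sock):
    """Start one recv thread and one send thread for `sock`."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=recv_loop, args=(sock, inbox), daemon=True),
        threading.Thread(target=send_loop, args=(sock, outbox), daemon=True),
    ]
    for t in threads:
        t.start()
    return inbox, outbox, threads
```

Application code reads from `inbox` and writes to `outbox`, so a slow send never delays a receive on the same socket, and the other way round.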