The fix is to inspect cgroupfs to see how many CPU shares you can actually use and then set GOMAXPROCS to match. I think other runtimes like Java and .NET do this automatically.
It is the same with GOMEMLIMIT; I don’t see why the runtime doesn’t inspect cgroupfs and set GOMEMLIMIT to 90% of the cgroup memory limit.
https://cs.opensource.google/go/go/+/master:src/runtime/os_l...
> On Linux, go uses sched_getaffinity …
Since cgroups are a Linux-only feature, OP must be running Linux. I wonder if his experience pre-dates Go’s usage of sched_getaffinity.
edit: I realised that he references cgroups, so he must be on Linux.
This, I think, is a cgroups v1 vs. v2 question. Everyone should be on cgroups v2 by now, but it would still feel weird for the Go runtime to commit to one. To me, anyway.
I would think that cgroupfs is considered an API to userspace and therefore shouldn’t break in the future? Isn’t that why cgroups v2 was created as a new hierarchy rather than changing v1?
I have written code that handles both cgroups v1 and v2; it isn’t terribly hard. Go could also support setting these parameters automatically only when running under cgroups v2, if that made things easier.
For a language that prides itself on sane defaults, I think they have missed the mark here. I could probably add support to the Go runtime in a few hundred lines of code, and it would likely save millions of dollars and megawatts of energy, because today the runtime spawns 50 worker threads to run a program that is constrained to 1 core.
https://kubernetes.io/docs/tasks/administer-cluster/cpu-mana...
Gold.
Here's `isAsyncSafePoint`: https://github.com/golang/go/blob/d36353499f673c89a267a489be...
edit: The comments at the top of that file say:
// 3. Asynchronous safe-points occur at any instruction in user code
// where the goroutine can be safely paused and a conservative
// stack and register scan can find stack roots. The runtime can
// stop a goroutine at an async safe-point using a signal.
GopherCon 2020: Austin Clements - Pardon the Interruption: Loop Preemption in Go 1.14
I don’t have the link handy, but Twitch hit this kind of issue with base64 decoding in some of their servers. The GC would try to stop the world, but there would always be one or a few goroutines decoding base64 in a tight loop whenever STW was attempted, delaying it again and again.
Asynchronous preemption is a solution to this kind of issue. Load is not the issue here, as long as you go through the runtime often enough.
Then there's the async case for tight loops that I remember reading about back in 2020 (it uses Unix signals), but I don't yet fully grok the specifics.