If you want to limit the number of Ps, use a cpuset, which sched_getaffinity takes into account. cgroups only let you limit CPU usage, not lower the number of CPU cores the code can run on. This is “how many” versus “how much”, and GOMAXPROCS only relates to the “how many” part.
I may have misunderstood the rationale here, but I think the discussion about cgroup support is not about limiting the number of Ps.