There are many other reasons for Erlang having worse overall performance.
Go even shipped with segmented stacks! Separating goroutine heaps as well would've been cheap by comparison. It would've also made it easier to achieve good garbage collector throughput, which Go still struggles with to this day.
Go could've also had the `go` keyword work differently, by returning some kind of handle that must either be waited on (with the wait potentially panicking if the goroutine did) or explicitly ignored. It would've made it impossible to use WaitGroups incorrectly, without any additional runtime overhead.
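To make the handle idea concrete, here's a rough library-level sketch in today's Go. `Spawn`, `Handle` and `Wait` are made-up names for illustration, not anything in the standard library. Note that a library can only approximate the idea: Go lets you silently drop a return value, so the "must be waited on or explicitly ignored" part would need compiler support, and this version pays for a channel and an allocation that a built-in design might avoid.

```go
package main

import "fmt"

// Handle is a hypothetical join handle for a single goroutine.
type Handle struct {
	done     chan struct{} // closed when the goroutine has finished
	panicVal any           // value recovered from the goroutine, if it panicked
}

// Spawn (hypothetical) runs f in a new goroutine and returns its handle.
func Spawn(f func()) *Handle {
	h := &Handle{done: make(chan struct{})}
	go func() {
		// Deferred calls run last-in, first-out: the panic value is
		// recorded first, then done is closed, so Wait always sees it.
		defer close(h.done)
		defer func() { h.panicVal = recover() }()
		f()
	}()
	return h
}

// Wait blocks until the goroutine finishes. If the goroutine panicked,
// Wait re-raises that panic in the caller.
func (h *Handle) Wait() {
	<-h.done
	if h.panicVal != nil {
		panic(h.panicVal)
	}
}

func main() {
	h := Spawn(func() { fmt.Println("working") })
	h.Wait() // join explicitly; no WaitGroup counter to get wrong

	_ = Spawn(func() {}) // "explicitly ignored", by convention only
}
```

There's no `Add`/`Done` bookkeeping to forget here, which is the WaitGroup failure mode the handle is meant to rule out.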
Don't get me wrong, Go is still a useful language. It's just frustrating that it failed to be a great language for no good reason.