The issue is that the Go runtime has its own ideas about how memory and the stack should be laid out, and these ideas are not compatible with __stdcall or __cdecl, so when calling C code the runtime has to do some cleanup around the call. That used to involve a thread switch; now it involves a full register swap.
What Go does not allow is putting Go code into a dynamic library and then calling it from another Go application, i.e. native Go plugins for Go apps.