All of which is to say, I don't think it matters. Use the right tool for the job - if you care about generic overhead, golang is not the right thing to use in the first place.
I think what's more important for the systems programmer is (1) the ability to inspect the low-level behavior of functions, for example through their disassembly; (2) reasonable confidence in how code will compile; and (3) some dials and levers to control aspects of compiled code and memory usage. All of these things can be, and are, present not only in some garbage-collected languages, but also in garbage-collected languages with a dynamic type system!
Yes, there are environments so spartan and so precision-oriented that even a language's built-in allocator cannot be used (e.g., malloc), in which case using a GC'd language is going to be an unwinnable fight for control. But if you only need to do precision management of a memory that isn't pervasive in all of your allocation patterns, then using a language like C feels like throwing the baby out with the bath water. It's very rarely "all or nothing" in a modern, garbage-collected language.
The tricky thing is GC works most of the time, but if you are working at scale you really can't predict user behavior, and so all of those GC-tuning parameters that were set six months ago no longer work properly. A good portion of production outages are likely related to cascading failures due to too long GC pauses, and a good portion of developer time is spent testing and tuning GC parameters. It is easier to remove and/or just not allow GC languages at these levels in the first place.
On the other hand IMO GC-languages at the frontend level are OK since you'd just need to scale horizontally.
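For readers who haven't touched the "GC-tuning parameters" mentioned above: in Go the main one is the GOGC target percentage, which can also be set from code. A minimal sketch, assuming a helper named `tuneGC` (hypothetical) and an arbitrary value of 50; it only shows where the dial is, not what a good setting would be:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// tuneGC sets the GC target percentage (the programmatic form of GOGC)
// and returns the previous setting so it can be restored later.
func tuneGC(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	prev := tuneGC(50) // 50 is an arbitrary example value, not a recommendation
	defer tuneGC(prev)

	// MemStats exposes the numbers people usually watch while tuning.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("previous GOGC=%d, completed GCs=%d, total pause=%dns\n",
		prev, m.NumGC, m.PauseTotalNs)
}
```

The point of the comment still stands: a value picked against last year's traffic is just a number, and nothing in the runtime tells you when it stops fitting.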
I've seen this sentiment a lot, and I never see specifics. "GC is bad for systems language" is a tribalist, firmly held belief that is unsupported by hard data.
On the other hand, huge, memory-intensive and garbage-collected systems have been deployed in vast numbers by thousands of different companies for decades, long before Go, within acceptable latency bounds. And shoddy, poorly performing systems have been written in C/C++ and failed spectacularly for all kinds of reasons.
I would argue it's not (very) hard data that we need in this case. My opinion is that the resource usage of infrastructure code should be as low as possible so that most resources are available to run applications.
The economic viability of application development is very much determined by developer productivity. Many applications aren't even run that often if you think of in-house business software for instance. So application development is where we have to spend our resource budget.
Systems/infrastructure code on the other hand is subject to very different economics. It runs all the time. The ratio of development time to runtime is incredibly small. We should optimise the heck out of infrastructure code to drive down resource usage whenever possible.
GC has significant memory and CPU overhead. I don't want to spend double digit resource percentages on GC for software that could be written differently without being uneconomical.
Programs in un-managed languages can be slow too, and excessive use of malloc() is a frequent culprit. But the difference is that if I have a piece of code that is slow because it is calling malloc() too much, I can often (or at least some of the time) just remove the malloc() calls from that function. I don't have to boil the ocean and significantly reduce the rate at which my entire program allocates memory.
I think another factor that gets ignored is how much you care about tail latency. I think GC is usually fine for servers and other situations where you are targeting a good P99 or P99.9 latency number. And indeed, this is where JVM, Go, node.js, and other GCed runtimes dominate.
But there are situations, like games, where a bad P99.9 frame time means dropping a frame roughly every 17 seconds (at 60fps, one frame in a thousand). If you've got one frame skip every 10 seconds because of garbage collection pauses and you want to get to one frame skip every minute, that is _not_ an easy problem to fix.
(Yes, I am aware that many commercial game engines have garbage collectors).
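The frame-time arithmetic above is easy to check: P99.9 means one frame in a thousand misses the budget, and at 60fps a thousand frames take a bit under 17 seconds, which is in the ballpark of the figure quoted. A small sketch (the function name is mine, purely for illustration):

```go
package main

import "fmt"

// secondsBetweenMisses returns the average time between frames that fall
// outside the given latency percentile (e.g. 99.9) at a steady frame rate.
func secondsBetweenMisses(percentile, fps float64) float64 {
	missFraction := 1 - percentile/100 // share of frames slower than the percentile
	return 1 / (missFraction * fps)
}

func main() {
	fmt.Printf("P99.9 at 60fps: one slow frame every %.1f s\n", secondsBetweenMisses(99.9, 60))
	fmt.Printf("P99 at 60fps:   one slow frame every %.1f s\n", secondsBetweenMisses(99, 60))
}
```

Going from one skip every 10 seconds to one per minute means moving from about P99.83 to about P99.97, which is why it is so much harder than it sounds.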
Are you writing an application where Go's garbage collector will perform poorly relative to rolling your own memory management?
Maybe, those applications exist, but maybe not, it shouldn't be presumed.
I'm more open to the argument from definition, which might be what you mean by 'inherently': there isn't an RFC we can point to for interpreting what a systems language is, and it could be useful to have a consensus that manual memory management is a necessary property of anything we call a systems language.
No such consensus exists, and arguing that Go is a poor choice for $thing isn't a great way to establish whether it is or is not a systems language.
Go users certainly seem to think so, and it's not a molehill I wish to die upon.
I think Go makes a better trade-off than Java, but I struggle to come up with decent examples of projects one could write in Go and not in Java. Most of the “systems” problems that Java is unsuitable for, also apply to Go.
It can be a right old bugger - I've been tweaking gron's memory usage down as a side project (e.g. 1M lines of my sample file into original gron uses 6G maxrss/33G peak, new tweaked uses 83M maxrss/80M peak) and there's a couple of pathological cases where the code seems to spend more time GCing than parsing, even with `runtime.GC()` forced every N lines. In C, I'd know where my memory was and what it was doing but even with things like pprof, I'm mostly in the dark with Go.
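For anyone wondering what "`runtime.GC()` forced every N lines" looks like in practice, here is a rough sketch. This is not gron's code; `processLines` and its parameters are made up for illustration, and the per-line work is a placeholder:

```go
package main

import (
	"bufio"
	"fmt"
	"runtime"
	"strings"
)

// processLines scans the input line by line, forces a collection every
// gcEvery lines, and returns the line count and number of forced GCs.
func processLines(input string, gcEvery int) (lines, forced int) {
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		_ = sc.Text() // stand-in for real per-line parsing work
		lines++
		if lines%gcEvery == 0 {
			runtime.GC() // blocks the caller until the collection completes
			forced++
		}
	}
	return lines, forced
}

func main() {
	lines, forced := processLines(strings.Repeat("{\"a\":1}\n", 1000), 250)
	fmt.Printf("lines=%d forced GCs=%d\n", lines, forced)
}
```

Forcing collections like this caps peak memory at the cost of CPU time, which is consistent with the "more time GCing than parsing" cases described above.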
First you'd have to establish what "systems" means. That, you'll find, is all over the place. Some people see systems as low-level components like the kernel, others as the userland that allows the user to operate the computer (the set of Unix utilities, for example), and you're suggesting databases and things like that.
The middle one, the small command line utilities that allow you to perform focused functions, is a reasonably decent fit for Go. This is one of the places it has really found a niche.
What's certain is that the Go team comes from a very different world to a lot of us. The definitions they use, across the board, are not congruent with what you'll often find elsewhere. Systems is one example that has drawn attention, but it doesn't end there. For example, what Go calls casting is the opposite of what some other languages call casting.
People sometimes use it for type conversions but that's in line with usage elsewhere, no?
It does have one niche: it includes most, if not all, of what you need to run a network-based service (or microservice), e.g. HTTP, even HTTPS, DNS... are all baked in. You no longer need to install OpenSSL on Windows, for example; in golang a single binary will include all of that (with CGO disabled, too).
I do systems programming in C and C++, maybe Rust later when I have time to grasp it; there is no way to use Go there.
For network-related applications, Go is my favorite so far. Nothing beats it: one binary has it all, and it couldn't be easier to upgrade in the field.
Go does contain many of the things of interest to systems programmers such as pointers and the ability to specify memory layout of data structures. You can make your own secondary allocators. In short it gives you far more fine grained control over how memory is used than something like Java or Python.
https://erik-engheim.medium.com/is-go-a-systems-programming-...
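To make "your own secondary allocators" concrete, here is a minimal sketch of a slab with a free list. All names (`Particle`, `Pool`, `Alloc`, `At`, `Free`) are hypothetical and this is only one of several ways to do it; the idea is that the slab is allocated once and slots are recycled without going back to the runtime allocator:

```go
package main

import "fmt"

// Particle is a pointer-free struct, so a []Particle is one contiguous
// block with a fixed, predictable layout.
type Particle struct {
	X, Y float32
	Live bool
}

// Pool is a hypothetical secondary allocator: a preallocated slab plus a
// stack of free slot indices.
type Pool struct {
	slab []Particle
	free []int
}

func NewPool(n int) *Pool {
	p := &Pool{slab: make([]Particle, n), free: make([]int, 0, n)}
	for i := n - 1; i >= 0; i-- {
		p.free = append(p.free, i) // slot 0 ends up on top of the stack
	}
	return p
}

// Alloc hands out a slot index, or false when the slab is exhausted.
func (p *Pool) Alloc() (int, bool) {
	if len(p.free) == 0 {
		return 0, false
	}
	i := p.free[len(p.free)-1]
	p.free = p.free[:len(p.free)-1]
	p.slab[i] = Particle{Live: true}
	return i, true
}

// At returns a pointer into the slab; callers must not use it after Free(i).
func (p *Pool) At(i int) *Particle { return &p.slab[i] }

// Free marks the slot unused and makes it available to the next Alloc.
func (p *Pool) Free(i int) {
	p.slab[i].Live = false
	p.free = append(p.free, i)
}

func main() {
	pool := NewPool(2)
	a, _ := pool.Alloc()
	pool.At(a).X = 1.5
	pool.Free(a)
	b, _ := pool.Alloc()
	fmt.Println("slot reused:", a == b)
}
```

Nothing stops a caller from holding a stale pointer after `Free`, so you get back the control and also the use-after-free class of bugs, just without the memory unsafety.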
But wouldn't this classify Java as a systems language? Java is used to build DBs and I believe AWS's infrastructure is mostly java. Plus, Java definitely has pointers.
Jokes aside, this is kind of a fundamental problem with the term, and many terms around classifying programs. Also worth noting - "program" is a term that is a lot looser than people who typically live above the kernel tend to think.
is that factual, in the general case?
it seems there exists a category of Go programs for which escape analysis entirely obviates heap allocations, in which case if there is any garbage collection it originates in the statically linked runtime.
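A small way to see escape analysis at work, assuming the standard gc toolchain: `testing.AllocsPerRun` can be called from an ordinary program to count heap allocations per call. The function names below are made up for the example, and building with `go build -gcflags=-m` prints the compiler's own escape decisions if you want to cross-check:

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sink is a package-level variable; storing a pointer in it forces an escape.
var sink *point

// stackOnly keeps p local, so escape analysis can keep it off the heap.
func stackOnly(n int) int {
	p := point{n, n + 1}
	return p.x + p.y
}

// escapes publishes the pointer, so the value has to live on the heap.
func escapes(n int) {
	sink = &point{n, n + 1}
}

func main() {
	local := testing.AllocsPerRun(1000, func() { _ = stackOnly(3) })
	heap := testing.AllocsPerRun(1000, func() { escapes(3) })
	fmt.Printf("stackOnly: %.0f allocs/op, escapes: %.0f allocs/op\n", local, heap)
}
```

A program made up entirely of functions like `stackOnly` gives the collector nothing of its own to do, which is the category being described.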
This is where you're wrong.
What would you consider "high level productivity" then? Java? Ruby? Ruby on Rails?
That is a huge leap you are making there that I don’t think is exactly justified.
Depends on what you're measuring as performance.
Server request throughput? Being middleware API server to relay data between a frontend and a backend?
C and C++ don't really have package management to speak of; it's basically "figure it out yourself". I tried Rust a couple of times, but the Result/Option paradigm basically forces you into this deeply nested code style that I hate.
The issue is about positioning of Go as a language. It's confusing due to being (formerly) marketed as a "systems programming language" that is typically a domain of C/C++/Rust, but technically Go fits closer to capabilities of Java or TypeScript.
I mean do we need bespoke package management tooling for everything now?
Seems like an outdated systems admin meme that violates KISS, explodes dependency chains, risks security, etc. IT feels infected by sunk cost fallacy.
It’s electron state in machines. The less altogether the better.
I hear this complaint often, but I consider it a feature of C. You end up with far fewer third-party dependencies, and the libraries you do end up using have been battle-tested for decades. I much prefer that to having to install hundreds of packages just to check if a number is even, like in JS.
There are just not enough statically typed languages that don't use a GC.
All in all I think the semantics debate is irrelevant. No one is going to use go for an OS only because someone on internet calls it a systems language.
Druid is Java and very fast but not like for like as it's an event database not a timeseries database. Pinot is in the same vein.
Most of the very big and very fast databases you have used indirectly though web services like Netflix (Cassandra), etc are written in Java.
> Overall, this may have been a bit of a disappointment to those who expected to use Generics as a powerful option to optimize Go code, as it is done in other systems languages.
where the implementation would smartly inline code and have performance no worse than doing so manually. I quite appreciated the call to attention that there's a nonobvious embedded footgun.
(As a side note, this design choice is quite interesting, and I appreciate the author diving into their breakdown and thoughts on it!)
So no, generics do not de facto make code slower.
Templated symbol names are gigantic. This can impact program link and load times significantly in addition to the inflated binary size.
Duplication of identical code for every type, for example the methods of std::vector<int> and std::vector<unsigned int> should compile to the same instructions. There are linker flags that allow some deduplication but those have their own drawbacks, another trick is to actively use void pointers for code parts that do not need to know the type, allowing them to be reused behind a type safe template based API.
Monomorphization: * code bloat * slow compiles * debug builds may be slow (esp c++)
Dynamic dispatch & boxing (Usually both are needed): * not zero cost
Pick your poison
The difference in the code I’m working with is being able to handle 250 req/s in node versus 50,000 req/s in Go without me doing any performance optimizations.
From my understanding Go was written with developer ergonomics first and performance is a lower priority. Generics undoubtedly make it a lot easier to write and maintain complex code. That may come at a performance cost but for the work I do even if it cuts the req/s in half I can always throw more servers at the problem.
Now if I was writing a database or something where performance is paramount I can understand where this can be a concern, it just isn’t for me.
I’d be very curious what orgs like CockroachDB and even K8s think about generics at the scale they’re using them.
One of the major pain points we have with Go is the lack of language support for monomorphization. We rely on a hand-built monomorphizing code generator [0] to compile CockroachDB's vectorized SQL engine [1]. Vectorized SQL is about producing efficient, type and operator specific code for each SQL operator. As such, we rely on true monomorphization to produce a performant SQL query engine.
I have a hope that, eventually, Go Generics will be performant enough to support this use case. As the author points out, there is nothing in the spec that prevents this potential future! That future is not yet here, but that's okay.
There are probably some less performance-sensitive use cases within CockroachDB's code base that could benefit from generics, but we haven't spent time looking into it yet.
[0]: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql...
[1]: https://www.cockroachlabs.com/blog/how-we-built-a-vectorized...
Feels like this approach could be leveraged to get default parameters / optional parameters in Golang too! The Go AST / token lib seems ridiculously flexible.
Huge CockroachDB fan btw. Thanks for the revolution in databases!
PS: Showcase of the awesome Go AST / token lib if you're a Python fan: http://igo.herokuapp.com/
True developer ergonomics, as far as a programming language itself goes, stems from language features that make goals easy to accomplish with little code, in a readable way, using well-crafted concepts of the language. Having to go to great lengths because your language does not support a feature (like generics in Go for a long time) is not developer ergonomics.
There is the aspect of tooling for a language of course, but that does not necessarily have to do with programming language design. The same goes for the standard library.
I think in this context tho, developer ergonomics can mean different things to different people.
It's easy to see how "Orthogonal Features" can be interpreted as developer ergonomics, as it's explicitly limiting potential (not all) anti-patterns and produces fairly idiomatic code across the ecosystem. I'm able to go to almost any GitHub repo that contains Go code and easily determine what's going on, what the flow is, etc. Certainly ergonomic in that context.
[0]: https://go.dev/talks/2010/ExpressivenessOfGo-2010.pdf
[1]: https://www.informit.com/articles/article.aspx?p=1623555
Your node code should be in the 2k reqs/s range trivially, with many frameworks comfortable offering 5k+.
It is never going to be as fast as go, but it will handle most cases.
I wouldn't be surprised if we get some control over monomorphization down the line, but if Go started with the monomorphization approach, it would be impossible to back out of it because it would cause performance regressions. Starting with the shape stenciling approach means that introducing monomorphization later can give you a performance improvement.
I'm not trying to predict whether we'll get monomorphization at some future point in Go, but I'm just saying that at least the door is open.
It can in the naive implementation. Early C++ was famous for code bloat and (apparently) hasn't shaken that outdated impression.
In practice, monomorphization of templates hasn't been a serious issue in C++ for a long time. The compiler and linker technologies have advanced significantly.
AFAICT the linker de-duplicates identical pieces of machine code. You still can get multi-megabyte object files for every source file. I used to work on V8. Almost every .o is 3+MB. Times hundreds, plus bigger ones, it's more than a gigabyte of object files for a single build. That's absurd. Not V8's fault--stupid C++ compilation and linking model.
It's not an outdated impression. C++ generics can and do interact very poorly with inlining and other language features to cause extremely large binary sizes, especially if you do anything complex inside them. They also harm compilation performance since each copy of the generic code needs to be optimized.
Generics in C++ are reasonably efficient when there is relatively little code generated per generic, but when this is not true, they can be a problem.
Because the footprint of the executable has pretty literally never been a priority, Go has always had deficient DCE and generated huge executables.
In the grand space of all programming languages, Go is fast. In the space of compiled programming languages, it's on the slower end. If you're in a "counting CPU ops" situation it's not a good choice.
There is an intermediate space in which one is optimizing a particular tight loop, certainly, I've been there, and this can be nice to know. But if it's beyond "nice to know", you have a problem.
I don't know what you're doing with reflection but the odds are that it's wildly slower than anything in that article though, because of how it works. Reflection is basically like a dynamically-typed programming language runtime you can use as a library in Go, and does the same thing dynamically-typed languages (modulo JIT) do on their insides, which is essentially deal with everything through an extra layer of indirection. Not just a function call here or there... everything. Reading a field. Writing a field. Calling a function, etc. Everywhere you have runtime dynamic behavior, the need to check for a lot of things to be true, and everything operating through extra layers of pointers and table structs. Where the article is complaining about an extra CPU instruction here and an extra pointer indirection there, you've signed up for extra function calls and pointer indirections by the dozens. If you can convert reflection to generics it will almost certainly be a big win.
(But if you cared about performance you were probably also better off with an interface that didn't fully express what you meant and some extra type switches.)
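As a rough illustration of converting reflection to generics, here are two versions of the same helper. Both functions are made up for the example; no timings are claimed here, the point is only how much more the reflective version has to do per element:

```go
package main

import (
	"fmt"
	"reflect"
)

// containsReflect accepts any slice, but every length check, index, and
// comparison goes through reflect.Value and boxed interface values.
// It panics if slice is not actually a slice, array, or string.
func containsReflect(slice, target any) bool {
	v := reflect.ValueOf(slice)
	for i := 0; i < v.Len(); i++ {
		if v.Index(i).Interface() == target {
			return true
		}
	}
	return false
}

// containsGeneric is type-checked at compile time and compares elements
// directly, with no per-element boxing.
func containsGeneric[T comparable](slice []T, target T) bool {
	for _, x := range slice {
		if x == target {
			return true
		}
	}
	return false
}

func main() {
	xs := []int{1, 2, 3}
	fmt.Println(containsReflect(xs, 2), containsGeneric(xs, 2))
}
```

Note the reflective version also loses compile-time checking: passing a non-slice, or a target of the wrong type, is only discovered at run time.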
Go is positioned to be most useful as an alternative to Java, and to C++ where performance isn't the key factor (i.e. projects where C++ would be chosen because "Enh, it's a big desktop application and C++ is familiar to a lot of developers," not because the project actually calls for being able to break out into assembly language easily or where fine-tuning performance is more important than tool-provided platform portability).
If you're using real interfaces, you should keep using interfaces.
If you care about performance, you should not try to write Java-Streams / FP-like code in a language with no JIT and a non-generational non-compacting GC.
What this article says is that a function that is generic on an interface introduces a tiny bit of reflection (as little as is necessary to figure out if a type conforms to an interface and get an itab out of it), and that tiny bit of reflection is quite expensive. This means two things.
One, if you're not in a position where you're worried about what does or does not get devirtualized and inlined, this isn't a problem for you. If you're using reflection at all, this definitely doesn't apply to you.
Two, reflection is crazy expensive, and the whole point of the article is that the introduction of that tiny bit of reflection can make function calls literally twice as slow. If you are in a position where you care about the performance of function calls, you're never really going to improve upon the situation by piling on even more reflection.
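To pin down what "a function that is generic on an interface" means here, a minimal sketch with made-up names. Both functions compute the same thing; the difference the article describes is in how the 1.18 compiler reaches `Area` in each case, which you would need to confirm from the generated assembly rather than from this listing:

```go
package main

import "fmt"

type Shape interface{ Area() float64 }

type Square struct{ Side float64 }

func (s *Square) Area() float64 { return s.Side * s.Side }

// totalIface takes boxed interface values; each Area call is an ordinary
// dynamic dispatch through the value's itab.
func totalIface(shapes []Shape) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

// totalGeneric is generic over any type satisfying Shape. Per the article,
// Go 1.18 does not compile a separate copy per pointer type, so the method
// call is resolved through a runtime dictionary rather than statically.
func totalGeneric[S Shape](shapes []S) float64 {
	var sum float64
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

func main() {
	a, b := &Square{2}, &Square{3}
	fmt.Println(totalIface([]Shape{a, b}), totalGeneric([]*Square{a, b}))
}
```

If neither version shows up in a profile, the choice between them should be made on readability, which is the first of the two points above.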
Just implement naively, then if you have performance issues identify the bottleneck.
Knowing where performance issues with certain techniques might arise is not premature optimization. Implement with an appropriate level of care, including performance concerns. Not every kind of poor performance appears as a clear spike in a call graph, and even fewer can be fixed without changing any external API.
? In most languages, it is compile-time overhead, not runtime.
> Monomorphization is a total win for systems programming languages: it is, essentially, the only form of polymorphism that has zero runtime overhead, and often it has negative performance overhead. It makes generic code faster.
The point is that the way Go implements generics is in such a way that it can make your code slower, even though there is a well-known way that will not make your code slower (at the cost of compile times).
I get to write one set of generic methods and data structures that operate over arbitrary "Component" structs, and I can allocate all my components of a particular type contiguously on the heap, then iterate over them with arbitrary, type-safe functions.
I can't fathom that doing this via a Component interface would be even as close as fast, because it would destroy cache performance by introducing a bunch of Interface tuples and pointer dereferencing for every single instance. Not to mention the type-unsafe code being yucky. Am I wrong?
FWIW I was able to update 2,000,000 components per (1/60s) frame per thread in a simple Game of Life prototype, which I am quite happy with. But I never bothered to evaluate if Interfaces would be as fast
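A stripped-down sketch of the kind of container described above, for readers who want something they can run. `Container` and `Cell` are hypothetical names and this omits paging and activity flags; it only shows components stored back to back in one slice and visited in place through a typed callback:

```go
package main

import "fmt"

// Container stores components of a single type back to back in one slice.
type Container[T any] struct{ items []T }

// Add appends a component by value; no per-component heap object is created.
func (c *Container[T]) Add(v T) { c.items = append(c.items, v) }

// ForEach passes a pointer to each element in place, so f can mutate it
// without copying the component or boxing it in an interface.
func (c *Container[T]) ForEach(f func(*T)) {
	for i := range c.items {
		f(&c.items[i])
	}
}

// Cell is a hypothetical component type for a Game of Life style update.
type Cell struct {
	Alive     bool
	Neighbors uint8
}

func main() {
	var cells Container[Cell]
	for i := 0; i < 4; i++ {
		cells.Add(Cell{Neighbors: uint8(i)})
	}
	cells.ForEach(func(c *Cell) { c.Alive = c.Neighbors == 3 })

	alive := 0
	cells.ForEach(func(c *Cell) {
		if c.Alive {
			alive++
		}
	})
	fmt.Println("alive:", alive)
}
```

An interface-based equivalent would hold a slice of (type, pointer) pairs with each component allocated separately, which is the cache behaviour being worried about.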
On this basis, I don't believe your generic implementation is as much faster than an interface implementation as you claim.
func (cc *ComponentContainer[T]) ForEach(f func(*Component[T])) {
	for _, page := range cc.pool.pages {
		for i := range page {
			if page[i].IsActive() {
				f(&page[i])
			}
		}
	}
}
Still, the interface approach is a total nightmare from a readability + runtime error perspective so I won't be going back & will just hope for some performance freebies in 1.19 or later :^)
It's worked with 60fps performance on a naive N^2 collision algo over about 4200 entities -- but also I tend to use a broadphase for collisions in actual games (there's an "externs" system to call to other C/C++ and I use Chipmunk's broadphase).
Look at the assembly difference between these two examples:
1. https://godbolt.org/z/7r84jd7Ya (without monomorphization)
2. https://godbolt.org/z/5Ecr133dz (with monomorphization)
If you don't want to use godbolt, run the command `go tool compile '-d=unified=1' -p . -S main.go`
I guess that the flag is not documented because the Go team has not committed themselves to whichever implementation.
Learned that watching one of Matt's cppcon talks (A+, would do again), as you can expect this is useful to compare different versions of a compiler, or different compilers entirely, or different optimisation settings.
But wait, there's more! Using the top left Add dropdown, you can get a diff view between compilation outputs: https://godbolt.org/z/s3WxhEsKE (I maximised it because a diff view getting only a third of the layout is a bit narrow).
As another datapoint I can add that I tried to replace the interface{}-based btree that I use as the main workhorse for grouping in OctoSQL[0] with a generic one, and got around 5% of a speedup out of it in terms of records per second.
That said, compiling with Go 1.18 vs Go 1.17 got me a 10-15% speedup by itself.
Where did you see this speedup? Other than `GOAMD64` there wasn't much in the release notes about compiler or stdlib performance improvements so I didn't rush to get 1.18-compiled binaries deployed, but maybe I should...
(I do expect some nice speedups from using Cut and AvailableBuffer in a few places, but not without some rewrites.)
Also, as the article mentions, Go 1.18 can now inline functions that contain a "range" for loop, which previously was not allowed, and this would contribute performance improvements for some programs by itself. The new register-based calling convention was extended to ARM64, so if you're running Go on something like Graviton2 or an Apple Silicon laptop, you could expect to see a measurable improvement from that too. (edit: the person you replied to confirmed they're using Apple Silicon, so definitely a major factor.)
The Go team is always working on performance improvements, so I'm sure there are others that made it into the release without being mentioned in the release notes.
Is there an open source CSS library or something that does this?
> Monomorphization is a total win for systems programming languages: it is, essentially, the only form of polymorphism that has zero runtime overhead, and often it has negative performance overhead. It makes generic code faster.
I'd expect without monomorphization the code should perform the same as interface{} code, perhaps minus type cast error handling overhead. That's the model where generics are passing interface{} underneath, & exist only as a type check (à la Java type erasure)
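The type-erasure model described above can be written out by hand in Go, which makes the expectation easier to reason about. This is a sketch with invented names, not how the Go compiler implements generics: a single untyped core does the work, and the generic wrapper exists only for compile-time checking.

```go
package main

import "fmt"

// anyStack is the untyped core: one implementation shared by all types.
type anyStack struct{ items []any }

func (s *anyStack) push(v any) { s.items = append(s.items, v) }

func (s *anyStack) pop() (any, bool) {
	if len(s.items) == 0 {
		return nil, false
	}
	v := s.items[len(s.items)-1]
	s.items = s.items[:len(s.items)-1]
	return v, true
}

// Stack is a thin generic facade over anyStack. T adds no storage of its
// own; it only constrains what callers may push and what they get back.
type Stack[T any] struct{ core anyStack }

func (s *Stack[T]) Push(v T) { s.core.push(v) }

func (s *Stack[T]) Pop() (T, bool) {
	v, ok := s.core.pop()
	if !ok {
		var zero T
		return zero, false
	}
	// Only Push can insert, so every stored value is a T. The comma-ok
	// form also handles a nil interface value, yielding T's zero value.
	t, _ := v.(T)
	return t, true
}

func main() {
	var s Stack[string]
	s.Push("a")
	s.Push("b")
	top, _ := s.Pop()
	fmt.Println(top)
}
```

Under this model the generic version costs what the `interface{}` version costs, boxing included; the only saving is that callers no longer write the assertion themselves.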
I can't imagine monomorphizing being that big of a deal during compilation if the generation is deferred and results are cached.
edit: for example, one could envision that the compiler generates the top n specializations per generic function based on usage and then uses the current non-specialized version for the rest.
Good. I saw a lot of people suggesting in late 2021 that you could use generics as some kind of `#pragma force-devirtualization`, and that would be awful if it became common.
Second, it doesn't really promise to devirtualize it. If what I have is a variable of type `io.ReadCloser` - not just implementing it, but already boxed - it's not going to be able to unbox it for me.
Third, if that was all or even primarily what we wanted out of generics it would've been much better to spend the past two years hacking on the inliner and other parts of the compiler to improve devirtualization.
I don't think it would be awful to fix the unnecessary pointer indirection (which looks like it's already happened), but I don't want 10x or even 2x longer compile times just because someone is trying to get the compiler to avoid boxing. It's a tradeoff, vs. e.g. spending that time simplifying methods to get the inliner to approve of them, which is win/win.
Although the article paints the Go solution for generics somewhat negatively, it actually made me more positive about the Go solution.
I don't want generic code to be pushed everywhere in Go. I like Go to stay simple and it seems the choices the Go authors have made will discourage overuse of Generics. With interfaces you already avoid code duplication so why push generics? It is just a complication.
Now you can keep generics to the areas where Go didn't use to work so great.
Personally I quite like that Go is trying to find a niche somewhere between languages such as Python and C/C++. You get better performance than Python, but they are not seeking zero-overhead at any cost like C++ which dramatically increases complexity.
Given the huge amount of projects implemented with Java, C#, Python, Node etc there must be more than enough cases where Go has perfectly good performance. In the more extreme cases I suspect C++ and Rust are the better options.
Or if you do number crunching and more scientific stuff then Julia will actually outperform Go, despite being dynamically typed. Julia is a bit opposite of Go. Julia has generics (parameterized types) for performance rather than type safety.
In Julia you can create functions taking interface types and still get inlining and max performance. Just throwing it out there, as many people seem to think that to achieve max performance you always need a complex statically typed language like C++/D/Rust. No you don't. There are also very high speed dynamic languages (well only Julia I guess at the moment. Possibly LuaJIT and Terra).
> Ah well. Overall, this may have been a bit of a disappointment to those who expected to use Generics as a powerful option to optimize Go code, as it is done in other systems languages. We have learned (I hope!) a lot of interesting details about the way the Go compiler deals with Generics. Unfortunately, we have also learned that the implementation shipped in 1.18, more often than not, makes Generic code slower than whatever it was replacing. But as we’ve seen in several examples, it needn’t be this way. Regardless of whether we consider Go as a “systems-oriented” language, it feels like runtime dictionaries was not the right technical implementation choice for a compiled language at all. Despite the low complexity of the Go compiler, it’s clear and measurable that its generated code has been steadily getting better on every release since 1.0, with very few regressions, up until now.
And remember:
> DO NOT despair and/or weep profusely, as there is no technical limitation in the language design for Go Generics that prevents an (eventual) implementation that uses monomorphization more aggressively to inline or de-virtualize method calls.
> with very few regressions, up until now.
the idea that this is a regression is silly. you can't have a regression unless old code is slower as a result, which is clearly not the case. it's just a less than ideal outcome for generics, which will likely get resolved.
Blowing your icache can result in slowdowns. In many cases it's worth having smaller code even if it's a bit slower when microbenchmarked cache-hot, to avoid evicting other frequently used code from the cache in the real system.
Much as with JITs (though probably with higher thresholds), issues occur for megamorphic callsites (when a generic function has a ton of instances), but that should be possible to dump for visibility, and there are common and pretty easy solutions for at least some cases e.g. trampolining through a small generic function (which will almost certainly be inlined) to one that's already monomorphic is pretty common when the generic bits are mostly a few conversions at the head of the function (this sort of trampolining is common in Rust, where "conversion" generics are often used for convenience purposes so e.g. a function will take an `T: AsRef<str>` so the caller doesn't have to extract an `&str` themselves).
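For those who haven't seen the trampolining pattern, here is its shape transliterated to Go (names invented). Whether the Go compiler inlines the shim the way Rust typically does is an assumption I haven't verified; the sketch only shows the structure, with the generic part reduced to a conversion at the head and the real work in a function that is compiled once:

```go
package main

import "fmt"

// countVowels is the monomorphic worker; there is only ever one copy of it.
func countVowels(s string) int {
	n := 0
	for _, r := range s {
		switch r {
		case 'a', 'e', 'i', 'o', 'u':
			n++
		}
	}
	return n
}

// CountVowels is the thin generic shim: it converts its argument and
// forwards, so very little code depends on the type parameter.
func CountVowels[S ~string | ~[]byte](s S) int {
	return countVowels(string(s))
}

func main() {
	fmt.Println(CountVowels("generic"), CountVowels([]byte("shim")))
}
```

The caller gets the convenience of passing either type, and however many instantiations exist, only the few lines of the shim are duplicated.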
I think it’s great that golang designers decided to follow Swift’s approach instead of specializing everything. The performance issues can be fixed in time with more tools (like monomorphization directives) and profile-guided optimization.
They care about making excuses about not using Go.
If anyone can contribute, please do.
The language will suffer now with additional developmental and other overhead.
The world will continue turning.
Couldn't JIT do this?