We're hoping there are also a bunch of other interesting side effects of enabling Valgrind for Go, in particular seeing how we can use it to track how the runtime handles memory (hopefully correctly!).
edit: also strong disclaimer that this support is still somewhat experimental. I am not 100% confident we are properly instrumenting everything, and it's likely there are still some errant warnings that don't fully make sense.
But I wonder why it's not trivial to throw a bunch of different inputs at your cipher functions and measure that the execution times are all within an epsilon tolerance?
I mean, you want to show constant time of your crypto functions, why not just directly measure the time under lots of inputs? (and maybe background Garbage Collection and OS noise) and see how constant they are directly?
Also, some CPUs have a counter for conditional branches (which the rr debugger leverages); you could sample it before and after and make sure the number of conditional branches does not change between decrypts -- as that AGL post mentions, identical branching is important for constant time.
Finally, it would also seem trivial to track the first 10 decrypts, take their maximum time plus a small tolerance of a few extra nanoseconds, and pad every following decrypt by a few nanoseconds (executing noops) to force constant time when it varies.
And you could add an assert that anything over that established upper bound crashes the program, since it is violating the constant-time property. I suppose the real difficulty is if the OS deschedules your execution and throws off your timing check...
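Roughly what I have in mind, as a Go sketch (decrypt is a stand-in for whatever primitive is under test; the warmup count and slack are arbitrary, and a sleep stands in for the noop padding):

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// decrypt is a placeholder for the constant-time primitive under test.
func decrypt(input []byte) {
	_ = input
}

func main() {
	const warmup = 10
	var bound time.Duration

	for i := 0; i < 1000; i++ {
		input := make([]byte, 64)
		if _, err := rand.Read(input); err != nil {
			panic(err)
		}

		start := time.Now()
		decrypt(input)
		elapsed := time.Since(start)

		switch {
		case i < warmup:
			// Establish an upper bound from the first few calls, plus slack.
			if d := elapsed + 50*time.Nanosecond; d > bound {
				bound = d
			}
		case elapsed > bound:
			// Exceeding the bound suggests a constant-time violation -- or
			// just the OS descheduling us, which is the hard part.
			panic(fmt.Sprintf("decrypt took %v, bound %v", elapsed, bound))
		default:
			// Pad the remainder so every call takes (at least) the bound.
			time.Sleep(bound - elapsed)
		}
	}
}
```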
My guess is because the GC introduces pauses and therefore nondeterminism in measuring the time anything takes.
Love that they have taken this route; this is the way bootstrapped toolchains should be: minimal building blocks and everything else in the language itself.
Assembly isn't that hard; those of us who grew up around 8-bit home computers were writing Z80 and 6502 assembly at 10 - 12 years old, while having fun cracking games and laying the roots of the demoscene.
The older I get the more I value commit messages. It's too easy to just leave a message like "adding valgrind support", which isn't very useful to future readers doing archaeology.
Otherwise the relevant warnings get swamped by a huge number of irrelevant warnings.
This is why running Valgrind on Python code does not work.
So you are confirming the problem, but treating ignoring it as if that were the solution for everyone?
Valgrind(-memcheck) is an extremely important tool in memory-unsafe languages.
Having said that, it saved my ass a lot of times, and I’m very grateful that it exists.
From a quick glance, it seems that Go is now registering the stacks and emitting stack change commands on every goroutine context switch. This is most likely enough to make Valgrind happy with Go's scheduler.
Also, strictly speaking all Go programs are multithreaded. The inability to spawn a single-threaded Go program is actually a huge issue in some system tools like container runtimes and requires truly awful hacks to work around. (Before you ask, GOMAXPROCS=1 doesn't work.)
I'd be interested to know why Valgrind vs the Clang AddressSanitizer and MemorySanitizer. These normally find more types of errors (like use-after-return) and I find them significantly faster than Valgrind.
I'm not sure if this will work though, will it @bracewel?
(Valgrind using a CPU emulator allows for a lot of interesting things, such as also emulating cache behavior and whatnot; it may be slow and have other drawbacks -- it has to be updated every time the instruction set adds a new instruction for instance -- but it's able to do things that aren't usually possible otherwise precisely because it has a CPU emulator!)
Disclaimer: I work on continuous profiling for Datadog and contribute to the profiling features in the runtime.
Usually I go with pprof, like basic stuff, and it helps. I would NOT say memory leaks are the biggest or most common issue I see; however, as time goes on and services become more complicated, what I often see in the metrics is RAM getting eaten and never freed, so the app eats more and more memory over time and only a restart helps.
It's hard to call it a memory leak in the "original meaning of memory leak", but the memory does not get cleaned up because of the choices I made, and I want to understand how to do better.
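In case it helps anyone, the basic pprof setup I mean is just this (a minimal sketch; localhost:6060 is the conventional address, not a requirement):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Expose the profiling endpoints; in a real service this runs
	// alongside the rest of the application.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

Then "go tool pprof http://localhost:6060/debug/pprof/heap", comparing inuse_space snapshots taken at two points in time, usually shows where the growth is.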
Thanks for the tool!
In Java, heap fragmentation is usually considered a separate issue, but I understand Go has a non-moving garbage collector, so you can lose memory to pathological allocation patterns that overly fragment the heap and require constantly allocating new pages. I could be wrong about this since I don't know a lot about Go, but heap fragmentation can cause trouble for long-running programs with certain types of memory allocation.
Besides that, applications can leak memory by stuffing things into a collection (a map or list) and then not cleaning it up when entries become "stale". The references are live from the perspective of the garbage collector but dead from the application's perspective. Weak references exist to solve this problem when you expose an API that stores something but can't know when it goes out of scope. I wouldn't consider this to be common, but if you are building a framework or any kind of platform code you might need to reach for it at some point. Some crazy folks also intern every string they encounter "for performance reasons", and that can obviously lead to what less crazy folk would consider a memory leak. Other folk stick a cache around every client and might not tune the cache parameters, leading to unnecessary memory pressure...
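A contrived sketch of that stale-entry pattern (the sessions map and the sizes are made up for illustration):

```go
package main

import "time"

type session struct {
	data    []byte
	lastUse time.Time
}

// sessions grows forever if nothing ever deletes stale entries: every value
// stays reachable from the map, so the GC can never collect it.
var sessions = map[string]*session{}

func touch(id string) {
	s, ok := sessions[id]
	if !ok {
		s = &session{data: make([]byte, 1<<20)}
		sessions[id] = s
	}
	s.lastUse = time.Now()
}

// evict is the part that's easy to forget: drop entries the application
// considers dead, even though they're still "live" to the GC.
func evict(maxAge time.Duration) {
	for id, s := range sessions {
		if time.Since(s.lastUse) > maxAge {
			delete(sessions, id)
		}
	}
}

func main() {
	touch("a")
	evict(time.Hour)
}
```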
Most other widely used GCed languages don’t allow the use of arbitrary interior pointers (though most GCs can actually handle them at the register level).
It can happen when your variables have too long a lifespan, or when you have a cache where the entries are not properly evicted.
A common one I see is opening a big file, creating a "new slice" over a subset of it, and then using the "new slice" while expecting the old large object to be dropped.
Except the "new slice" is just a reference into the larger slice, so it's never marked unused.
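In code, that pattern looks something like this (the file name is hypothetical; bytes.Clone is one common fix):

```go
package main

import (
	"bytes"
	"os"
)

func header(path string) ([]byte, error) {
	data, err := os.ReadFile(path) // may be hundreds of MB
	if err != nil {
		return nil, err
	}
	// (assumes the file is at least 64 bytes)

	// Leaky: the returned slice shares data's backing array, so the whole
	// file stays reachable for as long as the 64-byte "header" is held:
	//   return data[:64], nil

	// Fix: copy out the bytes you need so the big buffer can be collected.
	return bytes.Clone(data[:64]), nil
}

func main() {
	_, _ = header("big.bin")
}
```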
- you can create deadlocks
- spawn goroutines without making sure they have proper exit criteria
- use slices of large objects in memory and pass them around (e.g. read files in a loop and pass along only a slice of the whole buffer)
- and so on

Do you think they will enable Valgrind if there are no leaks?
As it is, the only way to currently handle that is with "-gcflags -m=3" or using something like the VSCode Go plugin, via the "ui.codelenses" and "ui.diagnostic.annotations" configurations.
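For instance, a toy case where that escape-analysis output matters (the exact message text varies by Go version, but it's something like "moved to heap: x"):

```go
package main

// escapes forces x onto the heap: its address outlives the call frame.
// `go build -gcflags=-m` flags this, and -m=3 adds the full reasoning chain.
func escapes() *int {
	x := 42
	return &x
}

func main() {
	_ = escapes()
}
```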
In Go, never launch a goroutine unless you know exactly how it will be cleaned up.
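The usual discipline is to tie every goroutine to something that can stop it, e.g. a context; a minimal sketch:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker exits when its context is cancelled; without the ctx.Done() case
// it would block on the channel forever and leak.
func worker(ctx context.Context, jobs <-chan int) {
	for {
		select {
		case <-ctx.Done():
			fmt.Println("worker: shutting down")
			return
		case j := <-jobs:
			fmt.Println("worker: got", j)
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int)

	go worker(ctx, jobs)
	jobs <- 1

	cancel() // guarantees the goroutine has a way to exit
	time.Sleep(10 * time.Millisecond)
}
```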
Don't get me wrong, I love Valgrind, and I used it extensively in my past life as a C developer. Though the fact that Go needs Valgrind feels like a failure of the language or the ecosystem. I've been doing Rust for ~6 years now, and haven't had to reach for Valgrind even once (I think a team member may have used it once).
I realize that's probably because of cgo, and maybe it's in fact a step forward, but I can't help but feel like it is a step backwards.
At least that's why I wrote that original comment.
Which is sad because I like the language and find it useful. But a part of the community does a disservice with comments like your parent comment. It's often on the cusp of calling people who code in Go "stupid". But I digress.
I guess there's also callgrind that may be useful for Gophers.
When I used it before, I was working on a Rust module that was loaded in by nginx. This was before there were official or even community bindings for nginx... so there was a lot of opportunity for mistakes.
I also seldom need something like this in Java, .NET or node, until a dependency makes it otherwise.
I guess maybe the failure is not the addition of it (it's useful for people writing the bindings), but rather how happy everyone in the thread is about it (which suggests it's more useful than it should be, due to a failure in the ecosystem).
This isn't so much about leaks. The most important thing this will enable is correct analysis of uninitialised memory: without annotations, memory that gets recycled will not be correctly poisoned. I imagine that it will also be useful for the other tools (except cachegrind and callgrind).
Memcheck (the main tool) has shortcomings (very slow, does not detect all kinds of errors). Its strongest point is that it does not need an instrumented build. That can be particularly important if you have issues in 3rd party libraries that you can't build. Its other strong point is that it checks for both addressability and initialisedness at the same time.
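To make the cgo case mentioned elsewhere in the thread concrete, here's the sort of thing memcheck catches with no instrumented build at all (a minimal sketch):

```go
package main

/*
#include <stdlib.h>
*/
import "C"

func main() {
	// Allocate on the C heap; this memory is invisible to Go's GC and
	// will never be collected automatically.
	buf := C.malloc(64)
	_ = buf

	// Without a matching C.free(buf), running the binary under
	// `valgrind --leak-check=full` reports the 64 bytes as definitely lost.
}
```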
My favourite feature is using GDB with Valgrind+vgdb. That allows you to see what memory is addressable and/or initialised from within GDB.
Apple have been making big changes that keep breaking things, and Valgrind has not kept up. Louis Brunner has done an amazing job, more or less single-handedly managing to keep the basic flow working.
But maybe others will find a way to use it. Who knows?