In Go, you get a stable version of the old data, and the garbage collector tracks that you still have a reference to it. This is safe, but confuses some people.
In Rust, the borrow checker won't let you modify the array while you have a reference to a slice of it. So you can't do this at all.
In C++, you get a mention on the Department of Homeland Security's US-CERT site.
Rust has created a weird perception that memory safety equals safety. Language is a tool and it should work with me: it is extremely important that my understanding of what the program should do aligns with what it actually does.
The way you describe Go's behavior is "takes a snapshot of the underlying data", which usually means "deep copy the container". Taking a pointer/reference usually means quite the opposite. So it is "safe" in the sense that the pointer points to valid data, but "incorrect" in the sense that it does the wrong thing without warning.
Sure, one could argue that value-returning modification functions are a giveaway of invalidated data. But this is not C: Go has garbage collection, and instead of "forcing" the underlying array to keep the same address, it just keeps the original pointer pointing to dereferenceable but wrong data.
A slice itself is just a window into a backing array of fixed size. The slice carries three data members: a pointer into the backing array, the length of the slice data, and the remaining capacity of the backing array.
Typically slices are passed around by value but you can take their address and modify a "shared" slice.
The built-in append() returns a new slice by value.
What happens is simply that when appending data to a slice and there is no room in the backing array, a new backing array is allocated that the returned slice points into. The old "input" slice to append is still intact and if some code has access to it, it will look at data stored in the old backing array.
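A minimal sketch of the two cases, in case it helps. The helper name sharesFirstElement is made up for this example; the only thing it relies on is that append reuses the backing array when there is spare capacity and allocates a new one when there is not.

```go
package main

import "fmt"

// sharesFirstElement reports whether two non-empty slices start at the
// same address, i.e. whether they still view the same backing array.
// (Hypothetical helper, not part of any library.)
func sharesFirstElement(a, b []int) bool {
	return &a[0] == &b[0]
}

func main() {
	full := make([]int, 3, 3)  // len == cap: no room left
	roomy := make([]int, 3, 8) // spare capacity available

	fullGrown := append(full, 4)   // must allocate a new backing array
	roomyGrown := append(roomy, 4) // fits in the existing backing array

	fmt.Println(sharesFirstElement(full, fullGrown))   // false
	fmt.Println(sharesFirstElement(roomy, roomyGrown)) // true

	// Writes through the grown slices show the same thing.
	fullGrown[0] = 99
	roomyGrown[0] = 99
	fmt.Println(full[0], roomy[0]) // 0 99
}
```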
I've constructed similar utility types in C and find them quite convenient. It's very convenient to have the distinction between the backing memory (array) and a slice viewing a portion of it instead of just a dynamic array.
I think it's a bit more than that. They're also riding on the static typing trend that's happening right now, so type-safety is also part of the equation. From the website:
> A language empowering everyone to build reliable and efficient software.
> Reliability: Rust’s rich type system and ownership model guarantee memory-safety and thread-safety — enabling you to eliminate many classes of bugs at compile-time.
That means that people like me that still have a hard time with C and C++ can build efficient software using the same workflow as I'm used to in my usual "web languages" (Python and JS mostly).
Of course, since append neither guarantees nor prevents a copy, the semantics of modifying a value through a pointer to a slice element after an append are unspecified, so it is not a useful construct.
No, there is no mention of a "snapshot". You get a reference to the current backing array, which may or may not continue being used by the slice (depending on reallocations). You're pointing to the live slice backing array, and the values in it may change if someone else is manipulating the slice, up to the point where the slice backing array must be reallocated, at which point you'll continue pointing to the old backing array and be keeping it from getting GC'd.
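Roughly what that looks like in code. This is only a sketch; the function names are invented for illustration and the capacities are chosen so that the reallocation point is predictable.

```go
package main

import "fmt"

// liveBeforeGrow: the pointer and the slice share one backing array as
// long as append has spare capacity to work with.
func liveBeforeGrow() int {
	s := make([]int, 1, 4)
	p := &s[0]
	s = append(s, 2) // fits in cap 4, same backing array
	s[0] = 7
	return *p // sees the write made through s
}

// staleAfterGrow: once append has to reallocate, the pointer keeps
// referring to (and keeping alive) the old backing array.
func staleAfterGrow() (viaPointer, viaSlice int) {
	s := make([]int, 1, 1)
	s[0] = 1
	p := &s[0]
	s = append(s, 2) // cap exceeded: s now views a new backing array
	*p = 42          // lands in the old array only
	return *p, s[0]
}

func main() {
	fmt.Println(liveBeforeGrow()) // 7
	fmt.Println(staleAfterGrow()) // 42 1
}
```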
It's not without warning, it's a well documented behavior.
Just think of slices as "immutable", "pass-by-value" data structures (with a relatively efficient implementation) and everything falls into place.
Mutating them in any way is actually a special case that you do only for performance reasons (e.g. you can pre-allocate and fill if you know the size ahead of time) but, as always, you try to keep those cases abstracted away and to a minimum.
There’s no need for Go to copy anything in the circumstance the OP described. It just doesn’t shrink the underlying array.
slice = append(slice, item)
By merely typing this all the time when you add elements, you intuitively understand that appending can potentially re-allocate the slice data in a totally different location, so any pointers you have taken before the resize are not guaranteed to be pointing to items in the new slice; just the old slice.
struct intSlice {
int* addr;
int len;
int cap;
};
The memory at addr is not owned by the slice. All the slice operations are simply notation for manipulating the struct. Go's garbage collection makes the whole thing work well.
This can be confusing if you're used to C++'s std::vector (which owns the memory) or Python's slices. Go's slices are a shallow pointer/length system exactly like the one used in C all the time. For example:
void sort(int* addr, int len);
becomes func sort(a []int)
A Go slice is just a formalization of C's pointer/length idiom, with terse notation for manipulation.
No slices, no problems
If you need to work with a part of a string, you can make two ordinary integer variables for offset and length
When you take a pointer to a slice, you get a pointer to the current version of this tuple of information for the slice. This pointer may or may not refer to a slice that anyone else is using; for instance:
ps := &s
s = append(s, 50)
At this point, '*ps' may or may not be the same thing as 's', and so it might or might not have the new '50' element at the end.
No, *ps will always be the same as s, because ps is a pointer so it carries no information other than the address of s.
The author seems to have failed to distinguish the operation of copying a fat pointer (which opens the possibility of divergence) from the operation of taking the address of a fat pointer (which involves no copying, so divergence is not possible: where would the divergent version be stored?).
See this code snippet: https://play.golang.org/p/tdb-O8a6hDN
s is a local variable (or global, doesn't matter). ps simply points to that local variable. You can modify the local variable all day long and ps will still point to it, not some old version of it.
To be clear the author is incorrect.
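For anyone who wants to check this locally rather than on the playground, here is a small sketch of the distinction (not the code behind the link above; the function name is made up). It contrasts taking the address of the slice variable with copying the slice header.

```go
package main

import "fmt"

// headerCopyVsPointer appends to s after taking both a pointer to the
// slice variable and a copy of the slice header, then returns the length
// observed through each.
func headerCopyVsPointer() (viaPointer, viaCopy int) {
	s := []int{10, 20, 30, 40}
	ps := &s     // address of the variable s; nothing is copied
	copied := s  // copies the pointer/len/cap header
	s = append(s, 50)
	return len(*ps), len(copied)
}

func main() {
	fmt.Println(headerCopyVsPointer()) // 5 4
}
```

*ps tracks s because it is s; only the copied header can diverge.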
In C++, what happens is that the "iterators are invalidated" when you add something to an array. This is CONSTANTLY a source of bugs and frustration for new programmers. In C++, it may yield results or crash your program, and you are never sure quite which will happen. The best you can do (as a senior engineer) is design your software to avoid ever creating this situation in the first place and throw address sanitizer at things to try and catch them when they arise. The difference with Go is that in Go this will never result in a memory error.
Strictly speaking, the situation in Go is way better. I will take "incorrect behavior, but not a memory error" over "memory error" any day of the week.
We may forget what it's like for new programmers, but for those of us who hang out on Discord channels, Stack Overflow, and Reddit giving people help with programming, simple things like iterator invalidation are a major pain point.
"You have a memory error in your program", I say to someone. "Now that you know that you have a memory error, it is probably your highest priority to find and fix this error." And now you start walking someone through the steps of finding and fixing a memory error, which is nontrivial. You'll tell them about Address Sanitizer, GDB, and Valgrind, and you'll wish them luck.
C and C++ are "It is very easy to make this mistake which leads to undefined behaviour and a maybe incorrect program."
Go is "It is very easy to make this mistake which leads to a maybe incorrect program."
Yes, better! But the problem is still there. I much prefer the Rust solution where there is no common mistake.
"I have this code, you see... and it takes a couple mutable references into an array... how do I translate this into Rust?"
There is no one-size-fits-all answer to that question. The code may be correct in C or C++, but the Rust type system may give you one hell of a hard time proving that it is correct to the Rust type system's satisfaction... so you refactor your code completely, or you use integer indexes into arrays rather than references, or you use unsafe code...
I've written some amount of Rust code at this point. About half of the time, when I write a project in Rust, there comes a point at which I'm fighting with the type system. I feel like this should stop happening, at some point.
Incidentally, this is equally true of creating additional pointers or slices pointing into a growable array. They aren’t safe after the next append(). If you grow an array then you need to refer to its elements using array indices.
But if you have a fixed-length array, or between appends, you can use both pointers and slices to point to parts of it, and it will work fine.
This all works the same as C if you think of a slice as a glorified pointer. If you’re thinking of a slice as a JavaScript array then you’ll have trouble.
The problem with Go is that it has you use slices as growable arrays as well as actual slices; there is no separate vector type. So the confusion is very much understandable and to be expected.
I think it works in the same way as a pointer to std::span in C++. (Or pointer to std::string_view with the exception that std::string_view doesn't allow modification of the elements.)
I guess the difference is that std::span doesn't let you append to the backing array through the std::span directly. So with C++ you have to write more code which makes it clearer what's happening.
Go is unusual in not having an equivalent of vector.
With both slice and std::span, = does a shallow copy of just 2 or 3 pointers.
With std::vector = does a deep copy of every element, you have 2 distinct backing arrays.
Go forces you to use copy() and append() and rely on the garbage collector to make a slice fill the role of std::vector. IMO it leads to some confusing code.
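A quick sketch of the difference on the Go side, assuming you want std::vector-style value semantics. The function name aliasVsClone is invented for the example.

```go
package main

import "fmt"

// aliasVsClone modifies the original slice after making both a plain
// assignment (shallow) and an explicit copy (deep for value elements),
// and returns what each one sees at index 0.
func aliasVsClone() (alias0, clone0 int) {
	a := []int{1, 2, 3}

	alias := a // copies only the header; same backing array

	clone := make([]int, len(a)) // separate backing array
	copy(clone, a)

	a[0] = 100
	return alias[0], clone[0]
}

func main() {
	fmt.Println(aliasVsClone()) // 100 1
}
```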
So if you got a pointer to a vector element, it now points to garbage.
Ah, I would beg to differ!
You should never be taking pointers to a dynamically resizable array, in any language. (Well, caveat: it's fine if you do it only for a time period where you know the array won't be growing.) The whole point of a dynamically resizable array is that its addresses can change!
If you did this in C++, you'd get undefined behavior. In Go you get "safe" but probably-not-what-you-wanted behavior. In Rust it simply wouldn't be possible (w/o unsafe), and you'd have to use indices (which is the correct thing to do, in any language).
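In Go the index-based version might look something like the following. It is just a sketch with made-up names; the point is that the position survives any number of reallocations while an address would not.

```go
package main

import "fmt"

// updateViaIndex remembers an element by position, grows the slice well
// past its original capacity, and then updates the element.
func updateViaIndex() string {
	items := []string{"first"}
	idx := 0 // stable handle: a position, not an address

	for i := 0; i < 100; i++ {
		items = append(items, "filler") // reallocates several times
	}

	items[idx] = "updated" // always hits the current backing array
	return items[idx]
}

func main() {
	fmt.Println(updateViaIndex()) // updated
}
```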
I kinda know enough now to avoid this, but I have to be careful and remind myself it's a possibility.
I'd love some built-in method to be able to tell whether altering a value in slice A will also alter the value in slice B (i.e. whether A and B are referring to the same backing array). As far as I'm aware there's no easy way of doing this in Go.
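There is a rough heuristic that gets part of the way there, sketched below. sameTail is a made-up helper, not a standard function: it extends both slices to their full capacity and compares the address of the last slot. The big assumption is that nobody has limited the capacity with a three-index slice expression (s[i:j:k]); if they have, this can report false for slices that do share an array.

```go
package main

import "fmt"

// sameTail reports whether a and b end at the same address once each is
// extended to its full capacity. Heuristic only: capacity-limited slices
// of one array can produce false negatives.
func sameTail(a, b []int) bool {
	if cap(a) == 0 || cap(b) == 0 {
		return false
	}
	return &a[:cap(a)][cap(a)-1] == &b[:cap(b)][cap(b)-1]
}

func main() {
	base := make([]int, 5, 10)
	a, b := base[1:3], base[2:4]
	fmt.Println(sameTail(a, b)) // true

	// Appending to a full-capacity view forces a new backing array.
	grown := append(base[:cap(base)], 1)
	fmt.Println(sameTail(base, grown)) // false
}
```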
> To programmers from other languages, such as C or C++, the concept of pointers to dynamically extensible arrays seems like a perfectly decent idea
Write Go in Go, don’t write C in Go. (Which applies to every language, tbh.)
The GC only operates on “dead” allocations (afaik it remains non-moving) so it’s not a concern for now.
Go’s map is (AFAIK) a pointer to the hmap structure where everything happens, so it does behave like a reference type (all updates in the callee will be visible to the caller) but even then it can be useful to have a pointer to one so you can reset it without mutating it in-place. While that can be a bit weirder, it is also much safer. Especially given Go’s maps are not thread-safe (and in fact not memory-safe under concurrent updates).
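A small sketch of what I mean, with function names invented for the example. Updates through a map argument are visible to the caller, replacing the map inside the callee is not, and a pointer to the map lets you swap in a fresh one while anyone holding the old map keeps their data.

```go
package main

import "fmt"

// addEntry mutates the shared map; the caller sees the new key.
func addEntry(m map[string]int) { m["added"] = 1 }

// replaceLocal only rebinds the callee's copy of the map value.
func replaceLocal(m map[string]int) {
	m = map[string]int{}
	_ = m
}

// replaceViaPointer swaps the caller's variable to a fresh map.
func replaceViaPointer(pm *map[string]int) { *pm = map[string]int{} }

// mapDemo returns the caller-visible sizes after each step, plus the size
// of the original map as seen through a second reference.
func mapDemo() (afterAdd, afterLocal, afterPtr, oldLen int) {
	m := map[string]int{"a": 1}
	old := m

	addEntry(m)
	afterAdd = len(m)

	replaceLocal(m)
	afterLocal = len(m)

	replaceViaPointer(&m)
	afterPtr = len(m)

	return afterAdd, afterLocal, afterPtr, len(old)
}

func main() {
	fmt.Println(mapDemo()) // 2 2 0 2
}
```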
Pointers to "fat pointer" types are sometimes needed, just like you sometimes need pointers-to-pointers.
One pitfall is when getting a slice by value in a function. You cannot be sure that someone is not going to pass you a slice into a buffer that they themselves use, so you have to be careful when appending - someone might be using that buffer and you’ll be writing over it.
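Here is a sketch of that pitfall and one way a caller can guard against it. addSuffix is a hypothetical function standing in for "some code that appends to a slice it was handed"; the guard uses a three-index slice expression to cap the capacity so that append has to allocate.

```go
package main

import "fmt"

// addSuffix appends to a slice it received by value. If the slice has
// spare capacity, the write lands in the caller's backing array.
func addSuffix(part []byte) []byte {
	return append(part, '!')
}

// overwriteDemo passes a sub-slice with spare capacity.
func overwriteDemo() string {
	buf := []byte("hello world")
	word := buf[:5] // len 5, capacity reaches to the end of buf's array
	_ = addSuffix(word)
	return string(buf)
}

// guardedDemo caps the capacity so append must copy elsewhere.
func guardedDemo() string {
	buf := []byte("hello world")
	word := buf[:5:5] // len 5, cap 5
	_ = addSuffix(word)
	return string(buf)
}

func main() {
	fmt.Println(overwriteDemo()) // hello!world
	fmt.Println(guardedDemo())   // hello world
}
```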
Obviously not given Go was never simple in the first place. Go was built to be easy — for a certain value of easy.
Simple tools are often not easy, and simple programming languages are definitely not easy: they tend to be built out of a small set of very powerful concepts which are directly exposed to the language user, said language user has close to the power of the language designer in building abstractions. Lisps, and Smalltalks and Forths are simple, which means they are mind-bending and not only can you build what you want out of them (hello turing equivalence) you can build how you want.
And of course the simplest of languages (the turing tarpits) are barely usable at all.
Simple languages such as C or Lisp, simple in that their grammar is simple, are definitely less easy for the programmer than, say, Go. But C is not simple as an experience, since it forces the dev to mentally complect so many concepts in order to get things done: macros, memory management, etc. Lisp is complex in a different way: metaprogramming, DSLs, and deep abstractions are the norm. So simplicity/complexity tends to be something of a whack-a-mole. It's a lower bound, much like the uncertainty principle; you can always add complexity.
Go makes a lot of choices that try to really optimize the user_complexity * language_complexity product.
- If you don't know where a slice came from (such as getting it passed as an argument), treat it as read-only. Any exceptions should be documented.
- If you're going to mutate a slice, make it a private member of a struct and don't give out direct access to it.
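A sketch of how those two rules might look together. The Queue type and its methods are made up for illustration; the idea is simply to copy on the way in and on the way out so that nobody outside the struct shares its backing array.

```go
package main

import "fmt"

// Queue is a hypothetical type that owns its slice privately.
type Queue struct {
	items []string
}

// NewQueue copies the input, since we don't know who else holds it.
func NewQueue(initial []string) *Queue {
	return &Queue{items: append([]string(nil), initial...)}
}

// Push is the only place the slice is mutated.
func (q *Queue) Push(s string) {
	q.items = append(q.items, s)
}

// Items returns a copy, never the internal slice.
func (q *Queue) Items() []string {
	out := make([]string, len(q.items))
	copy(out, q.items)
	return out
}

func main() {
	in := []string{"a"}
	q := NewQueue(in)
	in[0] = "changed by caller" // does not affect q
	q.Push("b")

	got := q.Items()
	got[0] = "changed by reader" // does not affect q either

	fmt.Println(q.Items()) // [a b]
}
```

The copies cost allocations, so this is a trade of some performance for not having to reason about aliasing.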
I read this immediately as "create a new copy of the original slice with one additional element", so I presumed that was the case. The opposite would actually be shocking: that I could end up modifying the original slice (from before the append) through a pointer into the new s, which seems to be the case!
Big gotcha there: treat slices as stateful at all times.
Since it has an assignment operation, it must be creating something new, otherwise it would have been a method of the slice itself
EDIT: I just realized the gotcha is not there at all, Go would consider the first slice to be of N length and the second slice of length N+1. Comparing the two slices would give an error at some point because one is shorter than the other, so the fact that the address changes or not is irrelevant. However I can see this becoming problematic with pointers, which proves the point of the article.
Edit: we would both learn something if you offered a counterexample instead of downvoting.
In C++ this is UB, which is bad, but in keeping with the rest of the language.
In Rust, the compiler will not allow you to do any operation that would re-allocate the backing store whilst there are outstanding references into it.
In most other languages (e.g. Java, C#, Python, etc.), you can't get a pointer/reference to an array index, only a pointer/reference to the item at that index at the time you looked.
Go's decision here is especially weird given that this same thing is seemingly prevented for maps (why the inconsistency?).
Given the three goals of memory-safety, "simplicity" and performance, it's true there are not many other options Go could have chosen, but personally I think Go's interpretation of "simplicity" is incredibly warped: it's a kind of superficial simplicity that leads to programs that are much more complicated to reason about.