It would be neat if we could decompose unsafe like so: "unsafe[this_feature, that_feature] {}". The unqualified "unsafe" could still grant free rein, but you could opt in to "only let me violate these specific rules." It would be a hint to maintainers and might help the std lib and other core libraries become/remain defect-free.
Another interesting "oh shoot" with unsafe that I'm curious about: when I intentionally/unintentionally alias two variables in my unsafe block, this invalidates assumptions made elsewhere in safe code. It's my unsafe block's bug, but it seems like something that could take a good while of debugging to attribute back to my unsafe block. I don't think there's a good resolution to this one, other than perhaps documentation/best practices.
Note that we sort of made a "new" kind of unsafe with the UnwindSafe trait: https://doc.rust-lang.org/std/panic/trait.UnwindSafe.html
That's probably how we intend to solve these kinds of problems in the future.
Re: aliasing -- if it's a serious enough problem, one of two things will happen:
* Someone will develop a version of asan/ubsan for Rust.
* The Rust devs will be forced to reduce the extent to which they apply alias analysis by default (possibly with a flag to opt into it). At least temporarily.
The Rust devs have backed off optimizations in the past when they broke stuff in the ecosystem (struct layout optimization). But they also work with the affected devs to fix those bugs so they can turn the optimization back on.
This already happened (japaric [1]). But ASan won't save you from a bug due to optimization-because-I-assumed-these-locations-dont-alias (maybe TSan might?).
[1] https://users.rust-lang.org/t/howto-sanitize-your-rust-code/...
You are correct: it's possible to write nefarious code inside an 'unsafe' block then only suffer its effects outside of it, and Rust has to document that fact. The Nomicon, mentioned in the blog post a whole bunch, points this out early on:
> 'unsafe' does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.
[ https://doc.rust-lang.org/nightly/nomicon/working-with-unsaf... ]
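A minimal sketch of that module-boundary idea (the type name and invariant here are illustrative, not from the Nomicon):

```rust
mod ascii {
    /// Invariant: `bytes` is always valid ASCII (and therefore valid UTF-8).
    /// The field is private, so only code in this module can touch it.
    pub struct Ascii { bytes: Vec<u8> }

    impl Ascii {
        pub fn new(bytes: Vec<u8>) -> Option<Ascii> {
            if bytes.iter().all(|b| b.is_ascii()) {
                Some(Ascii { bytes })
            } else {
                None
            }
        }

        pub fn as_str(&self) -> &str {
            // Sound only because the constructor checked the invariant and
            // privacy prevents any other module from breaking it.
            unsafe { std::str::from_utf8_unchecked(&self.bytes) }
        }
    }
}

fn main() {
    let s = ascii::Ascii::new(b"hello".to_vec()).unwrap();
    println!("{}", s.as_str()); // prints "hello"
}
```

If `bytes` were public, any safe code anywhere could stuff non-UTF-8 into it and make `as_str` unsound, which is exactly the "pollutes a whole module" point.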
I sometimes feel the same way, but remember that the `unsafe` keyword only unlocks four additional features:
1. Dereferencing a raw pointer
2. Calling an unsafe function or method
3. Accessing or modifying a mutable static variable (and this might conceivably even be removed entirely someday)
4. Implementing an unsafe trait
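A quick sketch exercising three of the four (the fourth, implementing an unsafe trait, is shown only as a comment):

```rust
static mut COUNTER: u32 = 0;

// 2. An unsafe function: callers promise not to race on COUNTER.
unsafe fn bump() { unsafe { COUNTER += 1; } }

fn main() {
    let x = 42u32;
    let p = &x as *const u32;  // creating a raw pointer is safe...
    unsafe {
        println!("{}", *p);    // 1. ...dereferencing it is not (prints 42)
        bump();                // 2. calling an unsafe function
        let c = COUNTER;       // 3. reading a mutable static
        println!("{}", c);     // prints 1
    }
    // 4. would look like: unsafe impl Send for MyType {}
}
```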
It's unclear to me how to make this any more fine-grained such that annotating the "kind" of unsafe you're using would be useful and enforceable by the compiler (which is crucial, because otherwise why not just use a comment?).
In practice, I think the only "distinction" in unsafe Rust that I want is the ability to distinguish unsafe blocks that exist only to call external C code.
Do you mean mutable static variables being removed, or just the unsafety of accessing them? Didn't Rust, at one early point, disallow mutable global variables entirely?
As part of the unsafe code guidelines effort, there's been talk of adding an 'unsafe checker' mode to rustc, analogous to valgrind or Clang's AddressSanitizer, which would alter code generation to add checks that could catch many classes of incorrect behavior at runtime. (This would have a high performance cost and would be intended as a debugging tool.) One of the things it would probably do is keep a global map of all live references, and complain if references are created that break the rules, e.g. a mutable reference is created to something that already has a (mutable or immutable) reference somewhere else in the program. Thus it could catch the kind of bug you mentioned.
Of course, you would have to remember to run the checker, and as a dynamic rather than static analysis it would only catch errors that are actually exhibited at runtime (so it probably wouldn't catch the MutexGuard example from the original blog post, unless there was some real code that raced on a MutexGuard). Still, in practice it should help a lot with ensuring that unsafe code doesn't break the rules.
Edit: Niko talked about this in a blog post in February. He proposes a somewhat more complex tracking system than the global list of references I mentioned:
http://smallcultfollowing.com/babysteps/blog/2017/02/01/unsa...
If implemented, it would probably outperform Haskell (given that monad transformers in Haskell have runtime overhead, unfortunately).
edit: Go also has zero-sized types (struct{}), so I wonder if this is possible there too? Probably not, since the compiler doesn't see through interfaces.
No. It specifically uses Rust's generics system, and the fact that generics are monomorphized at compile time, whereas Go interfaces are not.
C++ templates can be used in similar ways.
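A hedged sketch of the difference (the type names here are made up): Rust monomorphizes generics, so a zero-sized type parameter compiles away entirely, while a Go interface value always carries a pointer at runtime.

```rust
use std::mem::size_of;

// A zero-sized "unit tag" type: all its meaning lives in the type system.
struct Meters;

#[allow(dead_code)]
struct Length<Unit> {
    value: f64,
    _unit: std::marker::PhantomData<Unit>,
}

fn main() {
    // The unit parameter adds no runtime cost: Length<Meters> is just an f64.
    println!("{}", size_of::<Length<Meters>>()); // prints 8
    println!("{}", size_of::<f64>());            // prints 8
}
```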
In a lot of ways it makes me trust Rust even more, because there is a deeper understanding of exactly how these guarantees are made.
Additionally, not having the facility for low-level/unchecked code just means that things like optimised data structures/memory management/hardware interaction get implemented either in the compiler or in other languages. The former is much harder to reason about and to modify: one is essentially writing code that generates compiler IR, which is more annoying and error-prone than both just writing the code directly and just writing the IR directly (one way to think about this is that the compiler is one big `unsafe` block). The latter is unfortunate because it results in impedance mismatches when doing the FFI calls, both semantically and with performance, and it also means that code doesn't get to benefit from the usual Rust safety checks and high-level features (like ADTs) that are all still available inside `unsafe` blocks.
let p = 0xb8000 as *mut u8;
VGA drivers use the memory mapped at 0xb8000 to drive the device. This creates a pointer, p, to that address. In order to demonstrate this is safe (to be precise, creating p is safe; writing to or reading from it is not), a language would have to know:
1. That your code is running in kernel mode, that is the entire concept of ring 0 vs ring 3.
2. That the VGA spec specifies that location in memory.
Yeah, in _theory_, you could have a language that does this, but that'd tie your language so, so, so deeply to each platform, that it's not feasible.
This can be extrapolated to all kinds of other low-level things.
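As a sketch, the write itself might look like the following; the base pointer is taken as a parameter here purely for illustration, so the same code can be exercised against an ordinary buffer, but in a kernel you would pass 0xb8000 as *mut u8:

```rust
// Each VGA text cell is two bytes: the character, then a color attribute.
// Safe Rust can't prove `base` points at valid cells; the caller must.
unsafe fn vga_put(base: *mut u8, cell: isize, ch: u8, attr: u8) {
    *base.offset(cell * 2) = ch;
    *base.offset(cell * 2 + 1) = attr;
}

fn main() {
    let mut fake = [0u8; 8]; // stand-in for the real VGA buffer
    unsafe { vga_put(fake.as_mut_ptr(), 1, b'A', 0x0f) };
    println!("{:?}", fake);  // prints [0, 0, 65, 15, 0, 0, 0, 0]
}
```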
That need not be the case though. You could have a kernel side allocator that sets up the MMU to map that memory to a pointer that you return which lives in the space of the process. The MMU would take care of the required arithmetic to access the memory at its actual location using an offset.
That way you can map resources from real addresses into arbitrary addresses on the user side.
I think the correct term for this mechanism is 'system address translation'.
Like, when you compile for x86 there are a bunch of rules that aren't generally safe, but on that platform they are.
* All unsafe operations don't exist.
* All unsafe operations exist, but the literal unsafe keyword and its machinery don't.
The latter is how most ostensibly safe languages work. See Haskell's unsafePerformIO, Swift's UnsafePointer, and Java's JNI for three examples off the top of my head.
The former is just a really gimped language that would have been a pain in the neck to implement libraries for (see other replies for examples).
A lot of built-in constructs use unsafe under the hood. Vec (Rust's dynamic array) does memory allocation/resizing under the hood, for example, and there's no safe way to do that unless some safer array-allocation primitive is exposed.
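A hedged sketch of the sort of thing Vec does internally (greatly simplified): grab raw memory from the allocator, write into it without reading the uninitialized bytes, and free it by hand.

```rust
use std::alloc::{alloc, dealloc, Layout};

fn main() {
    let layout = Layout::array::<i32>(4).unwrap();
    unsafe {
        let p = alloc(layout) as *mut i32;
        assert!(!p.is_null(), "allocation failed");
        for i in 0..4 {
            // `write` stores into the slot without dropping or reading
            // whatever uninitialized garbage was there before.
            p.add(i).write((i as i32) * 10);
        }
        println!("{}", *p.add(3)); // prints 30
        dealloc(p as *mut u8, layout);
    }
}
```

None of this can be checked by the borrow checker; the code itself must uphold "never read before write, free exactly once."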
Also, things like mem::swap(x, y) cannot be implemented at all in safe Rust. In order to perform the swap, you need a temporary variable, and moving a value out through a reference would leave the source uninitialized, which safe Rust does not allow.
Note that in C++ std::swap invokes the copy constructor - http://www.cplusplus.com/reference/algorithm/swap/ - but Rust's mem::swap works for types that do not implement the Copy trait (Rust's rough equivalent of a copy constructor).
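A hedged sketch of how a swap can be written with unsafe (not necessarily the actual std implementation): bitwise-copy both values through raw pointers, which sidesteps both Copy and the initialization rules.

```rust
use std::ptr;

fn my_swap<T>(x: &mut T, y: &mut T) {
    unsafe {
        // ptr::read does a bitwise copy without requiring T: Copy,
        // temporarily creating duplicate "live" values that the writes
        // below immediately resolve (so nothing is dropped twice).
        let tmp = ptr::read(x);
        let y_val = ptr::read(y);
        ptr::write(x, y_val);
        ptr::write(y, tmp);
    }
}

fn main() {
    let mut a = String::from("first");
    let mut b = String::from("second");
    my_swap(&mut a, &mut b);
    println!("{} {}", a, b); // prints "second first"
}
```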
The same can be said of the slice-splitting functions, which are primarily used to work around Rust's borrow checker.
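For instance, split_at_mut hands back two mutable borrows into one slice, something the borrow checker would reject if you tried it directly; internally it boils down to a couple of raw-pointer operations.

```rust
fn main() {
    let mut a = [1, 2, 3, 4];
    // Two simultaneous &mut into the same array, which plain safe
    // borrowing forbids; split_at_mut guarantees the halves don't overlap.
    let (left, right) = a.split_at_mut(2);
    left[0] += 10;
    right[0] += 10;
    println!("{:?}", a); // prints [11, 2, 13, 4]
}
```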
IIUC, technically, the bug was a missing implementation of a trait and the result was a data race (which I (weirdly, maybe) don't think of as memory safety).
In other words, TL;DR: magic is neat, except that sometimes it really sucks.
I may have misunderstood Ralf's bug. Is it really the case that MutexGuard<T> was seen as Sync if T was Send, rather than Sync? Wouldn't that be a bigger problem than just the case of MutexGuard?
One framing of the MutexGuard problem is that the type wasn't declared in a way that reflected its semantics best, although it is clearly unfortunate that doing this is more complicated than the incorrect way.
So T: Sync if &T: Send. MutexGuard internally contains a &Mutex<T> (and Poison, but that's irrelevant here). T was Cell<i32>. If you follow the rabbit hole, you'll net out that T was Send, and therefore MutexGuard was Sync.
You could imagine an alternate world where MutexGuard is Send, to allow transfer of ownership of a lock to a different thread while keeping the mutex locked. But that would mean &MutexGuard is Sync, WTF?
You understood the bug correctly, but it's not a bigger problem. You're probably lacking context on auto traits, but this blog post contains the context you need if you read it again.
"This means that the compiler considers a type like MutexGuard<T> to be Sync if all its fields are Sync."
Is that true in general? Is a type thread safe if all its fields are thread safe individually?
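The compiler's rule is purely structural; a hedged sketch (the type names are made up):

```rust
use std::cell::Cell;

// Sync is an auto trait: a struct is Sync iff every field is Sync.
#[allow(dead_code)]
struct AllSync { a: i32, b: String }  // both fields Sync => AllSync: Sync
#[allow(dead_code)]
struct HasCell { c: Cell<i32> }       // Cell<i32>: !Sync => HasCell: !Sync

fn assert_sync<T: Sync>() {}

fn main() {
    assert_sync::<AllSync>();
    // assert_sync::<HasCell>(); // does not compile: Cell<i32> is not Sync
    println!("ok");
}
```

Whether that structural rule coincides with a type actually being thread safe is exactly the question the blog post raises: it can say Sync for a type whose API semantics aren't, which is why types like MutexGuard need manual (negative) impls.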
2D Second Life clone, with full programming capability with built-in database - 2 megabytes. Solid ASM. Rust can't even come close, and never will.
And take 100x longer to develop :)