Rust has the goal of putting as much smartness as possible between what you type and what the compiler produces, which is a perfectly fine goal, but without a specification for the input semantics it's a pretty wobbly thing, especially for a systems programming language. A lot of bit packing code is simply not writable in Rust today without immediately invoking UB; that works for now but might break with any bugfix release.
I don't know about Zig, but as for C, this take is not just wrong, but dangerously wrong. The behavior of C code isn't defined by the underlying assembly, but by an abstract machine model that may or may not match your real machine (narrator: It doesn't). As a result, the compiler can and will ignore the intention of your code when it can prove that your code would invoke UB. Good example: signed integer arithmetic and overflow tests. It is very difficult to write overflow tests for signed integer operations without accidentally invoking UB, and modern compilers will simply remove your overflow tests instead of translating them 1:1 to assembly instructions.
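To make the overflow-test point concrete, here is a minimal sketch in Rust (my choice of language for illustration, not the commenter's). `add_would_overflow` is a hypothetical helper name; it compares against the type's limits before adding, so the addition that might overflow is never performed. The same shape is the usual way to write such a test in C without relying on the overflow itself. `checked_add` is the standard-library method that does the equivalent check for you.

```rust
/// Hypothetical helper: reports whether `a + b` would overflow an i32.
/// The test is done on the operands, so no overflowing addition happens.
fn add_would_overflow(a: i32, b: i32) -> bool {
    if b > 0 {
        // i32::MAX - b cannot overflow because b is positive.
        a > i32::MAX - b
    } else {
        // i32::MIN - b cannot overflow because b is zero or negative.
        a < i32::MIN - b
    }
}

/// The standard library's version: None on overflow, Some(sum) otherwise.
fn safe_add(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}
```

The tempting alternative, adding first and then checking whether the result "went negative", is exactly the kind of test that depends on the overflow having already happened.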
Rust simply lacks that mapping, yet.
However, Rust and C alike have another whole dimension to their semantics, that of Undefined Behavior, which is not reflected in the assembly, and which needs to be taken into account for unsafe code authors (in Rust) / by all programmers (in C). See for example https://www.ralfj.de/blog/2019/07/14/uninit.html for what goes wrong when you think of C as just a macro assembler.
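As a rough illustration of how unsafe Rust makes this dimension explicit, the sketch below uses the standard-library type `MaybeUninit` to hold memory that has not been initialized yet. `make_squares` is a hypothetical function name. The point is only that every slot is written before it is read; reading a slot earlier would be Undefined Behavior even though the resulting assembly might look harmless.

```rust
use std::mem::MaybeUninit;

/// Hypothetical example: fills a 4-element array one slot at a time,
/// never reading a slot before it has been written.
fn make_squares() -> [u32; 4] {
    // The slots start out uninitialized; the type records that fact.
    let mut slots: [MaybeUninit<u32>; 4] = [MaybeUninit::uninit(); 4];
    for (i, slot) in slots.iter_mut().enumerate() {
        let n = i as u32;
        slot.write(n * n);
    }
    // Every slot has been written above, so assuming initialization is sound.
    slots.map(|s| unsafe { s.assume_init() })
}
```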
Just today we had this nice example on the front page where initializing a variable differently completely changed the codegen for large parts of the program: https://jpieper.com/2022/08/05/debugging-bare-metal-stm32-fr...
No there isn't. "Something von neumann-ish" would have behaviour for all inputs - maybe not desirable behaviour, maybe even different behaviour on different processor revisions, but it would have behaviour. The C abstract machine doesn't.
That applies to pretty much all imperative languages that compile directly to assembly/machine code – including Rust.
Take a look at the languages that Rust was influenced by (https://en.wikipedia.org/wiki/Rust_(programming_language)): those aren't languages with straightforward compilation semantics.
There is a reason why Rust has a datalog engine built into the compiler (https://github.com/rust-lang/datafrog), which is imho totally rad and awesome, but really hard to fully form a mental model of without a spec.
The fact that you point to datafrog is illustrative: while it is not actually built into the compiler today, the use case for it is borrow checking, which famously does not impact the language's operational behavior at all! It is purely a compile-time analysis that does not influence code generation.
(For example, consider that mrustc and the GCC frontend are both able to omit borrow checking entirely and yet still produce runnable binaries!)
Of course, the borrow checker is still something that you need to build a mental model for, but by design it is okay for you to get that wrong occasionally, because the result can only ever be "your program still does what you thought, but the analysis proved that in a way you did not expect" or "your program does not compile."
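A small sketch of those two outcomes, with `first_after_push` as a hypothetical function name. The commented-out version is the "does not compile" case; the accepted version does the same job, just arranged so the analysis can see that it is fine.

```rust
/// Hypothetical example: returns the first element as it was before
/// `x` gets appended, or None if the vector was empty.
fn first_after_push(v: &mut Vec<i32>, x: i32) -> Option<i32> {
    // Rejected by the borrow checker (error E0502):
    //     let first = v.first()?; // shared borrow of the vector starts
    //     v.push(x);              // mutable borrow while `first` is still live
    //     Some(*first)            // `first` is used here
    //
    // Accepted: copy the value out, so the shared borrow ends before the push.
    let first = *v.first()?;
    v.push(x);
    Some(first)
}
```

Either way the program that reaches code generation has the same operational behavior; the rejected variant simply never becomes a binary under rustc.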
Same applies to the Java and .NET ecosystems: either you swim on the surface, or you really get to know how the implementations work, down to bytecode, JIT, GC and standard libraries, and now they are full speed ahead with 6-month release schedules.
I've recently tried Zig and switched to it instantly. It's hard to explain, but basically Andrew has very good taste in picking important features and keeping the language complexity very low.
You know how it takes some time to learn the borrow checker and macros and generics and traits and all the weird rules of what you cannot do, and then trait bounds, and then it doesn't work exactly like you need, or the crate does not support something and you cannot implement it yourself, etc. etc.
So in Zig I had a hello world on day one, and the first thing I did was encode/decode ANY json messages for tagged unions (which unfortunately is not supported in std, but it was very easy to do it myself) and it worked! I did this on the first day in an entirely new language (I was not even doing C/C++ before). It would probably take a few days in Rust, I'd probably mess something up, and I know it wouldn't work for every case and every crate, because of the orphan rule. In Zig it would work for any struct, internal or external. And also Debug, Display, Eq, Partial, all of that works automatically. That's huge!
And the worst thing is that recently, I've started using pointers again, and when I look at the code I don't see anything unsafe in the structure itself. It can be used unsafely, but that is also very easy to fix in Zig because you have these explicit allocators and it's so easy to put everything in the same arena transparently, or use SegmentedList with stable pointers.
If your compiler merely translates a source line into a series of assembly mnemonics, function calls, or interpreter gotos, then the interface is the implementation. You can rely on the underlying target language to provide your program with meaning and the only people who have to care are people reimplementing your compiler for compatibility.
The moment you start talking about optimization, then this no longer works. You no longer have a correspondence between source and compiled forms of one program. You have a many-to-many relationship where one source form can be compiled into hundreds of binaries depending on how the compiler is configured, and many source forms may actually optimize to the exact same compiled form. This requires you to provide your own semantics, else compiled programs have no meaning and -O3 becomes shorthand for "make demons fly out my nose".
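The "many source forms, one meaning" half of this can be sketched with two hypothetical Rust functions, `sum_loop` and `sum_closed`. They are written differently but have the same observable behavior for every input, so an optimizer that is defined in terms of language semantics is free to compile them to the same thing. Whether any particular compiler actually does so is not something this example claims.

```rust
/// Hypothetical example: sums 1..=n with an explicit loop.
fn sum_loop(n: u32) -> u64 {
    let mut total = 0u64;
    for i in 1..=u64::from(n) {
        total += i;
    }
    total
}

/// Hypothetical example: the same sum via the closed form n(n+1)/2.
/// The product fits in a u64 for every u32 input.
fn sum_closed(n: u32) -> u64 {
    let n = u64::from(n);
    n * (n + 1) / 2
}
```

Saying that these two are "the same program" is only possible because the language defines what each one means independently of the instructions that come out.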
In the case of C they came up with a series of rules for what-not-to-do that both did not match existing language semantics and also were dangerously incomplete. There are still C programmers who insist that you can free() memory but still touch it for a "little while"[0], or access memory "off the end" of an allocation[1], for example. And ISO C still made the mistake of retaining pointers, which are a confusing mix of value and reference type. They aren't references because you are allowed to cast them to and from integers; and they can't be values because you can use them to modify other values. Because of this tension, we keep discovering new combinations of valid transformations on valid programs that cause miscompiles, and then we have to invent things like pointer provenance to fix them.
As far as I'm concerned, the only difference between Rust and C is that Rust is honest about its cleverness. C has to pretend to be simple while also out-clevering Rust (or at least, the safe subset of Rust).
[0] Usually in an attempt to emulate automatic memory management. Manual memory management does not work when passing complex structures across an API boundary, and the only options are to either expose custom deallocators (which means no optimizations even when they are sound), tell callers how to deallocate the data (which means no changing the data), or hack the allocator to do what you really want.
[1] It works for malware developers, it should work for me, right?
That seems like an arbitrary criterion. They're values because they behave like values when copying them around etc. You can't even use them to modify other values implicitly - you still need to use * to get an actual dereferenceable/assignable lvalue out of the pointer.
My point is that because pointers act like both values and references, they are neither values nor references. This makes it impossible to soundly reason about them.
No it doesn't. "smartness" is a means, not an end or a goal.
> Hear more from Reliability Project Director
https://mobile.twitter.com/rust_foundation/status/1385310806...