https://www.youtube.com/watch?v=CuD7SCqHB7k
Midori and Rust have several striking similarities, and Microsoft's recent uptick in interest in Rust (and Rust-like languages) bodes well for improved software quality.
Right. I've been saying that for, sadly, decades now, ever since my proof of correctness days. I sometimes word this as the three big questions: "how big is it", "who owns it", and "who locks it". C is notorious for totally failing to address any of those, which is the cause of many of the troubles in computing.
Then, for way too long, we had the "but we can't afford memory safety" era, or "Python/Java/Javascript/whatever are too slow". It took way too long to get past that. Ada was supposed to address safety with speed, but it never caught on.
Rust looked promising at first. Huge step in the right direction with the borrow checker. But Rust was captured by the template and functional people, and seems to have become too complicated to replace C and C++ for the average programmer. We have yet to see the simplicity and stability of Go combined with the safety of Rust.
Duffy makes the point that slices are an important construct for safety. Most things expressed with pointer arithmetic in C are expressible as slices. I once proposed slices for C [1] but the political problems outweigh the technical ones. I'd like to see a C/C++ to Rust translator smart enough to convert pointer arithmetic to safe slice syntax. I've seen one that just generates C pointers in Rust syntax, which is not that useful.
The Midori people seem to have gone through the effort of understanding why "unsafe" code was being used, and tried to formalize that area to make it safe again. When I looked at Rust libraries a few years ago, I saw "unsafe" too often, and not just in code that interfaces with other languages.
Duffy writes, in connection with casting: "For example, we did have certain types that were just buckets of bits. But these were just PODs." (Plain Old Data, not some Apple product.) I referred to these as "fully mapped types" - that is, any bit pattern is valid for the type. True for integers and raw characters. Not true for enums, etc. One way to look at casts is to treat them as constructors to be optimized. The constructor takes in an array of bytes and outputs a typed object. If the representation of both is identical, the constructor can be optimized into a byte copy, and maybe even into just returning a reference, if the output is immutable or the constructor consumes the input. So that's a way to look at sound casting.
Once you have that, you can separate allocation from construction and treat a constructor as something that takes in an array of bytes, consumes it, and outputs a typed object. Constructors are now isolated from the memory allocator.
For arrays, you need some way to talk about partially initialized arrays within the language. Then you can build arrays which grow as safe code.
That takes care of some of the common cases for unsafe code. It looks like Duffy's group got that far and went further. I need to read the papers.
[1] http://animats.com/papers/languages/safearraysforc43.pdf
partially initialized arrays
I agree that partially initialized data structures are important for low-level, performance-oriented programming. But it is not clear how to do this within a Rust-style type-based approach to memory safety. Naturally, one can always wrap the problematic code that deals with partially initialised data in an unsafe block, but you can already do that in Rust. By Rice's theorem we know that we cannot have a feasible typing system that always allows us to deal with partially initialised data in a safe way, but could there be a compromise, covering most of the standard uses of partially initialised data (e.g. what you find in a standard algorithms textbook like CLRS, or a standard library like C++'s)? I don't see how right now, because the order of initialisation can be quite complex, too complex to fit neatly into well-bracketed lifetimes a la Rust. Have you got any ideas how to do better?
In the Pascal-F verifier, we treated array cells as if they had a "defined()" predicate. Clearly, if you have a flag indicating whether an array cell was defined, you could handle partially initialized arrays.
Then you prove that such a flag is unnecessary.
We had a predicate
DEFINED(a,i,j)
which means the entries of the array a from i to j are initialized.
Theorems about DEFINED are:
j<i => DEFINED(a,i,j) // empty case is defined
DEFINED(a,i,j) AND DEFINED(a,j+1,k) => DEFINED(a,i,k) // extension
DEFINED(a[i]) => DEFINED(a,i,i) // single element defined
DEFINED(a,i,j) AND k >= i AND k <= j => DEFINED(a[k]) // elt is defined
Given this, you can do inductive proofs of definedness as you use more slots in an array that started out uninitialized. To do proofs like this, you have to be able to express how much of the array is defined in terms of variables in the program. Then you have to prove that each change preserves the "defined" property. Not that hard when you're just growing an array. You can construct data structures where this is hard, such as some kinds of hash tables, but if you can't prove them safe, they probably have a bug.
This was new 40 years ago, but many others have been down this road since. Rice's theorem isn't a problem. That just says that you can construct some undecidable programs, not that all programs are undecidable. If your program is undecidable or the computation required to decide it is enormous, it's broken. Microsoft puts a time limit on their static driver verifier - if they can't prove safety after some number of minutes, your driver needs revision. Really, if you're writing kernel code and you're anywhere near undecidability, you're doing it wrong.
[1] http://www.animats.com/papers/verifier/verifiermanual.pdf
Rust moves system programming up. At the same time, C# moves application development down. Interesting.
This would have been interesting, but it's not too late. Correct me if I'm wrong, but there isn't anything stopping Microsoft from releasing this research to the public today. It seems like something the "new" Microsoft of 2020 would be likely to do.
It has to be said, though. It's more like "source available", not open source. I can't remember what the license is specifically, but it's not a proper open source license.
You could see it, build it, and play with it, but you couldn't do anything else with it. No sharing changes, no derivatives, etc.
Singularity would be a neat place to start working on a managed code OS, though.
I would prefer that they properly open source Midori.
That said, Midori later influenced many teams, including the .NET one.
Related from 2016: https://news.ycombinator.com/item?id=11054912
https://news.ycombinator.com/item?id=10871222
You knew who they were because they wouldn't stop talking about Midori. It's a neat project, I liked hearing the stories.
Turns out when you have a project staffed with extremely high-up engineers and no real production deployment, it's easy to keep up engineering standards. The idea that a normal team with normal issues wasn't as good seemed to really upset one of the Midori people I worked with.
I definitely recall seeing passionate discussions with engineers whose levels ranged from 62 to 69+ where the technical merit of the arguments was the gold standard, not what the most senior engineer said. In contrast, my experience working with partner and DE level engineers in other parts of MS was that of "what I say goes".