undefined | Better HN

0 pointschongli1y ago0 comments

Isn't it the case that once you use unsafe even a single time, you lose all of Rust's nice guarantees? As far as I'm aware, inside the unsafe block you can do whatever you want which means all of the nice memory-safety properties of the language go away.

It's like letting a wet dog (who'd just been swimming in a nearby swamp) run loose inside your hermetically sealed cleanroom.

0 comments

timschmidt1y ago

It seems like you've got it backwards. Even unsafe rust is still more strict than C. Here's what the book has to say (https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html)

"You can take five actions in unsafe Rust that you can’t in safe Rust, which we call unsafe superpowers. Those superpowers include the ability to:

    Dereference a raw pointer
    Call an unsafe function or method
    Access or modify a mutable static variable
    Implement an unsafe trait
    Access fields of a union

It’s important to understand that unsafe doesn’t turn off the borrow checker or disable any other of Rust’s safety checks: if you use a reference in unsafe code, it will still be checked. The unsafe keyword only gives you access to these five features that are then not checked by the compiler for memory safety. You’ll still get some degree of safety inside of an unsafe block.

In addition, unsafe does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems: the intent is that as the programmer, you’ll ensure the code inside an unsafe block will access memory in a valid way.

People are fallible, and mistakes will happen, but by requiring these five unsafe operations to be inside blocks annotated with unsafe you’ll know that any errors related to memory safety must be within an unsafe block. Keep unsafe blocks small; you’ll be thankful later when you investigate memory bugs."

uecker1y ago

This description is still misleading. The preconditions for the correctness of an unsafe block can very much depend on the correctness of the code outside and it is easy to find Rust bugs where exactly this was the cause. This is very similar where often C out of bounds accesses are caused by some logic error elsewhere. Also an unsafe block has to maintain all the invariants the safe Rust part needs to maintain correctness.

lambda1y ago

So, it's true that unsafe code can depend on preconditions that need to be upheld by safe code.

But using ordinary module encapsulation and private fields, you can scope the code that needs to uphold those preconditions to a particular module.

So the "trusted computing base" for the unsafe code can still be scoped and limited, allowing you to reduce the amount of code you need to audit and be particularly careful about for upholding safety guarantees.

Basically, when writing unsafe code, the actual unsafe operations are scoped to only the unsafe blocks, and they have preconditions that you need to scope to a particular module boundary to ensure that there's a limited amount of code that needs to be audited to ensure it upholds all of the safety invariants.

Ralf Jung has written a number of good papers and blog posts on this topic.

1 more reply

dwattttt1y ago

It's true, but I think it's only fair if you hold Rust to this analysis, other languages should too; the scrutiny you're implying you need in an unsafe Rust block needs to be applied to all C code, because all C code could depend on code anywhere else for its safety characteristics.

In practice (in both languages) you check what the actual unsafe code does (or "all" code in C's case), note code that depends on external actors for safety (it's not all C code, nor is it all unsafe Rust blocks), and check their callers (and callers callers, etc).

1 more reply

gf0001y ago

This is technically correct, but a bit pedantic.

Sure, you can technically just write your own vulnerability for your own program and inject it at an unsafe and see the whole world crumble... but the exact same is true for any form of FFI calls in any language. Is Java memory safe? Yeah, just because I can grab a random pointer and technically break anything I want won't change that.

The fact that a memory vulnerability error may either appear at no place at all OR at the couple hundred lines of code thorough the whole project is a night and day difference.

iknowstuff1y ago

No. Correctness of code outside unsafe depends on correctness inside those blocks, not the other way around

1 more reply

Someone1y ago

But “Dereference a raw pointer”, in combination with the ability to create raw pointers pointing to arbitrary memory addresses (that, you can do even in safe rust) allows you to write arbitrary memory from unsafe rust.

So, in theory, unsafe rust opens the floodgates. In practice, though, you can use small fragments of unsafe code that programmers can fairly easily check to be safe.

Then, once you’ve convinced yourself that those fragments are safe, you can be assured that your whole program is safe (using ‘safe’ in the rust sense, of course)

So, there may be some small islands of unsafe code that require extra attention from the programmer, but that should be just a tiny fraction of all lines, and you should be able to verify those islands in isolation.

steveklabnik1y ago

> allows you

This is where the rubber hits the road. Rust does not allow you to do this, in the sense that this is possibly undefined behavior. That "possibly" is why the compiler allows you to write this code, because by saying "unsafe", you are promising that this specific arbitrary address is legal for you to write to. But that doesn't mean that it's always legal to do so.

1 more reply

rybosome1y ago

I believe the post you are replying to was referring to the fact that you could take actions in that unsafe block that would compromise the guarantees of rust; eg you could do something silly, leave the unsafe block, then hit an “impossible” condition later in the program.

A simple example might be modifying a const value deep down in some class, where it only becomes apparent later in the program’s execution. Hence their analogy of the wet dog in a clean room - whatever beliefs you have about the structure of memory in your entire program, and guaranteed by the compiler, could have been undone by a rogue unsafe.

umanwizard1y ago

Rust doesn’t have classes, nor can const values be modified, even in unsafe code. (did you mean “immutable”?)

onnimonni1y ago

Would someone with more experience be able to explain to me why can't these operations be "safe"? What is blocking rust from producing the same machine code in a "safe" way?

NobodyNada1y ago

Rust's raw pointers are more-or-less equivalent to C pointers, with many of the same types of potential problems like dangling pointers or out-of-bounds access. Rust's references are the "safe" version of doing pointer operations; raw pointers exist so that you can express patterns that the borrow checker can't prove are sound.

Rust encourages using unsafe to "teach" the language new design patterns and data structures; and uses this heavily in its standard library. For example, the Vec type is a wrapper around a raw pointer, length, and capacity; and exposes a safe interface allowing you to create, manipulate, and access vectors with no risk of pointer math going wrong -- assuming the people who implemented the unsafe code inside of Vec didn't make a mistake, the external, safe interface is guaranteed to be sound no matter what external code does.

Think of unsafe not as "this code is unsafe", but as "I've proven this code to be safe, and the borrow checker can rely on it to prove the safety of the rest of my program."

1 more reply

vlovich1231y ago

Those specific functions are compiler builtin vector intrinsics. The main reason is that they can easily read past ends of arrays and have type safety and aliasing issues.

By the way, the rust compiler does generate such code because under the hood LLVM runs an autovectorizer when you turn on optimizations. However, for the autovectorizer to do a good job you have to write code in a very special way and you have no way of controlling whether or not it kicked in and once it did that it did a good job.

There’s work on creating safe abstractions (that also transparently scale to the appropriate vector instruction), but progress on that has felt slow to me personally and it’s not available outside nightly currently.

1 more reply

adgjlsfhk11y ago

often the unsafe code is at the edges of the type system. e.g. sometimes the proof of safety is that someone read the source code of the c library that you are calling out to. it's not useful to think of machine code as safe or unsafe. safety often refers to whether the types of your data match the lifetime dataflow.

pclmulqdq1y ago

The way I have heard it described that I think is a bit more succinct is "unsafe admits undefined behavior as though it was safe."

janice19991y ago

Claiming unsafe invalidates "all of the nice memory-safety properties" is like saying having windows in your house does away with all the structural integrity of your walls.

There's even unsafe usage in the standard library and it's used a lot in embedded libraries.

benjiro1y ago

Where are you more likely get a burglar enter your home? Windows ... Where are you more likely to develop cracks in your walls? Windows ... Where are you more likely to develop leaks? Windows (especially roof windows!)...

Sorry but horrible comparison ;)

If you need to rely on unsafe in a memory-safe language for performance reasons, then there is a issue with the language compiler at that point, that needs to be fixed. Simple as that.

The whole memory-safety is the bread and butter of the language, the moment you start to bypass it for faster memory operations, you can start doing the same in any other language. I mean, your literally bypassing the main selling point of the language. \_00_/

unrealhoang1y ago

So static typing is stupid because at the end of the line your program must interface with stream of untyped bits (i/o)?

Once you can internalize that you could unlock the power of encapsulation.

pdimitar1y ago

> If you need to rely on unsafe in a memory-safe language for performance reasons, then there is a issue with the language compiler at that point, that needs to be fixed. Simple as that.

It actually means "Rust needs to interface with many other systems that are not as stringent as it". Your interpretation has nothing to do with what's actually going on and I am surprised you misinterpreted the situation as hugely as you did.

...And even if everything was written in Rust, `unsafe` would still be needed because the lower you get [to the kernel] you get more and more non-determinism at places.

This "all or nothing" attitude is boring and tiring. We all wish things were super simple, black and white, and all-or-nothing. They are not.

sunshowers1y ago

What language is the JVM written in?

All safe code in existence running on von Neumann architectures is built on a foundation of unsafe code. The goal of all memory-safe languages is to provide safe abstractions on top of an unsafe core.

pjmlp1y ago

Depends on which JVM you are talking about, some are 100% Java, some are a mix of Java and C, others are a mix of Java and C++, in all cases a bit of Assembly as well.

throwaway20371y ago

    > What language is the JVM written in?

I am pretty sure it is C++.

I like your second paragraph. It is well written.

pjmlp1y ago

Depends on which JVM you are talking about, some are 100% Java, some are a mix of Java and C, others are a mix of Java and C++, in all cases a bit of Assembly as well.

1 more reply

rat871y ago

I don't think what something was written in should count. Baring bugs it should still be memory safe. But I believe JVM has ffi and as soon as you use ffi you risk messing up that memory safety.

sunshowers1y ago

Does it help to think of "safe Rust" as a language that's written in "unsafe Rust"? That's basically what it is.

LoganDark1y ago

> Isn't it the case that once you use unsafe even a single time, you lose all of Rust's nice guarantees?

No, not even close. You only lose Rust's safety guarantees when your unsafe code causes Undefined Behavior. Unsafe code that can be made to cause UB from Safe Rust is typically called unsound, and unsafe code that cannot be made to cause UB from Safe Rust is called sound. As long as your unsafe code is sound, then it does not break any of Rust's guarantees.

For example, unsafe code can still use slices or references provided by Safe Rust, because those are always guaranteed to be valid, even in an unsafe block. However, if from inside that unsafe block you then go on to manufacture an invalid slice or reference using unsafe functions, that is UB and you lose Rust's safety guarantees because of the UB.

wongarsu1y ago

If your unsafe code violates invariants it was supposed to uphold, that can wreck safety properties the compiler was trying to uphold elsewhere. If you can achieve something without unsafe you definitely should (safe, portable simd is available in rust nightly, but it isn't stable yet).

At the same time, unsafe doesn't just turn off all compiler checks, it just gives you tools to go around them, as well as tools that happen to go around them because of the way they work. Rust unsafe is this weird mix of being safer than pure C, but harder to grasp; with lots of nuanced invariants you have to uphold. If you want to ensure your code still has all the nice properties the compiler guarantees (which go way beyond memory safety) you would have to carefully examine every unsafe block. Which few people do, but you generally still end up with a better status quo than C/C++ where any code can in principle break properties other code was trying to uphold.

xboxnolifes1y ago

If you have 1 unsafe block, and you have a memory related crash/issue, where in your Rust code do you think the root cause is located?

This isn't a wet dog in a cleanroom. This is cleanroom complex that has a very small outhouse that is labeled as dangerous.

EnnEmmEss1y ago

Jason Ordendorff's talk [1] was probably the first time I truly grokked the concept of unsafe in Rust. The core idea behind unsafe in Rust is not to provide an escape from the guarantees provided by rust. It's to isolate the places where you have no choice but to break the guarantees and rigorously code/test the boundaries there so that anything wrapping the unsafe code can still provide the guarantees.

[1]: https://www.youtube.com/watch?v=rTo2u13lVcQ

andyferris1y ago

Rust isn't the only memory-safe language.

As soon as you start playing with FFI and raw pointers in Python, NodeJS, Julia, R, C#, etc you can easily loose the nice memory-safety properties of those languages - create undefined behavior, segfaults, etc. I'd say Rust is a lot nicer for checking unsafe correctness than other memory-safe languages, and also makes it easier to dip down to systems-level programming, yet it seems to get a lot of hate for these features.

johnisgood1y ago

Ada is even much more better at checking for correctness. It needs to be talked about more. "Safer than C" has been Ada, people did not know this before they jumped on the Rust bandwagon.

vlovich1231y ago

You only lose those guarantees if and only if the code within the unsafe block violates the rules of the Rust language.

Normally in safe code you can’t violate the language rules because the compiler enforces various rules. In unsafe mode, you can do several things the compiler would normally prevent you from doing (e.g. dereferencing a naked pointer). If you uphold all the preconditions of the language, safety is preserved.

What’s unfortunate is that the rules you are required to uphold can be more complex than you might anticipate if you’re trying to use unsafe to write C-like code. What’s fortunate is that you rarely need to do this in normal code and in SIMD which is what the snippet is representing there’s not much danger of violating the rules.

SkiFire131y ago

You lose the nice guarantees inside the `unsafe` block, but the point is to write a sound and safe interface over it, that is an API that cannot lead to UB no matter how other safe code calls it. This is basically the encapsulation concept, but for safety.

To continue the analogy of the dog, you let the dog get wet (=you use unsafe), but you put a cleaning room (=the sound and safe API) before your sealed room (=the safe code world)

selfmodruntime1y ago

> Isn't it the case that once you use unsafe even a single time, you lose all of Rust's nice guarantees

Inside that block, both yes and no. You have to enforce those nice guarantees yourself. Code that violates it will still crash.

CooCooCaCha1y ago

I wouldn’t go that far. Bevy for example, uses unsafe internally but is VERY strict about it, and every use of unsafe requires a comment explaining why the code is safe.

In other words, unsafe works if you use it carefully and keep it contained.

tonyhart71y ago

right, the point is raising awareness and assumption its not 100 and 0 problem

timeon1y ago

> unsafe even a single time, you lose all of Rust's nice guarantees

Not sure why would one resulted in all. One of Rust's advantages is the clear boundary between safe/unsafe.

tmtvl1y ago

Is there such a boundary? How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

The usual retort to these questions is 'well, the standard library uses unsafe code, so everything would need a disclaimer that it uses unsafe code, so that's a useless remark to make', but the basic issue still remains that the only clear boundary is whether a function 'contains' unsafe code, not whether a function 'calls' unsafe code.

If Rust did not have a mechanism to use external code then it would be fine because the only sources of unsafe code would be either the application itself or the standard library so you could just grep for 'unsafe' to find the boundaries.

cesarb1y ago

> Is there such a boundary? How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

Yes, there is a boundary, and usually it's either the function itself, or all methods of an object. For instance, a function I wrote recently goes somewhat like this:

  fn read_unaligned_u64_from_byte_slice(src: &[u8]) -> u64 {
    assert_eq!(src.len(), size_of::<u64>());
    unsafe { std::ptr::read_unaligned(src.as_ptr().cast::<u64>()) }
  }

The read_unaligned function (https://doc.rust-lang.org/std/ptr/fn.read_unaligned.html) has two preconditions which have to be checked manually. When doing so, you'll notice that the "src" argument must have at least 8 bytes for these preconditions to be met; the "assert_eq!()" call before that unsafe block ensures that (it will safely panic unless the "src" slice has exactly 8 bytes). That is, my "read_unaligned_u64_from_byte_slice" function is safe, even though it calls unsafe code; the function is the boundary between safe and unsafe code. No callers of that function have to worry that it calls unsafe code in its implementation.

steveklabnik1y ago

> How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

The point is that you don't need to. The guarantees compose.

> The usual retort to these questions is 'well, the standard library uses unsafe code

It's not about the standard library, it's much more fundamental than that: hardware is not memory safe to access.

> If Rust did not have a mechanism to use external code then it would be fine

This is what GC'd languages with runtimes do. And even they almost always include FFI, which lets you call into arbitrary code via the C ABI, allowing for unsafe things. Rust is a language intended to be used at the bottom of the stack, and so has more first-class support, calling it "unsafe" instead of FFI.

umanwizard1y ago

The point of rust isn’t to formally prove that there are no bugs. It’s just to make writing certain classes of bugs harder. That’s what people are missing when they point out that yes, it’s possible to circumvent safety mechanisms. It’s a strawman: bulletproof, guaranteed security simply isn’t a design goal of rust.

rat871y ago

My understanding is that the user who writes an unsafe block in a safe function is responsible for making sure that it doesn't do anything wrong to mess up the safety and that the function isn't lying about exposing a safe interface. I think at one point before rust 1.0 there was even a suggestion to rename it trustme. Of course users can easily mess up but the point is to minimize the use of unsafe so its easier to check and create interfaces that can be used safely

andrewchambers1y ago

It's more like letting a wet dog who you are watching closely quickly pass from your front door to the shower.

pdimitar1y ago

Where did you even get that weird extreme take from?

O_o

j / k navigate · click thread line to collapse

0 comments

timschmidt1y ago

It seems like you've got it backwards. Even unsafe rust is still more strict than C. Here's what the book has to say (https://doc.rust-lang.org/book/ch20-01-unsafe-rust.html)

"You can take five actions in unsafe Rust that you can’t in safe Rust, which we call unsafe superpowers. Those superpowers include the ability to:

    Dereference a raw pointer
    Call an unsafe function or method
    Access or modify a mutable static variable
    Implement an unsafe trait
    Access fields of a union

uecker1y ago

lambda1y ago

So, it's true that unsafe code can depend on preconditions that need to be upheld by safe code.

But using ordinary module encapsulation and private fields, you can scope the code that needs to uphold those preconditions to a particular module.

Ralf Jung has written a number of good papers and blog posts on this topic.

1 more reply

dwattttt1y ago

1 more reply

gf0001y ago

This is technically correct, but a bit pedantic.

The fact that a memory vulnerability error may either appear at no place at all OR at the couple hundred lines of code thorough the whole project is a night and day difference.

iknowstuff1y ago

No. Correctness of code outside unsafe depends on correctness inside those blocks, not the other way around

1 more reply

Someone1y ago

So, in theory, unsafe rust opens the floodgates. In practice, though, you can use small fragments of unsafe code that programmers can fairly easily check to be safe.

Then, once you’ve convinced yourself that those fragments are safe, you can be assured that your whole program is safe (using ‘safe’ in the rust sense, of course)

steveklabnik1y ago

> allows you

1 more reply

rybosome1y ago

umanwizard1y ago

Rust doesn’t have classes, nor can const values be modified, even in unsafe code. (did you mean “immutable”?)

onnimonni1y ago

Would someone with more experience be able to explain to me why can't these operations be "safe"? What is blocking rust from producing the same machine code in a "safe" way?

NobodyNada1y ago

Think of unsafe not as "this code is unsafe", but as "I've proven this code to be safe, and the borrow checker can rely on it to prove the safety of the rest of my program."

1 more reply

vlovich1231y ago

Those specific functions are compiler builtin vector intrinsics. The main reason is that they can easily read past ends of arrays and have type safety and aliasing issues.

1 more reply

adgjlsfhk11y ago

pclmulqdq1y ago

The way I have heard it described that I think is a bit more succinct is "unsafe admits undefined behavior as though it was safe."

janice19991y ago

Claiming unsafe invalidates "all of the nice memory-safety properties" is like saying having windows in your house does away with all the structural integrity of your walls.

There's even unsafe usage in the standard library and it's used a lot in embedded libraries.

benjiro1y ago

Sorry but horrible comparison ;)

If you need to rely on unsafe in a memory-safe language for performance reasons, then there is a issue with the language compiler at that point, that needs to be fixed. Simple as that.

unrealhoang1y ago

So static typing is stupid because at the end of the line your program must interface with stream of untyped bits (i/o)?

Once you can internalize that you could unlock the power of encapsulation.

pdimitar1y ago

> If you need to rely on unsafe in a memory-safe language for performance reasons, then there is a issue with the language compiler at that point, that needs to be fixed. Simple as that.

...And even if everything was written in Rust, `unsafe` would still be needed because the lower you get [to the kernel] you get more and more non-determinism at places.

This "all or nothing" attitude is boring and tiring. We all wish things were super simple, black and white, and all-or-nothing. They are not.

sunshowers1y ago

What language is the JVM written in?

pjmlp1y ago

Depends on which JVM you are talking about, some are 100% Java, some are a mix of Java and C, others are a mix of Java and C++, in all cases a bit of Assembly as well.

throwaway20371y ago

    > What language is the JVM written in?

I am pretty sure it is C++.

I like your second paragraph. It is well written.

pjmlp1y ago

Depends on which JVM you are talking about, some are 100% Java, some are a mix of Java and C, others are a mix of Java and C++, in all cases a bit of Assembly as well.

1 more reply

rat871y ago

I don't think what something was written in should count. Baring bugs it should still be memory safe. But I believe JVM has ffi and as soon as you use ffi you risk messing up that memory safety.

sunshowers1y ago

Does it help to think of "safe Rust" as a language that's written in "unsafe Rust"? That's basically what it is.

LoganDark1y ago

> Isn't it the case that once you use unsafe even a single time, you lose all of Rust's nice guarantees?

wongarsu1y ago

xboxnolifes1y ago

If you have 1 unsafe block, and you have a memory related crash/issue, where in your Rust code do you think the root cause is located?

This isn't a wet dog in a cleanroom. This is cleanroom complex that has a very small outhouse that is labeled as dangerous.

EnnEmmEss1y ago

[1]: https://www.youtube.com/watch?v=rTo2u13lVcQ

andyferris1y ago

Rust isn't the only memory-safe language.

johnisgood1y ago

Ada is even much more better at checking for correctness. It needs to be talked about more. "Safer than C" has been Ada, people did not know this before they jumped on the Rust bandwagon.

vlovich1231y ago

You only lose those guarantees if and only if the code within the unsafe block violates the rules of the Rust language.

SkiFire131y ago

To continue the analogy of the dog, you let the dog get wet (=you use unsafe), but you put a cleaning room (=the sound and safe API) before your sealed room (=the safe code world)

selfmodruntime1y ago

> Isn't it the case that once you use unsafe even a single time, you lose all of Rust's nice guarantees

Inside that block, both yes and no. You have to enforce those nice guarantees yourself. Code that violates it will still crash.

CooCooCaCha1y ago

I wouldn’t go that far. Bevy for example, uses unsafe internally but is VERY strict about it, and every use of unsafe requires a comment explaining why the code is safe.

In other words, unsafe works if you use it carefully and keep it contained.

tonyhart71y ago

right, the point is raising awareness and assumption its not 100 and 0 problem

timeon1y ago

> unsafe even a single time, you lose all of Rust's nice guarantees

Not sure why would one resulted in all. One of Rust's advantages is the clear boundary between safe/unsafe.

tmtvl1y ago

Is there such a boundary? How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

cesarb1y ago

> Is there such a boundary? How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

Yes, there is a boundary, and usually it's either the function itself, or all methods of an object. For instance, a function I wrote recently goes somewhat like this:

  fn read_unaligned_u64_from_byte_slice(src: &[u8]) -> u64 {
    assert_eq!(src.len(), size_of::<u64>());
    unsafe { std::ptr::read_unaligned(src.as_ptr().cast::<u64>()) }
  }

steveklabnik1y ago

> How do you know a function doesn't call unsafe code without looking at every function called in it, and every function those functions call, and so on?

The point is that you don't need to. The guarantees compose.

> The usual retort to these questions is 'well, the standard library uses unsafe code

It's not about the standard library, it's much more fundamental than that: hardware is not memory safe to access.

> If Rust did not have a mechanism to use external code then it would be fine

umanwizard1y ago

rat871y ago

andrewchambers1y ago

It's more like letting a wet dog who you are watching closely quickly pass from your front door to the shower.

pdimitar1y ago

Where did you even get that weird extreme take from?

O_o

j / k navigate · click thread line to collapse