0 points · Capricorn2481 · 7mo ago · 0 comments

> I'd probably hold a reference to the vector or something

In a way that only changes the unsafe block?

> make_slice's responsibility to guarantee the underlying allocation stays alive, and it didn't do that

We can talk about the simpler code example. How can my unsafe block make sure an allocation stays alive past the unsafe block itself?

> If you're putting responsibility on code in the same function, you're not really treating it as safe code. There's an unsafe{} around the entire function body that you should have written if you were going for maximum clarity

I'm not talking about what should've been done, I'm saying this is something to be aware of on projects with multiple people who may not catch something like this. People think bugs cannot happen in safe code. I just showed it happening. The bug was not in unsafe, or in the drop, it was in println.

4 comments · 1 top-level

Dylan16807 · 7mo ago · 3 in thread

> How can my unsafe block make sure an allocation stays alive past the unsafe block itself?

Put it in another data structure or something?

Listen, if you can't make the invariant work, then you need to change the function. An unfixable unsafe is not an excuse to allow errors
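One way to read "put it in another data structure" is a wrapper that owns the buffer, so the slice can never outlive the allocation. This is only a minimal sketch: `OwnedBytes` and `sum_bytes` are hypothetical names, not code from the thread, and the `unsafe` call is kept purely to show where the invariant lives.

```rust
/// Hypothetical wrapper: it owns the buffer, so the allocation cannot be
/// freed while a slice borrowed from the wrapper is still in use.
pub struct OwnedBytes {
    buf: Vec<u8>, // private: callers cannot drop or reallocate it separately
}

impl OwnedBytes {
    pub fn new(buf: Vec<u8>) -> Self {
        OwnedBytes { buf }
    }

    /// The returned slice borrows `self`, so the borrow checker refuses to
    /// let `self` (and the Vec inside it) be dropped while the slice lives.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `buf.as_ptr()` is valid for `buf.len()` initialized bytes,
        // and the elided lifetime ties the result to `&self`.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr(), self.buf.len()) }
    }
}

/// Usage sketch: safe code can only reach the bytes through the wrapper.
pub fn sum_bytes(owned: &OwnedBytes) -> u32 {
    owned.as_slice().iter().map(|&b| u32::from(b)).sum()
}
```

With this shape, safe code that tries to drop the wrapper and then read the slice does not compile, so the unsafe block no longer depends on callers being careful.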

> I'm saying this is something to be aware of on projects with multiple people who may not catch something like this.

It's good to be aware but put the blame in the right place. It's the unsafe code that's actually at fault. If you are seeing corruption, look at the unsafe code first with an adversarial mindset.

> The bug was not in unsafe, or in the drop, it was in println.

N. O.

Any unsafe code that outsources invariant enforcement that affects memory safety to safe code has bugs. There's wiggle room on "outsources" but that's the only wiggle room.

Capricorn2481 (OP) · 7mo ago

> It's good to be aware but put the blame in the right place. It's the unsafe code that's actually at fault. If you are seeing corruption, look at the unsafe code first with an adversarial mindset.

Yeah, that's why my first comment in this thread was "Just having unsafe in your codebase means changing code outside the unsafe block could cause UB". You can caveat that with "that's bad code", but that's the reality.

> An unfixable unsafe is not an excuse to allow errors

Nobody is saying that, you're just moving goalposts now. Nobody is saying you should allow errors, I'm telling you that everyone, including you, will inevitably have to change the code around the unsafe block, because that's what has to enforce memory safety at the end of the day.

You said you don't need to look at the safe code, I'm asking how would you fix the unsafe, and you haven't. That's fine, and I'm not faulting the language for it.

>> The bug was not in unsafe, or in the drop, it was in println.

> N. O.

Of course it is. That line becomes instructions that access memory freed by the previous line, drop(vec). That's called a dangling pointer. You remove println and the slice is now dropped immediately. You remove drop and the println will work. The vector is not "corrupted" by unsafe. That's not how computers work. We just lose guarantees from Rust when using unsafe, including in safe code. Doesn't mean there's a bug. That's the whole point of unsafe, is to be trusted by the compiler.

> Any unsafe code that outsources invariant enforcement that affects memory safety to safe code has bugs.

All useful unsafe code outsources invariants to safe code. If we could verify the integrity of the memory, we wouldn't be using unsafe.

Now you have been interchangeably using unsafe to mean the literal blocks and the surrounding code. But here is my point: If you are saying that "unsafe code" means "just the unsafe blocks," then yes, unsafe fundamentally relies on safe code to do the right thing.

But if "unsafe code" means "everything that must uphold the invariant," then unsafe can span your entire codebase. Which is true. Just the presence of unsafe in your code base means you're looking at UB anywhere in the call stack if people don't pay attention. And that's been my whole point of this thread. The presence of unsafe means everyone now has to pay attention not just to the unsafe block, but all safe code interacting with that data, especially in multi-threaded scenarios.

vacuity · 7mo ago

As I said here[0], although I can't speak for what @Dylan16807 intends, invariants required by unsafe code are required exactly to the extent that some code can alter the invariants (the module boundary). In this sense, Rust's unsafe is just a particular example of encapsulation, where all notions of invariants in programming have the same essence.

[0] https://news.ycombinator.com/item?id=46030407
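The module-boundary point can be illustrated with a small sketch. The `cursor` module and `Cursor` type are hypothetical, made up for this example: the invariant `pos < items.len()` can only be altered by code inside the module, because the fields are private.

```rust
mod cursor {
    /// Invariant: `pos < items.len()`. The fields are private, so only code
    /// inside this module can change them.
    pub struct Cursor {
        items: Vec<i32>,
        pos: usize,
    }

    impl Cursor {
        /// Rejects an empty vector, so the invariant holds from construction.
        pub fn new(items: Vec<i32>) -> Option<Cursor> {
            if items.is_empty() {
                None
            } else {
                Some(Cursor { items, pos: 0 })
            }
        }

        /// Moves forward, wrapping around; keeps `pos < items.len()`.
        pub fn advance(&mut self) {
            self.pos = (self.pos + 1) % self.items.len();
        }

        pub fn current(&self) -> i32 {
            // SAFETY: every constructor and method in this module
            // preserves `pos < items.len()`.
            unsafe { *self.items.get_unchecked(self.pos) }
        }
    }
}
```

Safe code outside `cursor` can call `new`, `advance`, and `current` in any order without reaching an out-of-bounds read; an audit of the `unsafe` only has to cover the module itself.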

Dylan16807 · 7mo ago

> Yeah, that's why my first comment in this thread was "Just having unsafe in your codebase means changing code outside the unsafe block could cause UB". You can caveat that with "that's bad code", but that's the reality.

I agreed with you that changing safe code could trigger the bug, but the safe code is not where the bug is.

> Nobody is saying that, you're just moving goalposts now. Nobody is saying you should allow errors, I'm telling you that everyone, including you, will inevitably have to change the code around the unsafe block, because that's what has to enforce memory safety at the end of the day.

When fixing a vulnerable unsafe block, you might have to redesign the unsafe API, and that might require changing some safe code.

Once you decide on an API, you will not have to change safe code. The unsafe code handles all of the enforcement. If safe code is enforcing anything, you broke the rules of unsafe.
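As a rough sketch of an API where the unsafe code does the enforcement: the version of `make_slice` below is hypothetical (the thread's original is not shown here) and ties the returned slice to the borrow of the vector, so the drop-then-print sequence is rejected at compile time. `sum_then_drop` is likewise an invented name.

```rust
/// Hypothetical `make_slice`: the explicit lifetime `'a` ties the returned
/// slice to the borrow of `v`, so the slice cannot outlive the Vec.
pub fn make_slice<'a>(v: &'a Vec<i32>) -> &'a [i32] {
    // SAFETY: the pointer and length come from a live Vec, and the result
    // cannot outlive the `&'a` borrow that keeps that Vec alive.
    unsafe { std::slice::from_raw_parts(v.as_ptr(), v.len()) }
}

pub fn sum_then_drop() -> i32 {
    let vec = vec![1, 2, 3];
    let slice = make_slice(&vec);
    // drop(vec); // if uncommented, the borrow checker rejects this line,
    //            // because `slice` still borrows `vec` and is used below
    let total: i32 = slice.iter().sum();
    drop(vec); // fine here: `slice` is not used after this point
    total
}
```

Once the signature carries the lifetime, no ordering of safe statements in the caller can produce a dangling slice.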

> You said you don't need to look at the safe code, I'm asking how would you fix the unsafe, and you haven't. That's fine, and I'm not faulting the language for it.

I'm not an expert at Rust. I gave a couple of prose suggestions but they require redesigning the way the safe and unsafe code talk to each other. Because your original design is inherently flawed. The unsafe code cannot protect itself, so it must not be used this way. You're saying we should make the safe code protect the unsafe code, and that is not right. Unsafe code needs to protect itself.

> Of course it is. That line becomes instructions that access memory freed by the previous line, drop(vec). That's called a dangling pointer. You remove println and the slice is now dropped immediately. You remove drop and the println will work. The vector is not "corrupted" by unsafe. That's not how computers work. We just lose guarantees from Rust when using unsafe, including in safe code. Doesn't mean there's a bug. That's the whole point of unsafe, is to be trusted by the compiler.

"unsafe" means "trust me compiler, I verified this myself"

Losing the guarantee is a bug. By writing the unsafe block, you told the compiler it didn't need to prevent a dangling pointer because you would prevent it yourself, and then you didn't prevent it.

If you didn't tell the compiler to trust you, the part that wouldn't have compiled is the unsafe block. You tricked it into compiling that block, so that block is where the bug is.

> All useful unsafe code outsources invariants to safe code.

That's extremely untrue. Lots of data structures protect all their invariants in their unsafe code.

> If we could verify the integrity of the memory, we wouldn't be using unsafe.

"we" are smarter than the compiler. Unsafe is for things "we" can verify but the compiler cannot. You're not supposed to use it for unverified stuff.

> Now you have been interchangeably using unsafe to mean the literal blocks and the surrounding code. But here is my point: If you are saying that "unsafe code" means "just the unsafe blocks," then yes, unsafe fundamentally relies on safe code to do the right thing.

If you're doing things 100% properly, you will expand your unsafe blocks to include everything that verifies and upholds invariants. But even after that expansion, it's still going to be a tiny fraction of your codebase.
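When a function cannot uphold its invariant on its own, one conventional option in Rust is to mark it `unsafe fn` and document the contract, which forces callers to write their own `unsafe` block around the region where the invariant must hold. A minimal sketch, with `slice_from_parts` and `first_then_drop` as hypothetical names:

```rust
/// # Safety
/// The caller must keep the allocation behind `ptr` alive and unmodified
/// for as long as the returned slice is used, and `ptr` must be valid for
/// `len` initialized `i32` values.
pub unsafe fn slice_from_parts<'a>(ptr: *const i32, len: usize) -> &'a [i32] {
    // SAFETY: forwarded to the caller through the contract above.
    unsafe { std::slice::from_raw_parts(ptr, len) }
}

pub fn first_then_drop() -> i32 {
    let vec = vec![1, 2, 3];
    // The unsafe block spans everything that relies on `vec` staying alive.
    let first = unsafe {
        let slice = slice_from_parts(vec.as_ptr(), vec.len());
        slice[0]
    };
    drop(vec); // the slice is already out of scope here
    first
}
```

The obligation is still on a human, but it is now visible at every call site instead of hiding inside a function that looks safe.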

> But if "unsafe code" means "everything that must uphold the invariant," then unsafe can span your entire codebase.

Not if your design is competent.

> Just the presence of unsafe in your code base means you're looking at UB anywhere in the call stack if people don't pay attention. And that's been my whole point of this thread.

"if people don't pay attention" is a huge factor here. If your unsafe code is wrong then that makes it hard to write safe code. But if you go fix the unsafe code then you stop needing to worry about safe code triggering a memory error.

> The presence of unsafe means everyone now has to pay attention not just to the unsafe block, but all safe code interacting with that data, especially in multi-threaded scenarios.

If you did things correctly, any safe function can be ignored for memory safety. Unsafe blocks are supposed to assume that the safe code calling them is actively malicious, and make themselves impossible to misuse.
