Note the emphasis on "default" above; you can use the Linux sandboxing features such as seccomp-bpf to build a sandbox which is truly memory-safe, closing this hole. The OS is in charge of the features it exposes, and Rust can't do much about that.
Note also that the existence of totally_safe_transmute doesn't mean that Rust's memory safety features are pointless. Empirically, memory-safe programming languages result in far fewer memory safety vulnerabilities, because they make exploitation way harder.
And even then, it doesn't help with whole-system security when every program in every other language on your system, including safe ones like Java and Python, has the same capability. (Although if I'm wrong and there is precedent for languages attempting to block this, I'd love to see the prior art.)
That's the easy part, you'd provide some OS-specific API for getting this file handle, call it `std::os::fs::proc()` or thereabouts, and make it an `unsafe` function. The "hard to do" is the bigger problem, because if you're providing an unsafe alternative then you'd like to plug all the holes in the safe interface, which AFAIK is non-trivial.
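For anyone who hasn't seen the underlying trick: the hole being discussed is that on Linux, /proc/self/mem exposes a process's own memory as an ordinary file, so plain safe file I/O becomes arbitrary memory access. A minimal read-only sketch (function name is mine, Linux-only, no `unsafe` anywhere):

```rust
// Read our own memory through ordinary safe file I/O. Linux-only.
// No `unsafe` keyword appears anywhere in this program.
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

fn read_via_proc_mem(target: &u32) -> io::Result<u32> {
    // Taking an address and casting a reference to a raw pointer is safe;
    // only *dereferencing* the raw pointer would require `unsafe`.
    let addr = target as *const u32 as u64;
    let mut mem = File::open("/proc/self/mem")?;
    mem.seek(SeekFrom::Start(addr))?;
    let mut buf = [0u8; 4];
    mem.read_exact(&mut buf)?;
    Ok(u32::from_ne_bytes(buf))
}

fn main() -> io::Result<()> {
    let secret: u32 = 0xDEAD_BEEF;
    let observed = read_via_proc_mem(&secret)?;
    assert_eq!(observed, secret);
    println!("read {observed:#x} straight out of our own address space");
    Ok(())
}
```

Opening the file for writing instead gives you the full totally_safe_transmute: writes to /proc/self/mem bypass the type system entirely, which is exactly the hole no safe std interface can plug.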
There’s an updated version with Windows support and better performance: https://github.com/John2143/totally-speedy-transmute/
What worries me is this macro, which “smuggles” the unsafe keyword past the forbid(unsafe_code) attribute: https://github.com/John2143/totally-speedy-transmute/blob/ma...
In my mind, this kind of capability makes Rust crate safety scanning and associated metadata worthless as currently implemented.
Package management tools ought to store code instead of binaries, and perform safety checks via instrumented compilers.
If you wanted to backdoor a Rust program, you wouldn't need the `unsafe` keyword at all. And if you want to use unsafe code, that's fine, plenty of crates use unsafe code without anyone being up in arms about it (e.g. the regex crate). This is a party trick rather than something to be concerned about; at the end of the day either you're auditing your dependencies (in which case this would stick out like a sore thumb) or you're not (in which case there are far easier ways to pwn you).
The totally safe transmute happens at runtime, so an instrumented compiler cannot detect it (the halting problem is in the way). You'd need runtime instrumentation of your binary. And even then, it's wildly impractical.
If you let an application interact with the environment, tomorrow Linux or Windows could add a new magic file, or a special COM call, or whatever else it is that creates unsafety. Rust can't maintain a complete list of all the unsafe things that are outside of its control.
What you probably want is a runtime VM, like WASM.
The linked updated library uses a different method: it literally smuggles the "unsafe" keyword past the safety checks by removing the space character from "un safe".
This can and should be caught by the compiler -- it has full access to the syntax tree at every intermediate stage of compilation! Instead, the Cargo tool and the rustc compiler are simply keyword-searching for the literal string "unsafe", and are hence easily fooled.
Note that this updated method is not the same thing as the Linux process memory mapping and doesn't rely on OS APIs in any way. It is a purely compile-time hack, not a runtime one.
What I'd love to see is a healthy ecosystem of open-source packages that are truly safe: using a safe language, without any escape hatches, etc.
E.g.: Libraries for decoding new image formats like JPEG XL ought to be 100% safe, but they're not! This hampers adoption, because browsers won't import huge chunks of potentially dangerous code.
Now we can't have HDR photos on the web because of this specific issue: The Chromium team removed libjxl from the code base because of legitimate security concerns. A million other things are also going to be "unsupported" for a long time because of the perceived (and real!) dangers of importing third-party dependencies.
We'll be stuck with JPG, GIF, and PNG forever because there is no easy way to share code that is truly safe in the "pure function with no side-effects" sense.
PS: This type of issue is also the root cause of vulnerabilities like Log4j and the various XML decoder security problems. By default, most languages allow arbitrary code even in libraries that ought to perform "pure transformations" of one data format into another, such as reaching out to LDAP servers over TCP/IP sockets, as in the case of Log4j. Similarly, XML decoders by default use HTTP to retrieve referenced files, which is madness. Even "modern" formats like YAML make this mistake, with external file references on by default!
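To make the XML case concrete, this is the classic external entity (XXE) pattern: a parser that resolves external entities by default will read local files, or fetch URLs, as a side effect of "just decoding" a document like this:

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!-- A decoder with default entity resolution will inline the file's
       contents here; a hardened one refuses external entities entirely. -->
  <!ENTITY leak SYSTEM "file:///etc/passwd">
]>
<note>&leak;</note>
```

A "pure function" decoder in the sense described above simply couldn't do this: it would have no capability to open files or sockets at all.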
> What you probably want is a runtime VM, like WASM.
Sure, that's one way to sandbox applications, but in principle it ought to be entirely possible to have a totally safe ahead-of-time compiled language and/or standard library.
Rust is really close to this goal, but falls just short because of tricks like this macro backdoor. (It would also need a safe subset of the standard library.)
Every day that goes by is a day I think we should make a beeline for CHERI, even though we have "safe" languages.
Casts are conversions: a new value is produced based on an existing one.
Reinterpretation requires a value to be in memory, and to be accessed using an lvalue of a different type. Most situations of this kind are undefined behavior.
    #include <string.h>

    double transmute_int_to_double(int x) {
        double result = 0;
        memcpy(&result, &x, sizeof(int) < sizeof(double) ? sizeof(int) : sizeof(double));
        return result;
    }
is not UB, and it doesn't even necessarily touch any memory [0]:

    transmute_int_to_double:
        movd    xmm0, edi
        ret

[0] https://godbolt.org/z/czM3eh8er

Not by my interpretation of n3096 (April 2023 ISO C draft).
> doesn't even necessarily touch any memory
The abstract semantics calls for memory being touched. Data flows that go through memory in the abstract semantics can be optimized not to go through memory. UB can do anything at all.
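Footnote for the Rust side of the thread: Rust ships both operations as safe standard-library APIs, so the cast-vs-reinterpretation distinction can be shown without `unsafe` or `memcpy` (a small illustrative sketch of mine, not from the thread):

```rust
fn main() {
    let x: i32 = 1;

    // Cast: a conversion; a new value with a different bit pattern.
    let converted = x as f64; // 1.0

    // Reinterpretation: the same bit pattern viewed as another type.
    // Rust exposes int<->float punning safely via from_bits/to_bits;
    // the C pointer-punning equivalent `*(double *)&x` is UB.
    let reinterpreted = f64::from_bits(x as u64);

    assert_eq!(converted, 1.0);
    assert!(reinterpreted < 1e-300); // a tiny subnormal, nothing like 1.0
    println!("cast: {converted}, reinterpret: {reinterpreted:e}");
}
```

Like the C `memcpy` idiom above, `from_bits` compiles down to a single register move on x86-64, with no memory traffic in practice.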