Secure Rust Guidelines (opens in new tab)

(anssi-fr.github.io)

171 pointsxmmrm6y ago58 comments

58 comments

40 comments · 5 top-level

CJefferson6y ago· 21 in thread

I personally strongly disagree with:

Functions or instructions that can cause the code to panic at runtime must not be used.

First of all, they kind of dodge this later by saying "Array indexing must be properly tested" (else it can panic) -- everyone thinks they write code which is "properly tested".

Personally, I often write panicing code -- if the code gets in a state where I have no idea how to fix it, I panic. For some code it is important it does quit, but I'd much prefer more code paniced than tried to carry on and ended up creating other problems.

Also, in rust if we don't want to panic we need to never use array indexing and never use integer division, just to start.

Of course, it's fine to write code which plans to never panic, but I think it would be better to say "Have a formal 'panic plan'", where either one panics early, or tries to never panic.

mbrubeck6y ago

Panicking is, in many ways, the best-case scenario for code that contains a bug. A bug that causes a panic is much easier to find through fuzzing, property testing, even static analysis or analyzing failures in production. It also prevents the bug from "infecting" other code by allowing the program to proceed in an invalid state.

If you don't want a panic to take down the whole system, you can isolate the code in a thread and use a supervision tree, or use `catch_unwind` to let the thread perform cleanup and then continue from a known state.

Using `slice.get()` and returning `Option` or `Result` for failures that should never happen in correct code is not an improvement. It leads to the same unwinding behavior, as the failure gets passed up the call stack, but with more manually-written code, and more error-handling paths that are hard or impossible to test (because if the program is correct then they are unreachable). It infects calling functions and changes public APIs.

Clearly, the best practice is "don't write code with bugs." If your code is bug-free, then it is also panic-free. But we don't really have the tools and techniques to do that all the time in general. The second-best option may be to write code that panics when there is a bug. (People saying "write code that can't panic" are often really just saying "don't write bugs.")

(See also "crash-only software," from the Erlang school of reliability engineering.)

cesarb6y ago

> If you don't want a panic to take down the whole system, you can isolate the code in a thread and use a supervision tree, or use `catch_unwind` to let the thread perform cleanup and then continue from a known state.

Playing devil's advocate: with unwinding panics (which are necessary for these two approaches), it's harder to make sure all the data structures the thread was using are left in a coherent state. It's not as bad as exception safety in C++, but it does have some similarities. Just take a look at the tricks the Rust standard library uses to keep everything sane even if the stack unwinds (structs implementing Drop normally called SomethingOnDrop, for instance CopyOnDrop), or std::sync::Mutex poisoning.

mbrubeck6y ago

Agreed. There are definitely good arguments for using abort-on-panic, and doing isolation and recovery at the process level rather than the thread level. This is what we do with all the Rust code in Firefox, for example.

Unwind safety is a real issue. I have some personal experience with fixing panic-safety issues in unsafe Rust code:

https://github.com/servo/rust-smallvec/pull/103

I wrote a bit more about it here:

https://users.rust-lang.org/t/c-pitfalls-hard-to-avoid-that-...

1 more reply

baby6y ago

> much easier to find through fuzzing

Note that you can fuzz with debug asserts enabled.

Crash-only software is a fantastic paper that teaches another school of thought: always panic gracefully and be quick to recover. This assumes that you have a quick recovery plan for panics, if you panic, which is then fine (but not always the case).

eeZah7Ux6y ago

> Panicking is, in many ways, the best-case scenario for code that contains a bug.

The very opposite: it's among the worst behaviors, short of creating a vulnerability.

From any modern languages we should expect an exception to be raised.

mbrubeck6y ago

"Panic" is essentially the Rust name for exceptions. They are even implemented the same as C++ exceptions in terms of codegen. The real differences are cultural rather than technical, as I touch on in this thread:

https://users.rust-lang.org/t/c-pitfalls-hard-to-avoid-that-...

1 more reply

masklinn6y ago

> First of all, they kind of dodge this later by saying "Array indexing must be properly tested" (else it can panic) -- everyone thinks they write code which is "properly tested".

You're skipping half the recommendation though:

> Array indexing must be properly tested, or the get method should be used to return an Option.

emphasis mine

> Also, in rust if we don't want to panic we need to never use array indexing and never use integer division, just to start.

I mean, that's literally the block above the one you quote:

> Common patterns that can cause panics are:

> * using unwrap or expect,

> * using assert,

> * an unchecked access to an array,

> * integer overflow (in debug mode),

> * division by zero,

> * large allocations,

> * string formatting using format!.

>> Rule LANG-NOPANIC

>> Functions or instructions that can cause the code to panic at runtime must not be used.

The bit you quote is really a additional reminder that array indexing is not panic-safe.

CJefferson6y ago

Saying "properly tested" doesn't seem useful to me. What is "properly tested"? I'm sure people thing most array overflows in C libraries are "properly tested", as noone wants to cause memory corruption.

masklinn6y ago

> Saying "properly tested" doesn't seem useful to me. I'm sure people thing most array overflows in C libraries are "properly tested"

It is useful in the sense that it's for a restricted set of constructs, in the same sense that `unsafe` blocks are not outright forbidden but they should be justified and because they're restricted in span they can be more easily tested than your entire C codebase.

Since array indexing would be recommended against by default (and either iterators or `get` would be the normal way to handle it), the number of places where it is used should be small and thus easy to check for, and test extensively if not exhaustively.

heavenlyblue6y ago

What do you do when your algorithm works with array indexing and you are never supposed to handle than None within the Option?

Ie if you have indexed array out of bounds that’s a developer’s mistake and not a “condition to be handled up the stack”.

It’s ok to panic if you agree that it’s better to panic rather than end up in a state that will probably cause even more issues down the line

masklinn6y ago

> What do you do when your algorithm works with array indexing and you are never supposed to handle than None within the Option?

That's where the first clause comes into play:

> Array indexing must be properly tested

Although

> Ie if you have indexed array out of bounds that’s a developer’s mistake

Which developer would be my question.

If it's the person who reified the "algorithm [which] works with array indexing" then fair's fair, it's a bug in the implementation and should not happen, which is why the guideline specifically says:

> should never use functions or instructions that can fail and cause the code to panic

which, assuming the word is used in the RFC 2119 sense means there can be case where calling panic-ing functions is the right thing to do.

If it's the developer who calls the function implementing the algorithm, then either algorithm is failable (and thus so's the function) or the algorithm should take in more specific data structures which statically exclude failing cases. For instance `max` on an empty collection will fail, so either `max` should be failable (which e.g. `Iterator::max` is) or it should only be callable on some sort of `NonEmpty*`.

SkyMarshal6y ago

You're skipping half the recommendation though:

> Array indexing must be properly tested, or the get method should be used to return an Option.

Fwiw, for anyone who’s dabbled with Haskell this is SOP. Any result that could be undefined is returned as a Maybe (Haskell’s version of Option). If this seems odd to anyone, understanding it in Haskell will probably help understand it better in Rust too:

http://learnyouahaskell.com/a-fistful-of-monads#getting-our-...

https://en.wikibooks.org/wiki/Haskell/Understanding_monads/M...

3 more replies

baby6y ago

There’s a number of functions that panic in the std if misused, for example copy_from_slice

masklinn6y ago

Yes?

Paul-ish6y ago

I agree wholeheartedly. If we tell people "Don't write code that panics" then we collectively start writing code that assume panics don't happen. Suddenly panics become a footgun in Rust, and it isn't the safe language we want it to be.

If I'm feeling in a mood, I think we should panic all the time. We should have a "panic monkey" tool that places panics at random parts of the code, to see how panics are handled. Eg to see if they interact poorly with unsafe code.

dathinab6y ago

I can only agree diesel had (maybe still has) a serve bug around connection reuse and panics.

Panics in rust originally where a pretty nice way to have a error kennel pattern and as long as you always kill the whole error kennel in which the panic occurred and don't share mutable state between error kennels this fully save from any inconsistent state.

Sure panics across FFI boundaries can't be done nice at all, but this can still be handled by making sure there is no FFI boundary insider of an error kernel, just in between.

So I would push panic usage for error kennels, and just that. But also the unwind safe traits need an refresh: Enforce them (by linting against inconsistencies) and better defaults for what does and doesn't implement them.

For example a type which is sync but doesn't implement the referential unwind safe trait is broken/bad in 99% of cases. (Either it should implement unwind safe market or not be send). The remaining 1% are edge cases between the usage of catch unwind and thread local storage.

a13692099936y ago

You don't need unwind safety (for anything, ever); a error kennel is more commonly known as a "process"; when a panic occurs

> you always kill the whole [process] in which the panic occurred and don't share mutable state between [process]s

Processes are designed for this, including making it very hard to accidentaly share mutable state (using eg mmap(MAP_SHARED) explicitly).

1 more reply

baby6y ago

No，if you are writing secure code then a panic is a denial of service. Their point is that you should never panic, unless you do reach an unrecoverable error.

Paul-ish6y ago

It seems like, in the absence of panics, there is a dichotomy between "You should know everything that can happen to your program and account for it." and "If you hit some unknown state, carry on." The first one is impossible, the second sounds dangerous. When you get into this dichotomy, panicking seems like a reasonable third option. We should strive for the first, but avoid the second. Nobody wants their code to panic, but we need to be able to deal with unexpected states.

baby6y ago

Yeah that's what is lacking from my comment: expect panics to happen and attempt to gracefully recover whenever that happens. This is not easy though.

jerf6y ago

(This is neither direct disagreement nor agreement, but a sort of elaboration of my own.)

The practical semantics of a panic become increasingly ill-defined as the number of threads in the program and the depth of their stacks increase. I emphasize "practical" because mathematically, no matter how many threads you have or how deep you get, the semantics are perfectly well defined; it's going to do whatever it's going to do. But those mathematical semantics get more and more complicated as the program grows. If you call X() and it panics, its behavior depends on its stack above. If that X() is called in a "server" thread and it ends up terminating it (or restarting it, or anything else), it may impact any number of other threads in weird ways, even if you "handle" the panic in some manner, e.g., even if you "clean up the locks" on the way up the panic handler that still doesn't mean the program's in a "clean state".

When a programmer goes to write the aforementioned X(), it's impossible in general for that programmer to know what their "panic environment" is going to be. (In specific, it may be an unexposed/private method that does have a good idea what environment it will be called in, but in general a function does not know.) Working via panic handlers shares the responsibility between the caller and the callee in a difficult-to-describe manner. At small scales this is completely neglectable, but as you scale up, mismatches between the programmer of X() and the user of X() very gradually start accumulating, and interacting with the other mismatches, and at some point you get to the point where some junior programmer doesn't fully understand the complicated maze and just starts bashing on code until it superficially seems to work, and then you're in real trouble.

By contrast, the contract with error-type returns like Result<> or Either or whatever has a clean contract between the caller and the callee; the callee is responsible for reporting failure, and the caller is responsible for dealing with it. This kind of contract does not compose up into a complicated "panic environment", because it does not cross the stack frames like a panic/exception can. The frames don't mix with each other.

In theory it may be possible to completely exclude exceptions; in practice right now we don't really know how to do it 100% in practical languages, so regardless of what I wrote and regardless of what you may like, you will have some sort of "panic environment". However, I think you are in general better off to try to avoid complicating it as much as possible, because it gets exponentially complicated. I do mean exponentially as in 2^x and not x^2 as it is commonly abused. But the x is fairly small, which makes it easy to fail to notice. Also if you know your program is going to stay reasonably small it means it's a valid choice to just not care and use it anyhow. But if you have any reason to think your program may scale, I'd suggest trying very hard to stick to error-type handling as much as possible, as you only pay that exponential term as you add things into your panic environment.

(Also, I know that "using errors" doesn't fix the problems I mentioned, as I myself have had plenty of days where my "error based" program terminated a critical internal server and made the program "go all weird". I'm just saying it makes it better to not add the complications of a complicated panic-handling environment in addition to the already-unavoidable other tasks you have to do.)

gardaani6y ago· 12 in thread

I'm having problems fulfilling this requirement in my libs: "Crates providing libraries should never use functions or instructions that can fail and cause the code to panic."

The Rust standard library Vec, HashMap etc. can cause a panic in Rust, if the device (such as a mobile phone with a small memory) runs out of memory.

C and C++ standard libraries (malloc, std::vector, std::map..) can handle those situations by returning null or throwing an exception.

I wish Rust had some easy way to recover from out-of-memory situations when using the standard library. I have been considering writing my own out-of-memory safe Vec, HashMap etc, but it can't be the right way to do it..

_0w8t6y ago

https://www.youtube.com/watch?v=ARYP83yNAWk mentioned that the notion that C++ could handle out-of-memory was a purely theoretical wishful thinking. In practice it could not.

masklinn6y ago

Yes that is one of the primary failures of Rust at the moment: to my knowledge it currently has no good way to safely manage allocation failures (it also has serious issues with stack overflows).

This is an issue with all heap-allocating construct, not just collections but also Box or Rc.

> I wish Rust had some easy way to recover from out-of-memory situations when using the standard library. I have been considering writing my own out-of-memory safe Vec, HashMap etc, but it can't be the right way to do it..

Maybe look at the embedded space there? There might be no_std third-party libraries which handle these issues. Possibly on top of alloc as the (unstable[0] and obviously unsafe) `Alloc` trait does have a concept of allocation failure.

[0] https://github.com/rust-lang/rust/issues/32838

steveklabnik6y ago

> Yes that is one of the primary failures of Rust at the moment: to my knowledge it currently has no good way to safely manage allocation failures

So, sort of yes and sort of no.

The data structures that allocate in the standard library do not let you handle allocation failure. However, if you write your own, the global allocator lets you determine if failure happened, and then you can do whatever you want with it.

rectang6y ago

I think Rust got it right, here.

Dealing with allocation failure gracefully is hard and requires a lot of extra code.

For most applications the best default is to panic and handle it up the stack rather than pay the programming overhead of handling allocation failure explicitly in every last nook and cranny. The Rust standard library rightly optimizes for this use case.

For the embedded or critical-safety application spaces, where you really do want to handle allocation failure gracefully, you need something other than the standard library. Letting that "something" develop slowly out in the community is a good call.

cesarb6y ago

The problem with allocation failure is that it's non-local. Suppose thread A does a huge allocation of several hundred megabytes. The best case scenario is that this allocation fails cleanly; the worst case scenario is that this allocation succeeds but causes a small 1024-byte allocation in an unrelated thread B to fail (and it doesn't have to be multiple threads, this can happen even within a single thread). I don't know of any solution for this problem, other than statically reserving the memory which will be used for each part of the system.

a13692099936y ago

The horrible part is that this isn't even limited to threads; a competely separate process (even, eg, a browser JS engine running 'completely' untrusted remote code) can cause your program to hit a allocation failure, and then not have enough memory to preform nontrivial error-handling.

gardaani6y ago

Actually, I have found FallibleVec written by Mozilla [1] but I haven't found anything for other containers, yet.

[1] https://github.com/mozilla/mp4parse_fallible

baby6y ago

Honestly all C programs that fail to allocate panic as well. What else can you do?

1 more reply

Cyph0n6y ago

Oh, I never even considered this... But how would Rust ever be able to signal allocation failures in such cases?

steveklabnik6y ago

Same as any other failure; functions that may try to allocate memory and fail will return a Result<T, E> instead of just a T.

swiftcoder6y ago

Which would be pretty painful for everyone not working in a memory-constrained environment (at a guess, that's most developers). Much like std:println! panicking instead of returning an error, ergonomics are important too a lot of the time.

2 more replies

Sharlin6y ago

Eg. with Vec, every operation that may trigger reallocation should return Result.

imjasonmiller6y ago· 2 in thread

The context that I was missing at first:

> The Agence nationale de la sécurité des systèmes d'information (ANSSI; English: National Cybersecurity Agency of France) is a French service created on 7 July 2009 with responsibility for computer security [1].

I didn't know about "cargo-outdated". I liked having "npm outdated" within the JavaScript ecosystem, so I'll give this a try.

1. https://en.wikipedia.org/wiki/Agence_nationale_de_la_s%C3%A9...

jamwaffles6y ago

`cargo-outdated` is mint. If you haven't already, consider installing `cargo-edit` too! It adds `cargo add`/`cargo rm` to add/remove crates, as well as `cargo upgrade` to bump semver versions.

https://github.com/killercup/cargo-edit

masklinn6y ago

There's also cargo-crev which is pretty neat.

maeln6y ago

This is published by the ANSSI[1], National Cybersecurity Agency of France. They do quite a good job at publishing security guide.

[1] https://www.ssi.gouv.fr/en/

dathinab6y ago

I found at least one problematic section when scanning:

> The environment variables RUSTC, RUSTC_WRAPPER and RUSTFLAGS must not be overriden when using Cargo to build project.

This is simply not true at all. Mainly build cache systems like sccache work by wrapping rustc, there is no reason why using a build cache should be forbidden.

(Through currently rust support of sccache is still not perfect and there are some limitations, doesn't change that the rules are to broad.)

I also wouldn't be surprised if there are some rustc flags not exposed in cargo profiles which allow trigger some security mechanisms in llvm which are not enabled by default but beneficial for your project.

Like always the important think is that you understand what you do and want implications it had.

j / k navigate · click thread line to collapse

58 comments

40 comments · 5 top-level

CJefferson6y ago· 21 in thread

I personally strongly disagree with:

Functions or instructions that can cause the code to panic at runtime must not be used.

First of all, they kind of dodge this later by saying "Array indexing must be properly tested" (else it can panic) -- everyone thinks they write code which is "properly tested".

Also, in rust if we don't want to panic we need to never use array indexing and never use integer division, just to start.

Of course, it's fine to write code which plans to never panic, but I think it would be better to say "Have a formal 'panic plan'", where either one panics early, or tries to never panic.

mbrubeck6y ago

(See also "crash-only software," from the Erlang school of reliability engineering.)

cesarb6y ago

mbrubeck6y ago

Unwind safety is a real issue. I have some personal experience with fixing panic-safety issues in unsafe Rust code:

https://github.com/servo/rust-smallvec/pull/103

I wrote a bit more about it here:

https://users.rust-lang.org/t/c-pitfalls-hard-to-avoid-that-...

1 more reply

baby6y ago

> much easier to find through fuzzing

Note that you can fuzz with debug asserts enabled.

eeZah7Ux6y ago

> Panicking is, in many ways, the best-case scenario for code that contains a bug.

The very opposite: it's among the worst behaviors, short of creating a vulnerability.

From any modern languages we should expect an exception to be raised.

mbrubeck6y ago

https://users.rust-lang.org/t/c-pitfalls-hard-to-avoid-that-...

1 more reply

masklinn6y ago

> First of all, they kind of dodge this later by saying "Array indexing must be properly tested" (else it can panic) -- everyone thinks they write code which is "properly tested".

You're skipping half the recommendation though:

> Array indexing must be properly tested, or the get method should be used to return an Option.

emphasis mine

> Also, in rust if we don't want to panic we need to never use array indexing and never use integer division, just to start.

I mean, that's literally the block above the one you quote:

> Common patterns that can cause panics are:

> * using unwrap or expect,

> * using assert,

> * an unchecked access to an array,

> * integer overflow (in debug mode),

> * division by zero,

> * large allocations,

> * string formatting using format!.

>> Rule LANG-NOPANIC

>> Functions or instructions that can cause the code to panic at runtime must not be used.

The bit you quote is really a additional reminder that array indexing is not panic-safe.

CJefferson6y ago

masklinn6y ago

> Saying "properly tested" doesn't seem useful to me. I'm sure people thing most array overflows in C libraries are "properly tested"

heavenlyblue6y ago

What do you do when your algorithm works with array indexing and you are never supposed to handle than None within the Option?

Ie if you have indexed array out of bounds that’s a developer’s mistake and not a “condition to be handled up the stack”.

It’s ok to panic if you agree that it’s better to panic rather than end up in a state that will probably cause even more issues down the line

masklinn6y ago

> What do you do when your algorithm works with array indexing and you are never supposed to handle than None within the Option?

That's where the first clause comes into play:

> Array indexing must be properly tested

Although

> Ie if you have indexed array out of bounds that’s a developer’s mistake

Which developer would be my question.

> should never use functions or instructions that can fail and cause the code to panic

which, assuming the word is used in the RFC 2119 sense means there can be case where calling panic-ing functions is the right thing to do.

SkyMarshal6y ago

You're skipping half the recommendation though:

> Array indexing must be properly tested, or the get method should be used to return an Option.

http://learnyouahaskell.com/a-fistful-of-monads#getting-our-...

https://en.wikibooks.org/wiki/Haskell/Understanding_monads/M...

3 more replies

baby6y ago

There’s a number of functions that panic in the std if misused, for example copy_from_slice

masklinn6y ago

Yes?

Paul-ish6y ago

dathinab6y ago

I can only agree diesel had (maybe still has) a serve bug around connection reuse and panics.

Sure panics across FFI boundaries can't be done nice at all, but this can still be handled by making sure there is no FFI boundary insider of an error kernel, just in between.

a13692099936y ago

You don't need unwind safety (for anything, ever); a error kennel is more commonly known as a "process"; when a panic occurs

> you always kill the whole [process] in which the panic occurred and don't share mutable state between [process]s

Processes are designed for this, including making it very hard to accidentaly share mutable state (using eg mmap(MAP_SHARED) explicitly).

1 more reply

baby6y ago

No，if you are writing secure code then a panic is a denial of service. Their point is that you should never panic, unless you do reach an unrecoverable error.

Paul-ish6y ago

baby6y ago

Yeah that's what is lacking from my comment: expect panics to happen and attempt to gracefully recover whenever that happens. This is not easy though.

jerf6y ago

(This is neither direct disagreement nor agreement, but a sort of elaboration of my own.)

gardaani6y ago· 12 in thread

I'm having problems fulfilling this requirement in my libs: "Crates providing libraries should never use functions or instructions that can fail and cause the code to panic."

The Rust standard library Vec, HashMap etc. can cause a panic in Rust, if the device (such as a mobile phone with a small memory) runs out of memory.

C and C++ standard libraries (malloc, std::vector, std::map..) can handle those situations by returning null or throwing an exception.

_0w8t6y ago

https://www.youtube.com/watch?v=ARYP83yNAWk mentioned that the notion that C++ could handle out-of-memory was a purely theoretical wishful thinking. In practice it could not.

masklinn6y ago

Yes that is one of the primary failures of Rust at the moment: to my knowledge it currently has no good way to safely manage allocation failures (it also has serious issues with stack overflows).

This is an issue with all heap-allocating construct, not just collections but also Box or Rc.

[0] https://github.com/rust-lang/rust/issues/32838

steveklabnik6y ago

> Yes that is one of the primary failures of Rust at the moment: to my knowledge it currently has no good way to safely manage allocation failures

So, sort of yes and sort of no.

rectang6y ago

I think Rust got it right, here.

Dealing with allocation failure gracefully is hard and requires a lot of extra code.

cesarb6y ago

a13692099936y ago

gardaani6y ago

Actually, I have found FallibleVec written by Mozilla [1] but I haven't found anything for other containers, yet.

[1] https://github.com/mozilla/mp4parse_fallible

baby6y ago

Honestly all C programs that fail to allocate panic as well. What else can you do?

1 more reply

Cyph0n6y ago

Oh, I never even considered this... But how would Rust ever be able to signal allocation failures in such cases?

steveklabnik6y ago

Same as any other failure; functions that may try to allocate memory and fail will return a Result<T, E> instead of just a T.

swiftcoder6y ago

2 more replies

Sharlin6y ago

Eg. with Vec, every operation that may trigger reallocation should return Result.

imjasonmiller6y ago· 2 in thread

The context that I was missing at first:

I didn't know about "cargo-outdated". I liked having "npm outdated" within the JavaScript ecosystem, so I'll give this a try.

1. https://en.wikipedia.org/wiki/Agence_nationale_de_la_s%C3%A9...

jamwaffles6y ago

`cargo-outdated` is mint. If you haven't already, consider installing `cargo-edit` too! It adds `cargo add`/`cargo rm` to add/remove crates, as well as `cargo upgrade` to bump semver versions.

https://github.com/killercup/cargo-edit

masklinn6y ago

There's also cargo-crev which is pretty neat.

maeln6y ago

This is published by the ANSSI[1], National Cybersecurity Agency of France. They do quite a good job at publishing security guide.

[1] https://www.ssi.gouv.fr/en/

dathinab6y ago

I found at least one problematic section when scanning:

> The environment variables RUSTC, RUSTC_WRAPPER and RUSTFLAGS must not be overriden when using Cargo to build project.

This is simply not true at all. Mainly build cache systems like sccache work by wrapping rustc, there is no reason why using a build cache should be forbidden.

(Through currently rust support of sccache is still not perfect and there are some limitations, doesn't change that the rules are to broad.)

Like always the important think is that you understand what you do and want implications it had.

j / k navigate · click thread line to collapse