When invariant violations or mistakes by programmers (aka bugs) are detected, the program should halt as it is in an inconsistent state and continuing could be very dangerous (think privacy/security/data corruption). Otherwise, don't halt (handle it or have the caller handle it).
But the same kinds of failures might not be reasonably expected in other circumstances. I wouldn't expect the internal configuration files of an application to be corrupted in reasonable operation, and therefore it makes sense to panic if they are, even if the cause is an I/O operation on a local disk, or parsing some JSON or TOML or INI or whatnot file.
One implication of this is that it needs to be easy for any error system to promote an "expected error" into an "unexpected error", which is what unwrap/expect does. The recoverable/unrecoverable error distinction suggests that there ought to be no reason to do this, but there is absolutely a reason to do so: what category an error falls into is ultimately decided by the context of the error, not the generation of the error itself.
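As a rough sketch of that point (the function names and the port example are made up for illustration), the same parse failure can be an expected error in one context and a bug in another, and `expect` is what does the promotion:

```rust
use std::num::ParseIntError;

/// Parsing reports failure; it does not decide how serious the failure is.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// Context 1: the value comes from a user, so a bad value is an expected error
/// that gets turned into a message for them.
fn port_from_user(input: &str) -> Result<u16, String> {
    parse_port(input).map_err(|e| format!("please enter a port number: {e}"))
}

/// Context 2: the value is a constant shipped with the program, so a bad value
/// is a bug. `expect` promotes the error into a panic.
fn built_in_port() -> u16 {
    const DEFAULT_PORT: &str = "8080"; // hypothetical built-in default
    parse_port(DEFAULT_PORT).expect("built-in default port must be a valid u16")
}

fn main() {
    println!("{:?}", port_from_user("80a"));
    println!("{}", built_in_port());
}
```

Note that `parse_port` itself never panics; only the caller that knows the context decides.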
There are unexpected errors for sure. For example "StackOverflowError" which could be thrown from any method call.
There should be an unexpected-error handler which does some sane thing in the given circumstances. Usually it involves logging the error and its details (stack trace, maybe something else) and returning some kind of generic error to the caller (e.g. HTTP 500).
But the thing is, this handler is very often suitable for handling errors that I expect but don't want to bother handling. For example, those errors might be rare enough that writing code would be a net negative (every line of code is a maintenance burden, and error handling code is a maintenance burden squared). And I'm totally OK with those errors being handled in a generic way.
Even if it's user input, sometimes. For example, the user could be me, and I know the input format. I don't want to write error handling for myself; all I need is to prevent data corruption. Getting HTTP 500 InvalidNumberFormatException is totally fine in some situations.
And the language should provide means for writing that kind of code. At least that's my opinion, and that's what I truly miss in languages with explicit error handling on every function call.
You might call it lazy coding. I call it reasonable coding.
Exceptions for the win!
The way I've often heard this phrased is "exceptions are for exceptional behavior", and it's always rubbed me the wrong way a bit (although maybe this is partially just because I don't think wordplay is a sufficient argument to do something; I've made similar arguments in the past to friends who sang the praises of "no-shave November" and "thirsty Thursdays"). From digging a bit deeper when I've heard this opinion espoused, it seems like it mostly boils down to the fact that exceptions tend not to be as efficient as happy-path code, so using them for circumstances that are too common is not going to lead to good performance. I guess I don't really find this a subtle enough concept to warrant introducing another abstraction layer into the discussion, especially one that's much vaguer like "is this behavior expected?" If there's a performance concern, I think it's much better addressed directly rather than shifting the discussion to a proxy.
It makes some sense, but often, code should do relatively little and process should do most of the handling. Recovering from I/O errors and properly testing the recovery code can take huge amounts of time and effort.
Often, aborting the program and rerunning it or even restarting a service is the best way to handle these because properly handling them in code costs time better spent on other things. Logging a decent error message and aborting may be the better choice.
But of course, that depends on the use case. Databases need lots of recovery code for I/O errors and deadlock recovery, for example, even for cases that occur maybe once every year.
(And yes, process nowadays is often automated, making it code again, but IMO, that’s a different kind of code)
Whether an error returned from a fn is expected or unexpected has to be a property of the fn signature and language conventions in isolation. If semantic or other locally-unknowable details influence fault classification and error control flow at call sites, your program becomes unmaintainable over time.
So like if you have fn parse_config that takes a string, or a file descriptor, or whatever -- it should return a Result<Config, Error> and yield an error for any un-parseable input. There is no reason this fn should ever panic.
IMO -- call stacks are sacrosanct, making control flow visible and obvious is one of if not the most important thing to optimize in nontrivial programming contexts.
By contrast, you should give up parsing a JSON if it's a config file read on startup but probably not if it's user input.
Well it's not always the case. There are situations in which if you detect errors you want the program to continue running, and have only that particular functionality to fail.
I tend to write resilient code, since I work in embedded systems and what you never want is the system to crash. Halting a CPU on an invariant violation (i.e. an assert failing) is something useful for debugging (you trigger the debugger and you then analyze why it happened), but something you generally don't want in production.
Better to have a ton more checks and, in case of an invariant violation (which may result from a programmer mistake, but there is always the possibility of hardware memory corruption errors), to return an error and handle it in some way (for example, restart the task that returned the error, trying to go back to the last working state).
Yes, like a web server. If a request handler fails by panicking, in a Rust program, you catch the panic, respond with a 500 error and log the panic somewhere. But you continue serving other requests.
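For the record, here is a minimal sketch of that pattern using `std::panic::catch_unwind`. The `handle` and `serve` functions and the response strings are invented for the example; a real server would hook this into its framework, and none of it works if the binary is built with `panic = "abort"`.

```rust
use std::panic;

/// Hypothetical request handler with a bug: it panics when the path has no
/// resource segment (e.g. the empty path).
fn handle(path: &str) -> String {
    let resource = path
        .split('/')
        .nth(1)
        .expect("path has no resource segment");
    format!("200 OK: {resource}")
}

/// Runs the handler and turns a panic into a generic 500 response, so one
/// failing request does not take the other requests down with it.
fn serve(path: &str) -> String {
    match panic::catch_unwind(|| handle(path)) {
        Ok(response) => response,
        Err(_) => {
            // The default panic hook has already printed the panic message.
            eprintln!("handler panicked for path {path:?}");
            "500 Internal Server Error".to_string()
        }
    }
}

fn main() {
    for path in ["/users", "", "/orders"] {
        println!("{}", serve(path));
    }
}
```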
I talked about this in the blog post.
The problem with your strategy is that it requires you to be aware of your own mistakes. That doesn't sound like a robust strategy, unless you're investing huge resources into sophisticated tooling and have drastically restricted the expressivity of your programming environment. That exists and is fine, and I even addressed that in the blog post too.
That might be okay too depending on what your system is. I've had a cell phone (the monochrome dumb kind) display an assertion error at me, that was kind of cool. A screw was coming loose, so there were hardware issues. Worst case, I'd have to bring it to the store and get it repaired or replaced; there was no threat of injury or large monetary damage.
- unactionable invariant violation (poisoned mutex, hard memory errors): crash immediately, something that should never happen happened and there's no way to either handle or present the error to the user in a meaningful way.
- unactionable (at the call site) but normal errors (couldn't open a file, disconnected from the remote end of a connection, etc): these need to be propagated up to where they can be turned into actionable information for a user, ideally. This is rarely a thing the call site where it happened can usefully do.
- immediately actionable and normal errors (user input didn't validate, file user wanted to open doesn't exist, connection failed but can be retried with a backoff, etc). These need to be handled at the call site or maybe one or two levels up.
You need an exception-like mechanism (or at least a process for emulating one, a la Go's multiple return values or C's errno) to handle the second case; you often want it for the third case, but it never really makes sense to use it for the first.
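In Rust terms, the three cases above might look roughly like this. It's only a sketch: the function names, the settings file and the default volume are all hypothetical.

```rust
use std::fs;
use std::io;
use std::sync::Mutex;

/// Case 1: a poisoned mutex is an invariant violation, so panic right away.
fn bump(counter: &Mutex<u64>) -> u64 {
    let mut guard = counter.lock().expect("counter mutex poisoned");
    *guard += 1;
    *guard
}

/// Case 2: a normal error this function cannot act on. `?` propagates it up
/// to a caller that can turn it into something meaningful for the user.
fn read_settings(path: &str) -> io::Result<String> {
    let text = fs::read_to_string(path)?;
    Ok(text.trim().to_string())
}

/// Case 3: invalid user input is handled at the call site, here by falling
/// back to a default.
fn volume_from_input(input: &str) -> u8 {
    match input.trim().parse::<u8>() {
        Ok(v) if v <= 100 => v,
        _ => 50, // hypothetical default
    }
}

fn main() {
    let counter = Mutex::new(0);
    println!("{}", bump(&counter));
    match read_settings("/nonexistent/settings.txt") {
        Ok(text) => println!("settings: {text}"),
        Err(e) => eprintln!("could not load settings: {e}"),
    }
    println!("{}", volume_from_input("loud"));
}
```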
That said, I think in non-test rust code you should use expect instead of unwrap, because sometimes invariants do trip and that little tiny extra bit of info can make a huge difference to resolving it.
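To see what that extra bit of info looks like, here is a small sketch that captures the panic message from both variants. `panic_message` is a helper I made up for the demo, and the "retry count" wording is invented too.

```rust
use std::panic;

/// Runs `f` and, if it panics, returns the panic message as a `String`.
/// (Hypothetical helper, only here to make the messages visible.)
fn panic_message<F: FnOnce() + panic::UnwindSafe>(f: F) -> Option<String> {
    let payload = panic::catch_unwind(f).err()?;
    payload
        .downcast_ref::<String>()
        .cloned()
        .or_else(|| payload.downcast_ref::<&str>().map(|s| s.to_string()))
}

fn main() {
    // `unwrap` only reports that *some* Result was an Err, plus the error.
    let with_unwrap = panic_message(|| {
        "4x2".parse::<u32>().unwrap();
    });
    // `expect` puts the violated assumption in front of the error.
    let with_expect = panic_message(|| {
        "4x2".parse::<u32>().expect("retry count is numeric");
    });
    println!("{with_unwrap:?}\n{with_expect:?}");
}
```

Both messages end with the `Debug` form of the error; the `expect` one additionally starts with the text you wrote.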
It is very clearly a choice, even if many people are deciding by default. By not tackling an issue, you've chosen to have that issue.
I do think that APIs that "overpromise" by not returning the errors they do not handle to the caller, and instead halt or throw an exception, do their users a disservice in the long-run. These just become undocumented cases that bite you later on. Better libraries have all these conditions baked into the API itself.
Also, when using a library, I don't want a bug in it to bring my program down. Otherwise I am forced into workarounds like running the library in a child process, starting it from the main process and checking whether it fails or succeeds, which is bad for performance and ugly.
There's either I/O errors - or there's logic errors. A failure with logic should nuke due to the app being in an inconsistent state; trust is lost. An I/O error should fail softly.
nah. in GUI apps, for instance, you want a failure in the logic of a sub-sub-function to just show the user a "whoops" error when the button that triggered the action was clicked, not nuke the app (unless you hate your users). e.g. imagine a 3D software which lets you do mesh operations - the user clicks on the "Smooth the mesh" button somewhere. The programmer forgot to handle a division by zero in some degenerate case of the smoothing computation, which ends up leading to an exception: a value becomes zero, someone used unsigned integers for n in an "n - 1" computation, which ends up in a call to array_of_floats.resize(0xffffffffffffffff) (and a likely std::bad_alloc being thrown if you're in C++).
The original mesh is unchanged as the operation waits until the computation is complete to replace the old mesh with the new.
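A rough Rust version of that scenario, to show why the old mesh survives: the operation builds a new mesh and the swap only happens on success. The `Mesh` type, the deliberately buggy `smooth` and the error text are all made up, and the panic is caught with `catch_unwind` as a stand-in for the C++ exception.

```rust
use std::panic;

/// Hypothetical mesh: just a list of vertex heights.
#[derive(Debug, Clone, PartialEq)]
struct Mesh {
    heights: Vec<f32>,
}

/// Builds a smoothed copy of the mesh. Deliberate bug: the degenerate empty
/// mesh is not handled, so `h[0]` panics.
fn smooth(mesh: &Mesh) -> Mesh {
    let h = &mesh.heights;
    let mut out = vec![h[0]]; // panics on an empty mesh
    for w in h.windows(3) {
        out.push((w[0] + w[1] + w[2]) / 3.0);
    }
    if h.len() > 1 {
        out.push(h[h.len() - 1]);
    }
    Mesh { heights: out }
}

/// Button handler: the current mesh is replaced only if smoothing finished.
/// On a panic the user gets an error message and keeps the old mesh.
fn on_smooth_clicked(current: &mut Mesh) -> Result<(), String> {
    let snapshot: &Mesh = current;
    match panic::catch_unwind(|| smooth(snapshot)) {
        Ok(new_mesh) => {
            *current = new_mesh;
            Ok(())
        }
        Err(_) => Err("Smooth failed; the mesh was left unchanged".to_string()),
    }
}

fn main() {
    let mut mesh = Mesh { heights: vec![0.0, 3.0, 0.0] };
    println!("{:?} {:?}", on_smooth_clicked(&mut mesh), mesh);
    let mut empty = Mesh { heights: vec![] };
    println!("{:?} {:?}", on_smooth_clicked(&mut empty), empty);
}
```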
If you ever decide to crash in this situation I am sure you will have great reviews on 3D modeling software comparisons.
This works for me really well to get a prototype going, but then have a solid program later. Because unwrap/expect is so easy to search for, you can really postpone the error handling until later.
Anyway i wouldn't say "plenty", but i did come across crates (parsers :/) that would unwrap on malformed input. the workaround is to encapsulate their use in a catch_unwind.
for the record, i had a similar issue in a c++ lib where the author elected to abort on the unsupported input, so i'm somewhat thankful that the idiomatic mechanism is panic (which is recoverable if need be) in Rust
    .map_err(|_| unimplemented!()).unwrap()
or something. (I hope you get the gist of that, as I'm 99% sure that code doesn't compile. Error(E) -> !, Ok(T) -> T)
    result.unwrap_or_else(|_| todo!())
(unimplemented! is for missing functionality in a given version, todo! is for missing functionality during development)
But using unwrap for these kinds of TODO is OK; code review won't let unjustified unwraps pass.
I don't find myself in that position too frequently. Certainly not enough to warrant two identical but differently named functions.
In that case, I would suggest coming up with your own pattern. Perhaps an unwrap() with a FIXME comment. Or an expect("FIXME").
It would have been fairly trivial to set up generic constraints specifying if a read, write, or read-write ordering semantic is expected and to fail to compile if it wasn’t met.
It doesn’t need const at all, though. You just need three traits and either first class enum variants as types or a pseudo enum (mod/struct Ordering with ZST structs Acquire, Release, etc)
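Something like the following is what I have in mind. To be clear, this is a sketch of a hypothetical wrapper, not anything in std: `TypedAtomicU64`, the marker structs and the trait names are all invented, and I've added a small `ToOrdering` helper trait on top of the three.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Zero-sized marker types standing in for the `Ordering` variants.
struct Relaxed;
struct Acquire;
struct Release;
struct AcqRel;
struct SeqCst;

/// Maps a marker type back to the runtime `Ordering` value.
trait ToOrdering {
    fn ordering() -> Ordering;
}
impl ToOrdering for Relaxed { fn ordering() -> Ordering { Ordering::Relaxed } }
impl ToOrdering for Acquire { fn ordering() -> Ordering { Ordering::Acquire } }
impl ToOrdering for Release { fn ordering() -> Ordering { Ordering::Release } }
impl ToOrdering for AcqRel { fn ordering() -> Ordering { Ordering::AcqRel } }
impl ToOrdering for SeqCst { fn ordering() -> Ordering { Ordering::SeqCst } }

// The three traits: which markers are valid for reads, writes, and
// read-modify-write operations.
trait LoadOrdering: ToOrdering {}
trait StoreOrdering: ToOrdering {}
trait RmwOrdering: ToOrdering {}

impl LoadOrdering for Relaxed {}
impl LoadOrdering for Acquire {}
impl LoadOrdering for SeqCst {}
impl StoreOrdering for Relaxed {}
impl StoreOrdering for Release {}
impl StoreOrdering for SeqCst {}
impl RmwOrdering for Relaxed {}
impl RmwOrdering for Acquire {}
impl RmwOrdering for Release {}
impl RmwOrdering for AcqRel {}
impl RmwOrdering for SeqCst {}

/// Hypothetical wrapper whose methods only accept orderings valid for them.
struct TypedAtomicU64(AtomicU64);

impl TypedAtomicU64 {
    fn new(v: u64) -> Self { Self(AtomicU64::new(v)) }
    fn load<O: LoadOrdering>(&self, _order: O) -> u64 { self.0.load(O::ordering()) }
    fn store<O: StoreOrdering>(&self, v: u64, _order: O) { self.0.store(v, O::ordering()) }
    fn fetch_add<O: RmwOrdering>(&self, v: u64, _order: O) -> u64 {
        self.0.fetch_add(v, O::ordering())
    }
}

fn main() {
    let x = TypedAtomicU64::new(0);
    x.store(5, Release);
    x.fetch_add(1, AcqRel);
    println!("{}", x.load(Acquire));
    // x.load(Release); // does not compile: `Release` is not a `LoadOrdering`
}
```

The cost is that an ordering picked at runtime no longer fits this API without some extra escape hatch.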
    use std::sync::atomic::*;

    fn main() {
        let x = AtomicU64::new(0);
        // The ordering is only known at runtime (needs the `rand` crate).
        let ordering = if rand::random() {
            Ordering::Relaxed
        } else {
            Ordering::SeqCst
        };
        x.fetch_add(1, ordering);
    }

The `.expect()` for regex for example would say what the regex is matching for.
I think it'd be desirable to have a `.unwrap_with_context("Context: {}")`, and then you'd get `Context: Inner Panic Info`.
What 'expect()' message would you write for this regex? https://github.com/BurntSushi/ucd-generate/blob/6d3aae3b8005...
I think 'unwrap()' there is perfectly appropriate.
> I think it'd be desirable to have a `.unwrap_with_context("Context: {}")`, and then you'd get `Context: Inner Panic Info`.
Why?
With this, even without a backtrace, you can work out what happened.
Without it, you just know that some regex somewhere is invalid.
    impl<T, E: std::fmt::Debug> Result<T, E> {
        pub fn unwrap(self) -> T {
            match self {
                Ok(t) => t,
                Err(e) => panic!("called `Result::unwrap()` on an `Err` value: {:?}", e),
            }
        }
    }

(And, conversely, that it's not fine to use it to avoid doing real error handling)
People advocate for this. After publishing this article, it almost seems like people are more confused than I thought. Idk.
Anyway, no, this blog is not meant to be controversial. It is meant to untangle knots.
But the language has made it too easy to unwrap/expect and panic.
This escape hatch should exist, for sure. But it ought to be more explicit, and more of a pain.
It is far too easy to reach for this tool.
I ask because I suspect you've got a bit of motte and bailey going on here. The motte is "hey let's make unwrap/expect more verbose because we want people to be REALLY sure," but the bailey is "let's actually make everything that can panic a lot more verbose and totally change the character of the language and make it a lot less practical."
I'd encourage you to read the "lint" section near the end: https://blog.burntsushi.net/unwrap/#should-we-lint-against-u...
IMHO there's a qualitative difference in the programmer's expectations when indexing a slice vs calling a function which has been explicitly written to return an error condition. Years of convention causes us to expect indexing errors and to write defensively. But the implied contract of a Rust function with a Result is that the user should probably do something with the result other than panic in most cases.
I agree that panicking is a legit option. And I agree with the scenarios laid out in the article. And I also don't think lint is the right way to handle it.
But I'm currently in a codebase that is full of unwraps all over -- which the developers did for expediency "get this thing shipped" reasons -- and that (and other codebases I've seen) is what leads me to the conclusion that the ergonomics of putting unwrap right out there in our faces aren't ideal.
Hell, even calling it "result_or_panic" would have perhaps made casual users of it pause and think about what they were doing. There are likely syntactical tools that could have been put in place to really make the user think before creating a panic.
(FWIW safety isn't my primary reason for preferring Rust. I'd be fine with "C++ with an ML-style type system." The general tamping down of footguns is great, though)
It absolutely is. Modern language design should be discouraging getting an element by index; there are usually better alternatives e.g. iterating through a datastructure, or using combinators like zip to build the datastructure/view you need.
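A tiny example of what I mean, with a dot product written both ways (function names are just for illustration):

```rust
/// Index-based version: `b[i]` panics if `b` is shorter than `a`.
fn dot_indexed(a: &[f64], b: &[f64]) -> f64 {
    let mut sum = 0.0;
    for i in 0..a.len() {
        sum += a[i] * b[i];
    }
    sum
}

/// Iterator version: `zip` pairs the elements up and stops at the shorter
/// slice, so there is no index that can go out of bounds.
fn dot_zipped(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    println!("{} {}", dot_indexed(&a, &b), dot_zipped(&a, &b));
}
```

The trade-off is that `zip` silently truncates on a length mismatch, so if unequal lengths are a bug you still want an explicit check somewhere.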
    // use std::env;
    env::set_var("RUST_BACKTRACE", "full");

What would you recommend? Forking and execing yourself to set the env var? backtrace::enable_full() or something to that effect would be nice.
If there was some sort of signal to mark a function as panicking (& vice versa), that would be nice.
This reads a little like flamebait.
Memory unsafety is bad. panic is _not_ a memory unsafe operation, it does not result in memory problems.
This post is not related to memory safety in any way.
Rust is a minefield of bear traps laid by experts, and I fear for the future of our industry if Go and Java programmers are required by some quirk of network effects or first mover advantage or whatever to start programming (badly) in Rust.
> `assert!(!xs.is_empty(), "expected parameter 'xs' to be non-empty")`.
This panics with
> thread 'main' panicked at 'expected parameter 'xs' to be non-empty', src/main.rs:79:5
Without the custom message it's
> thread 'main' panicked at 'assertion failed: !xs.is_empty()', src/main.rs:79:5
Given that panics should be for bugs, i.e. interpreted by developers, I'd say the second message is clear enough and a custom message just adds noise in the source code.
> What do I mean by “API simplicity?” Well, this panic could be removed by moving this runtime invariant to a compile time invariant. Namely, the API could provide, for example, an AhoCorasickOverlapping type, and the overlapping search routines would be defined only on that type and not on AhoCorasick. Therefore, users of the crate could never call an overlapping search routine on an improperly configured automaton. The compiler simply wouldn’t allow it.
> But this adds a lot of additional surface area to the API. And it does it in really pernicious ways. For example, an AhoCorasickOverlapping type would still want to have normal non-overlapping search routines, just like AhoCorasick does. It’s now reasonable to want to be able to write routines that accept any kind of Aho-Corasick automaton and run a non-overlapping search. In that case, either the aho-corasick crate or the programmer using the crate needs to define some kind of generic abstraction to enable that. Or, more likely, perhaps copy some code.
> I thus made a judgment that having one type that can do everything—but might fail loudly for certain methods under certain configurations—would be best. The API design of aho-corasick isn’t going to result in subtle logic errors that silently produce incorrect results. If a mistake is made, then the caller is still going to get a panic with a clear message. At that point, the fix will be easy.
What I gather from this is that the author chose to define a type (call it A) with an attribute that when set in a certain way will cause certain functions to panic. This was preferred to the alternative (two types, A and B) with functions specific to each and where panic was not possible.
This kind of design decision comes up a lot, so understanding the reasoning here could be helpful in a lot of situations. Unfortunately, the passage is less than clear due to lack of source code inline and the highly-specific nature of the problem. An example with source code using more accessible algorithms might be an improvement here.
That said, I'm skeptical that the full range of approaches was considered. I sometimes find that the presence of unwrap is a smell pointing to types that have not been fully fleshed out.
As an extreme case, consider a struct whose fields contained diverse data (numbers, colors, enumerated values), but which are all defined as strings. It will be very easy to put this struct into an inconsistent runtime state because nothing can be checked at compile time. The type itself is anemic. Replacing strings with more constrained types eliminates opportunities for panic - possibly all of them.
I get that the whole point is "at what cost?" All I'm saying is that the tradeoffs aren't clear from the example in the passage.
An Aho-Corasick automaton can be built in a few different ways. How it's built changes how matches are reported: https://docs.rs/aho-corasick/latest/aho_corasick/enum.MatchK...
It also turns out that some match kinds are more amenable to other types of searches, such as overlapping searches. Overlapping searches report every possible match, but leftmost "match kinds" specifically prune certain matches from the automaton. An overlapping search with a "leftmost" match kind produces weird results that are difficult to characterize.
So, when an automaton is configured with a "leftmost" match kind, you have a choice: allow overlapping searches or disallow them. I chose to disallow them. Once you make that choice, you then must choose whether to disallow them at compile time or disallow them at runtime. I chose runtime, for the reasons stated.
If I chose compile time, then I'd need a new `AhoCorasickOverlapping` type which provides the overlapping search routines in addition to the non-overlapping search routines. Then I could get rid of the overlapping search routines on `AhoCorasick`. I'd then also need to add a new build method[1] to the `AhoCorasickBuilder` that let you build an overlapping automaton.
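To make the two options concrete for readers who don't know the crate, here is a deliberately simplified sketch. It is NOT the aho-corasick API: `Searcher`, `OverlappingSearcher`, this two-variant `MatchKind` and the naive substring searches are all stand-ins invented for illustration.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum MatchKind {
    Standard,
    LeftmostFirst,
}

/// Option A: one type; the configuration invariant is checked at runtime.
struct Searcher {
    kind: MatchKind,
    patterns: Vec<String>,
}

impl Searcher {
    fn new(kind: MatchKind, patterns: &[&str]) -> Self {
        Searcher { kind, patterns: patterns.iter().map(|p| p.to_string()).collect() }
    }

    /// Non-overlapping search: smallest start offset of any pattern.
    fn find(&self, haystack: &str) -> Option<usize> {
        self.patterns.iter().filter_map(|p| haystack.find(p.as_str())).min()
    }

    /// Reports (pattern index, start) for matches of every pattern, even
    /// where different patterns overlap. Panics loudly if misconfigured.
    fn find_overlapping(&self, haystack: &str) -> Vec<(usize, usize)> {
        assert!(
            self.kind == MatchKind::Standard,
            "overlapping search requires MatchKind::Standard"
        );
        let mut hits = Vec::new();
        for (id, p) in self.patterns.iter().enumerate() {
            for (start, _) in haystack.match_indices(p.as_str()) {
                hits.push((id, start));
            }
        }
        hits
    }
}

/// Option B: a second type that can only be built with `Standard`, so the
/// assertion above can never fire through it. In a real design the method
/// would move here and be removed from `Searcher`.
struct OverlappingSearcher(Searcher);

impl OverlappingSearcher {
    fn new(patterns: &[&str]) -> Self {
        OverlappingSearcher(Searcher::new(MatchKind::Standard, patterns))
    }
    // The non-overlapping routine has to be repeated on the new type.
    fn find(&self, haystack: &str) -> Option<usize> {
        self.0.find(haystack)
    }
    fn find_overlapping(&self, haystack: &str) -> Vec<(usize, usize)> {
        self.0.find_overlapping(haystack)
    }
}

fn main() {
    let a = Searcher::new(MatchKind::Standard, &["ab", "b"]);
    println!("{:?} {:?}", a.find("xab"), a.find_overlapping("ab"));
    let b = OverlappingSearcher::new(&["ab", "b"]);
    println!("{:?} {:?}", b.find("xab"), b.find_overlapping("ab"));
}
```

Even in this toy version, option B duplicates `find`, and anything that wants to accept "any searcher" now needs a trait or a copy of the code.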
> That said, I'm skeptical that the full range of approaches was considered. I sometimes find that the presence of unwrap is a smell pointing to types that have not been fully fleshed out.
I am a fallible human. I might be wrong. So sure, be skeptical!
> As an extreme case, consider a struct whose fields contained diverse data (numbers, colors, enumerated values), but which are all defined as strings. It will be very easy to put this struct into an inconsistent runtime state because nothing can be checked at compile time. The type itself is anemic. Replacing strings with more constrained types eliminates opportunities for panic - possibly all of them.
I'm not sure what this is a case of? Like yeah, I agree, that sounds bad?
> I get that the whole point is "at what cost?" All I'm saying is that the tradeoffs aren't clear from the example in the passage.
Understood. Small illustrative examples are hard. Especially API design. API design is a very domain specific thing. I could probably write an entire blog post on the API design of just the aho-corasick crate. It has gone through many iterations and many lessons have been learned. (And I still have at least one more iteration to go.) I tried to distill down one small part of it in order to talk about the idea of not pursuing literally every possible compile time restriction because sometimes keeping invariants maintained at runtime leads to a simpler API. If you accept that principle already, then all is well.
But some people think the entire farm should be bet on pushing every possible thing to a compile time invariant, regardless of the cost. I am not one of those people and I think it leads to bad API design. And it's very relevant to this topic because if you don't push something to compile time, then, well, you probably need a panicking branch somewhere.
[1]: https://docs.rs/aho-corasick/latest/aho_corasick/struct.AhoC...
I think it is good to have some references (the blog post and the book) for when the horde comes after you.