But then people realised that 99% of the time you just want to handle the error by passing it upwards, and so ? was invented.
But then people realised that this loses context of where the error occurred, so now we're inventing call stacks.
So it seems that what people actually want is errors that by default get transferred to their caller and by default show the call stack where they occurred. And we have a name for that...exceptions.
It seems that what we're converging towards is really not all that different from checked exceptions, just where the error type is an enum of possible errors (which can be non-exhaustive) instead of a list of possible exception types (which IIUC was the main problem with Java's checked exceptions).
And exceptions don't have to be slower than putting errors in return values.
(Having said that, I am still not a proponent of exceptions for error handling.)
That’s not non-deterministic. It’s just not statically typed.
With checked exceptions, it's very common for the user to end up with only a cryptic message from a leaf function deep inside something, and that's very hard to interpret.
Having a manual stack of meaningful messages that add context is so nice as a user. Even if you do get the stack trace in a program that threw a deep exception, you typically won't understand anything as a user without access to the code; the stack trace for exceptions is just not meant for human consumption.
This is 100% a reason that I like using SNAFU. The term I use for this is a "semantic stack trace" — a lot of the time, the person experiencing the error doesn't care that it occurred in "foo.rs" or "fn bar()" or "line 123". Instead, they care what the program is trying to do ("open the configuration file", "download the update file").
When I'm putting effort into my errors, I basically never use `snafu::Location` or `snafu::Backtrace`. My error stacks should always be unique — any stack can exactly point to a trace through my program.
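To make the "semantic stack trace" idea concrete without pulling in a crate, here is a minimal std-only sketch. It is not SNAFU's API: `Context`, `ResultExt`, `render_chain`, `read_port` and `load_config` are hypothetical names invented for illustration. Each layer says what it was trying to do, and the chain is walked through `Error::source`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper: what the program was trying to do, plus the
/// lower-level error that stopped it.
#[derive(Debug)]
pub struct Context {
    message: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Hypothetical extension trait that attaches a description to any error.
pub trait ResultExt<T> {
    fn context(self, message: &str) -> Result<T, Context>;
}

impl<T, E: Error + 'static> ResultExt<T> for Result<T, E> {
    fn context(self, message: &str) -> Result<T, Context> {
        self.map_err(|e| Context {
            message: message.to_string(),
            source: Box::new(e),
        })
    }
}

/// Collect the messages of an error chain, outermost first.
pub fn render_chain(err: &dyn Error) -> Vec<String> {
    let mut lines = vec![err.to_string()];
    let mut current = err.source();
    while let Some(e) = current {
        lines.push(e.to_string());
        current = e.source();
    }
    lines
}

fn read_port(raw: &str) -> Result<u16, Context> {
    raw.parse::<u16>().context("parse the port number")
}

pub fn load_config(raw: &str) -> Result<u16, Context> {
    read_port(raw).context("load the configuration")
}
```

Calling `render_chain` on the error from `load_config("not-a-number")` gives three entries: the two descriptions above followed by the standard library's own parse error message. No file names or line numbers are involved.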
You've drawn the wrong conclusion - we don't want that by default. We want to choose. In most cases we'll just return the error to the caller, but we don't want that to be the default, because then we could miss critical points where we didn't want to do that.
Getting a stack trace isn't a distinguishing feature of exceptions; stack traces predate the notion of exceptions. The distinguishing feature of exceptions is that they're a parallel return path all the way back up to `main` that you can ignore if you don't care to handle the error, or intercept at any level if you do. For some contexts I think this is fine (scripting languages), and for other contexts I think that being forced to acknowledge errors in the main return path is preferable.
Suppose we had a version of the ? operator that automatically appended a call stack to the error value returned. Are you saying that that's not "an exception" because I still need to write ? after each fallible function? Or because it's still part of the return type? Or is it specifically only an exception if it works via stack unwinding?
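For what it's worth, something close to that hypothetical operator can be approximated today with `#[track_caller]` and `std::panic::Location`. The sketch below is only an illustration under my own assumptions: `Traced`, the `Trace` trait and the `leaf`/`middle`/`top` functions are made-up names, and the extra `.trace()` call is explicit rather than built into `?`.

```rust
use std::panic::Location;

/// Hypothetical error that records every call site it was propagated from.
#[derive(Debug)]
pub struct Traced {
    pub message: String,
    pub trail: Vec<&'static Location<'static>>,
}

/// Hypothetical extension trait: append the caller's location on the way up.
pub trait Trace<T> {
    fn trace(self) -> Result<T, Traced>;
}

impl<T> Trace<T> for Result<T, Traced> {
    #[track_caller]
    fn trace(self) -> Result<T, Traced> {
        // Read the location here, outside the closure, so that it refers to
        // the code that called `trace()`.
        let here = Location::caller();
        self.map_err(|mut e| {
            e.trail.push(here);
            e
        })
    }
}

fn leaf(fail: bool) -> Result<u32, Traced> {
    if fail {
        Err(Traced {
            message: "leaf failed".to_string(),
            trail: Vec::new(),
        })
    } else {
        Ok(7)
    }
}

fn middle(fail: bool) -> Result<u32, Traced> {
    let n = leaf(fail).trace()?; // first entry in the trail
    Ok(n + 1)
}

pub fn top(fail: bool) -> Result<u32, Traced> {
    let n = middle(fail).trace()?; // second entry in the trail
    Ok(n * 2)
}
```

The error is still an ordinary value in the return type and every propagation point is still visible in the source, which is the part of the question that seems to matter.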
Java's checked exceptions are the worst. Having to declare every exception thrown as a part of your API/ABI makes for brittle, difficult-to-evolve interfaces.
Rust's Result and '?' syntax sidesteps a few of these issues. You can "add" underlying errors to the error return of your function without changing its API/ABI. You don't need to add a bunch of try/catch blocks, cluttering and confusing the code, in order to make sense of this and convert exceptions into whatever your API/ABI specifies. Rust's 'From<>' trait is damn-near magical when it comes to error conversion and propagation.
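As a rough illustration of the `From` plus `?` mechanism, here is a small std-only sketch. `ConfigError`, its variants and `parse_port` are hypothetical names, not taken from any real crate.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for illustration only.
#[derive(Debug)]
pub enum ConfigError {
    /// The port string was not a valid integer.
    BadPort(ParseIntError),
    /// The port parsed, but is below 1024.
    Reserved(u16),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::BadPort(e) => write!(f, "invalid port: {e}"),
            ConfigError::Reserved(p) => write!(f, "port {p} is reserved"),
        }
    }
}

impl std::error::Error for ConfigError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ConfigError::BadPort(e) => Some(e),
            ConfigError::Reserved(_) => None,
        }
    }
}

// This impl is what lets `?` turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

pub fn parse_port(s: &str) -> Result<u16, ConfigError> {
    // `?` calls `From::from` on the error before returning it.
    let port: u16 = s.trim().parse()?;
    if port < 1024 {
        return Err(ConfigError::Reserved(port));
    }
    Ok(port)
}
```

The conversion happens at the `?` with no try/catch block, which is the convenience being described. It is also entirely implicit at the call site.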
I get that not everyone is a functional programming enthusiast, but you can't do FP with exceptions. (Well, you can, via a sort of Try monad like Scala has, but it's error-prone and ugly to deal with.) With Result, you can, and it works seamlessly with the rest of the language and syntax.
I don't think Rust's error model is perfect, but it's miles ahead of what I've worked with in most other languages.
> Java's checked exceptions are the worst. Having to declare every exception thrown as a part of your API/ABI makes for brittle, difficult-to-evolve interfaces.
How is this different, in practice, from how it's done in Rust? You have to evolve your Result error type as well. The exact same concerns exist for both. The difference is that you actually have more choice/freedom with Java: you can choose to wrap all of your API's checked exceptions under one base type (analogous to defining a single error type for Result in Rust) so your function throws a single exception type, or you can have your function signature use an ad-hoc union type of several exception types without the boilerplate of wrapping them in a new type. In fact, many people have requested ad-hoc union types in Rust for a long time, because it's so painful to choose between all of your functions returning the same umbrella error type even though it only truly needs a subset of it vs. defining new mostly-redundant error types for each function in your API.
> Rust's Result and '?' syntax sidesteps a few of these issues. You can "add" underlying errors to the error return of your function without changing its API/ABI. You don't need to add a bunch of try/catch blocks, cluttering and confusing the code, in order to make sense of this and convert exceptions into whatever your API/ABI specifies. Rust's 'From<>' trait is damn-near magical when it comes to error conversion and propagation.
As I mentioned above, you can certainly define a base exception type (and you probably should in many cases) in Java, too. Yes, Java's syntax is fairly verbose, but Java's syntax is verbose for almost all of the language. So, is it the checked exception mechanism that is "bad", or is it just that all of Java is verbose? My take is that checked exceptions are, overall, good, and the syntax to work with them in Java is as tedious as the rest of the language.
Also, as a tangent, I kind of hate `From<>` in Rust. I think people lean on it way too much. It certainly makes the code shorter and "cleaner", but it also makes it harder to understand because of how implicit it is. And it causes people to miss opportunities where they actually could or should handle an error, just because the types happen to line up so that you can use `?`, instead of thinking about the actual local logic.
> I get that not everyone is a functional programming enthusiast, but you can't do FP with exceptions. (Well, you can, via a sort of Try monad like Scala has, but it's error-prone and ugly to deal with.) With Result, you can, and it works seamlessly with the rest of the language and syntax.
Can you elaborate on this? I feel like Scala's Try and Either are almost exactly the same as Rust's Result.
Let’s not talk about panics, shall we?
IMO it just poisoned the well, and now everyone* thinks they don't like checked exceptions, when really they just don't like Java's badly crippled version.
interface Frobinicator<E extends Exception> {
    void frobinicate() throws E;
}
The lack of any kind of caller information when creating an error makes it quite important to write decent error messages, which I think is actually quite hard to do.
At the same time I think it depends on what you're building: a library should have good errors (ideally well-typed ones too), but in an application you'd benefit from adding logging at each point in the stack (which can then contain caller information like file and line number) rather than just doing the logging at a system boundary; maybe set it at debug level. Then use tracing for the rest of it (for extra visibility in stuff like Sentry).
At least, I feel like that's how you'd be encouraged to do it in Go considering the opinions of Go's creators.
Handling results with map, map_err and .ok is way easier to follow than the minimum 4 lines you have to add in Java to do anything about a checked exception (try {} catch {}).
Explicit error handling/ignoring/passing is way better than implicit, so the direction of checked exceptions is good.
The debate is not really checked exceptions vs Result, it's try/catch vs map_err (and friends). And I will always choose the latter.
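For readers who haven't used those combinators, a small std-only sketch of what that looks like. The function names are made up for the example.

```rust
use std::num::ParseIntError;

/// Parse a percentage, rewriting the low-level parse error into a message.
pub fn parse_percent(s: &str) -> Result<u8, String> {
    s.trim()
        .parse::<u8>()
        .map_err(|e: ParseIntError| format!("bad percentage {s:?}: {e}"))
        .and_then(|n| {
            if n <= 100 {
                Ok(n)
            } else {
                Err(format!("{n} is out of range"))
            }
        })
}

/// `map` transforms only the success value; an error passes through untouched.
pub fn as_fraction(s: &str) -> Result<f64, String> {
    parse_percent(s).map(|n| f64::from(n) / 100.0)
}

/// `.ok()` deliberately discards the error so a default can be used instead.
pub fn percent_or_default(s: &str) -> u8 {
    parse_percent(s).ok().unwrap_or(50)
}
```

Each of the three choices (rewrite the error, pass it along, ignore it) is one method call at the point where the decision is made.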
Phrased another way, Java's syntax is fairly verbose for everything, not just for try-catching to handle exceptions.
The relevant improvement new languages (Rust, Zig, Swift?) bring over old is making it explicit at the callsite what actions throw and how they're composed
This seems like a gross exaggeration
> So it seems that what people actually want is errors that by default get transferred to their caller
Hell no
First, I assert that Java's checked exceptions are a solidly good feature. Of course it has flaws. The whole rest of the language is also full of flaws, so that's not surprising.
Second, I assert that there are two things that have caused the vast majority of hate toward Java's checked exceptions: programmers not being taught/shown how and when they're intended to be used, and that oft-circulated interview transcript from 2003 where Anders Hejlsberg asserts that checked exceptions are a language design "dead end". I don't think he was right in 2003, and I especially don't think the opinion is correct today in light of how much strong static typing has really gained favor with the programming community. But, that opinion really took off and we spent years and years seeing that assessment repeated as a truism, which I think is why it took so long to finally start experimenting with statically typed failure modes again (e.g., Rust and Swift).
Now, here's where I'll get controversial about Rust error handling. I'll try really hard to keep this from turning into an entire dissertation, but I'll elaborate if anyone asks.
It is often a mistake to implement the `From` trait for error types and use the `?` operator everywhere. Error types in an API need to be aware of the context in which they occur, so just converting by type only often doesn't make sense. You may encounter a `FooError` type while your app is doing totally different things, so it's likely that not every `FooError` occurrence means the same thing to whoever is calling into your code. Also, sometimes you can actually handle an error, and getting into the muscle memory habit of just tacking `?` on to everything can lead to mistakenly propagating errors that you might have better handled by doing something else (including perhaps panicking).
There does seem to be a trend toward automatically adding stack traces in Rust errors. This is completely misguided, IMO. And this may be my MOST controversial opinion: stack traces almost *never* belong in a `Result<>` error type. Result types should be relevant to your "domain" (borrowing the term from "Domain Driven Design" even though I do NOT advocate for DDD in general).
Think about it this way: designing an API is about abstraction. So if you write an integer division function that takes two arguments and divides them, it might return `Result<i64, DivideByZero>`. If the caller passes in a 0 divisor, then what business is it of theirs to see what your private functions are called, how many of them are called, and what line of your file they were defined on? That's the leakiest of leaky abstractions.
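Spelled out, that division API might look like the sketch below. The `DivideByZero` type name comes from the signature above; the function name `divide` and the derives are my own choices for the example.

```rust
/// The one failure the caller is expected to handle. It carries no file
/// names, line numbers or stack frames.
#[derive(Debug, PartialEq, Eq)]
pub struct DivideByZero;

pub fn divide(dividend: i64, divisor: i64) -> Result<i64, DivideByZero> {
    if divisor == 0 {
        return Err(DivideByZero);
    }
    // Caveat: i64::MIN / -1 overflows and still panics. This sketch does
    // not model that case in the error type.
    Ok(dividend / divisor)
}
```

The error tells the caller exactly what they did wrong and nothing about how the function is implemented.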
You might be thinking: "But, if I see a result/error value that I didn't expect while running my program, the stack trace will help me track down the issue!" Yeah, no kidding. So, let's also start adding stack traces to our successful values, too! After all, if I call my division function and get back a `Result::Ok` with a weird number that I didn't expect, I might want to trace that back, too, right? (This suggestion is sarcastic to prove a point. It should, hopefully, sound ridiculous to add stack traces to every return value from every function.)
The issue is that Rust's Result (and Java's checked exceptions) require a different paradigm. A Result is in the type signature because it's part of your domain's API design. It's just values. It's not *for* debugging. You use a debugger for that or programmatically panic when something is truly unexpected and get the stack trace from that.
Which leads to the corollary to the previous controversial opinion: Rust has unchecked exceptions; they're called panics and they are 100% *okay to use* in the vast majority of applications that the vast majority of day-job programmers work on.
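To illustrate the "unchecked exception" comparison: a panic unwinds past callers that never mention it in their signatures, and it can be intercepted at a boundary with `std::panic::catch_unwind`. A minimal sketch with made-up function names, assuming the default `panic = "unwind"` setting (with `panic = "abort"` nothing can be caught):

```rust
use std::panic;

/// Indexing panics on an out-of-bounds index; nothing in the signature
/// says so.
pub fn element_at(values: &[i32], i: usize) -> i32 {
    values[i]
}

/// Intercept the panic at a boundary, much like a top-level catch block.
pub fn run_isolated(values: Vec<i32>, i: usize) -> Option<i32> {
    panic::catch_unwind(move || element_at(&values, i)).ok()
}
```

The default panic hook still prints its message to stderr when the index is out of bounds. `catch_unwind` only stops the unwinding.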
Obviously, context matters, and there are some places where panicking is unacceptable. But, Result is for expected domain failures. Panics are for programmer errors and unrecoverable constraint violations. And I'm not advocating for panics to be "lazy". Rust code that refuses to ever panic (as far as they know, but I hope they aren't indexing any vecs/arrays just in case!) usually leads to overly polluted error types where it ends up being difficult to understand what errors are actually meaningful and what errors are never actually going to happen. Instead of inspecting errors and figuring out which to handle and how, I've seen things just snowball into a giant mess of nested enums with sometimes redundant error "branches" and missed opportunities to actually handle some cases. If you, as the programmer, know for sure that you just added something to a HashMap earlier in your function and you know you didn't remove it, then for the love of all things sacred, just write `map.get("my-key").unwrap()` (or `.expect("message")`--whatever) instead of making the caller have to consider an error that will never happen, is not their fault, and that they can't do anything about!
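A tiny sketch of the HashMap case, with a made-up function and key, just to show the shape of it:

```rust
use std::collections::HashMap;

pub fn default_timeout_ms() -> u64 {
    let mut settings: HashMap<&str, u64> = HashMap::new();
    settings.insert("timeout_ms", 30_000);

    // Invariant: the key was inserted above and nothing removes it, so a
    // missing key would be a bug in this function, not something the
    // caller could handle.
    *settings
        .get("timeout_ms")
        .expect("timeout_ms was inserted above")
}
```

The return type stays a plain `u64`, so callers have no impossible error branch to think about.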
And, if you do have a situation where panicking is unacceptable (you must be using `#![no_std]`, right??), then don't make a bunch of different error types for all of the possible programmer bugs. Just make a single umbrella `FatalError` type and use that.
For further reading, I really like this piece from the book Real World OCaml, which also has a Result type and exceptions: https://dev.realworldocaml.org/error-handling.html. Specifically, the very last section at the bottom of the page, titled: "Choosing an Error-Handling Strategy". (The old version of that page used to be more plain HTML and the sections had anchors so I could link directly to that section...)
And for further reading about error handling strategy in a no-panic context, I really like the approach described here: https://sled.rs/errors
The problem is that "unrecoverable constraint violations" happen a lot in practice when you're dealing with filesystems, networking...anything that isn't pure computation.
Suppose I have a function that calls other functions that themselves make 3 database queries, two HTTP requests, and reads/writes from a cache directory. It considers all of them (except perhaps the caching) unrecoverable in the context of that function. What should it do?
I see three reasonable options:
(1). return a simple error type saying "Networking failure", "IO Error", etc if any of those fail
(2). return a complex error type that exposes the internal details of all the different things it's doing and which one failed and why
(3). panic if any of them fail
I would argue that (1) is unfit for purpose as you have no idea what's actually going wrong.
And (3) is currently very heavily discouraged, though I think if I'm understanding your argument right it probably makes the most sense. However it leaves your top-level function in the awkward position of needing to make that panic part of its API contract, without the type system to help. It's also highly limiting because the caller now can't distinguish between programmer errors and possibly-transient environmental conditions like a service outage.
(2) is what I'd expect to see in practice right now, and that's what leads to these automatic stack traces, etc. But none of these feel like good options. Ideally I'd want something that is:
- Debuggable (like (2) and (3))
- Part of the type system (like (1) and (2))
- Still allows introspection by the caller (like (1) and (2))
- Doesn't require a ton of boilerplate at each level (like (3), and possibly (1))
(edited for formatting)
I agree in theory, but I think they're very poorly implemented, and the syntax and tooling around handling them is terrible. And, frankly, those flaws (yes, I agree everything has flaws) make the overall feature mostly useless, unfortunately. It really doesn't matter where you think all the hate comes from; the hate is there, and it means that very few people use checked exceptions, except for where they're required to when stdlib methods throw them. Ultimately that's all that matters. If no one uses the feature, then it's not a useful feature, regardless of the reasons.
> The issue is that Rust's Result (and Java's checked exceptions) require a different paradigm. A Result is in the type signature because it's part of your domain's API design.
Correct, but in Java, checked exceptions are also a part of the API and ABI, so there's really little difference there, outside of ergonomics. (Which IMO are one of the most important parts!)
> (This suggestion is sarcastic to prove a point. It should, hopefully, sound ridiculous to add stack traces to every return value from every function.)
I don't think that proves a point. Sure, you can argue every proposal into absurdity; it doesn't make the suggestion itself bad.
> Rust has unchecked exceptions; they're called panics and they are 100% *okay to use* in the vast majority of applications that the vast majority of day-job programmers work on.
Yes, and this really bothers me. I wish more people would annotate their functions with `#[no_panic]`. Actually, I wish that was the default, and if you want to write a function that panics or calls functions that can panic, you need to annotate the function with `#[can_panic]`, and the compiler should enforce that, and `rustdoc` should surface that in all documentation.
I don't think I disagree with the ends you're proposing (don't add stack traces to every value, don't add stack traces specifically to Result::Err(E) variants); however, this is a bad way to justify it. Tools like dtrace / bpftrace do exactly this kind of stack tracing for both success and error cases across entire systems. This is a good thing™, and is actually very useful for both debugging, performance profiling, and understanding what your code is really doing on the hardware.
So I guess I disagree with how you're framing it. I would argue that adding stack traces to every value in Rust would be bad because it is a lot of overhead for something your kernel can and will do better.
> The issue is that Rust's Result (and Java's checked exceptions) require a different paradigm. A Result is in the type signature because it's part of your domain's API design. It's just values. It's not *for* debugging. You use a debugger for that or programmatically panic when something is truly unexpected and get the stack trace from that.
This really is the gist of it. However, I will say that in my experience the reason that Result types are nice (over e.g. exceptions) is that putting the error cases in the type contract means that you can have the compiler check when someone hasn't handled an error case (? and unwrap are "handling" it even if they may not always be appropriate), as well as statically verify which variants may be unused. One very frustrating thing I've had to encounter in C++ is finding a whole list of different errors that have been duplicated as multiple different opaque (e.g. behind a unique_ptr<std::exception> or some such) exceptions across the codebase.
Being able to know what variants of error can come out of an API is great! It just happens that working with a rich type system like Rust makes it possible to do all manner of things that languages-with-only-exceptions cannot.
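A small sketch of what "the compiler checks it" means in practice. `FetchError` and `should_retry` are hypothetical names for illustration.

```rust
/// Hypothetical error enum for illustration.
#[derive(Debug, PartialEq)]
pub enum FetchError {
    Timeout,
    NotFound,
    Unauthorized,
}

pub fn should_retry(err: &FetchError) -> bool {
    // No wildcard arm on purpose: if a new variant is added to FetchError,
    // this match stops compiling until someone decides how to handle it.
    match err {
        FetchError::Timeout => true,
        FetchError::NotFound => false,
        FetchError::Unauthorized => false,
    }
}
```

Adding a fourth variant to the enum makes every such match fail to compile until it is updated, which gives you a list of all the places that need a decision.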
For example, in my case: yes, my code can fail and have only one error type, e.g. AppError,
but I can supplement that with a DB error, cache error, serde error, etc.
You'll certainly need it if you want to have human readable source code locations, but doesn't it work with addresses only? Can't you split off the debug symbols and then use `addr2line` to resolve source code locations when you get error messages from end users running release builds?
Additionally, Rust has absurdly overly precise debug info.
Even set to minimum detail, it's still huge, and still keeps all of the layers of those "zero-cost" abstractions that were removed from the executable, so every `for` loop and every arithmetic operation has layers upon layers of debug junk.
External debug info is also more fragile. It's chronically broken on macOS (Rust doesn't test it with Apple's tools). On Linux, it often needs to use GNU debuginfo and be placed in system-wide directories to work reliably.
Typically the memory map is only required when capturing the backtrace; when the stack frames are output, their addresses are given/stored/printed relative to the binary file sections (with the load-time address subtracted), e.g. SysRq+l on Linux. This happens at runtime, so saving the memory map in addition to the relative addresses is not necessary.
Not sure if this is viable on all the platforms that Rust supports.
> but for some reason Rust's standard library wants to resolve human-readable paths at runtime.
Ah, I see that Rust's `std::backtrace::Backtrace` is missing any API to extract address information and it does not print the addresses either. Even with the `backtrace_frames` feature you only get a list of frames but no useful info can be extracted.
Hopefully this gets improved soon.
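For reference, this is roughly all the stable API offers as far as I know: capture, a status, and `Display`/`Debug` formatting. A minimal sketch; `capture_report` is a made-up helper name.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

/// Capture a backtrace and return its status and rendered text.
pub fn capture_report() -> (BacktraceStatus, String) {
    // `force_capture` ignores RUST_BACKTRACE / RUST_LIB_BACKTRACE, unlike
    // `Backtrace::capture`.
    let bt = Backtrace::force_capture();
    // The rendered text is the only stable way to look inside; there is
    // no stable accessor for individual frames or addresses.
    (bt.status(), bt.to_string())
}
```

Anything beyond parsing that rendered text currently needs a nightly feature or a separate crate.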
> External debug info is also more fragile.
I use external debug info all the time because uploading binaries with debug symbols to the (embedded) devices I run the code on is prohibitively expensive. It needs some extra steps in debugging but in general it seems to work reliably at least on the platforms I work with. The debugger client runs on my local computer with the debug symbols on disk and the code runs under a remote debugger on the device.
I'm sure there are flaky platforms that are not as reliable.
That's solvable though. The bigger problem is how you unwind the stack. The stack is not generally unwindable, unless you're the compiler. Debug symbols include information from the compiler about the stack sizes and shapes to help backtrace with unwinding the stack. It's quite possible to include such symbols in the final binary without adding debug symbols, a lot of compilers just don't have a specification for that.
Source: I work on a profiler (Parca) that does stack unwinding. It works fine on Rust binaries with or without debug symbols.
The addresses you typically see in a backtrace error message (with debug syms disabled) are relative to the sections in the binary file, the runtime address it was loaded at has already been taken into account and subtracted. At least that's how you typically see a backtrace address in a typical native app on Linux.
> The bigger problem is how you unwind the stack.
Rust can unwind the stack on panic when built without debug symbols.
As someone who works extensively in C++/Java/Python, I want so much to love Rust, but unfortunately I haven’t found it to be productive after 6+ side projects.
But it's still somewhat young, lots of stuff is being built. So some of the lack of productivity probably just comes from not knowing the right stacks yet.
This is true only if you add the #[from] attribute to a variant. Implementing std::convert::From is completely optional. Personally I don't prefer it either, as it obscures the context. I only use it for "trivially" wrapped errors like eyre::Report.
enum CarWontMove {
    EngineTroubles(EngineTroubles),
    WheelsFellOff(WheelsFellOff),
}
Even then, there’s often some additional context you can affix at that higher level.
[0]: https://docs.rs/snafu/latest/snafu/derive.Snafu.html#disabli...
[1]: https://docs.rs/snafu/latest/snafu/derive.Snafu.html#delegat...
I’m glad to see SNAFU was useful to others!
* Can it be used as a build dependency (i.e. symbols from the snafu crate don't appear in the generated code)?
* I assume you have to use one of the macros (ensure! or location!) when constructing an error that contains a location?
You don't have to use the macros, no. When you define your error type, you can mark a field as `#[snafu(implicit)]` [2]. When the error is generated, that field will be implicitly generated via a trait method. The two types this is available for are backtraces and locations, but you could create your own implementations such as grabbing the current timestamp or an HTTP request ID.
[0]: https://doc.rust-lang.org/cargo/reference/specifying-depende...
[1]: There's one tiny leak I'm aware of, which is that your error type will implement the `snafu::ErrorCompat` trait, which is just a light polyfill for some features not present on the standard library's `Error` trait. It's a slow-burn goal to remove this at some point, likely when the error "provider API" stabilizes.
[2]: https://docs.rs/snafu/latest/snafu/derive.Snafu.html#control...
use snafu::Snafu;
#[derive(Debug)] // needed so that `SomeError` below can derive `Debug`
struct SpanTraceWrapper(tracing_error::SpanTrace);
impl snafu::GenerateImplicitData for SpanTraceWrapper {
    fn generate() -> Self {
        // Capture the current tracing span context when the error is created.
        Self(tracing_error::SpanTrace::capture())
    }
}
And then you can use it as
#[derive(Debug, Snafu)]
struct SomeError {
    #[snafu(implicit)]
    span_trace: SpanTraceWrapper,
}
This will capture the `SpanTrace` whenever `SomeError` is constructed (e.g. `thing().context(SomeSnafu)` or `SomeSnafu.fail()`).
You can absolutely have two different enum variants from the same source type. It would look something like:
use thiserror::Error;
#[derive(Debug, Error)]
pub(crate) enum MyErrorType {
    // Extra format arguments refer to fields with a leading dot.
    #[error("failed to create staging directory at {}", .path.display())]
    CreateStagingDirectory {
        source: std::io::Error,
        path: std::path::PathBuf,
    },
    #[error("failed to copy files to staging directory")]
    CopyFiles {
        source: std::io::Error,
    },
}
This does mean that you need to manually specify which error variant you are returning rather than just using `?`:
create_dir(path).map_err(|err| MyErrorType::CreateStagingDirectory {
    source: err,
    path: path.clone(),
})?;
but I would argue that that is the entire point of defining a specific error type. If you don't care about the context and only that an io::Error occurred, then just return that directly or use a type-erased error.
create_dir(path).context(CreateStagingDirectorySnafu { path })?;
Note a few points:
1. No need to use the closure
2. No need to carry the source error over yourself (`context` does this for you)
3. No need to explicitly call `clone` on the path (`context` does this for you)
Surely that's comparing full debuginfo, right? Backtraces just need symbols, not full debuginfo, and there's no way the symbols are 4x the size of the binary.
async fn handle_request(req: Request) -> Result<Output> {
    let msg = decode_msg(&req.msg).context(DecodeMessage)?; // propagate error with new stack and context
    verify_msg(&msg)?; // pass error to the caller directly
    Ok(process_msg(msg).await?) // pass error to the caller directly
}
// Not `async`: the body awaits nothing and the caller above uses the result directly.
fn decode_msg(msg: &RawMessage) -> Result<Message> {
    serde_json::from_slice(&msg).context(SerdeJson) // propagate error with new stack and context
}
How do you capture the virtual stack when `verify_msg` returns an error? Do you have some lint to make sure every error is attached with a context?
The one time I can think of where this won't work is when you are reusing error types across places. Recently, I've been experimenting with creating a lot of error types, going so far as one unique error type per function. I haven't done this for long enough to have a real report, but I haven't hated it so far.
#[macro_export]
macro_rules! h {
    () => {
        concat!("at ", file!(), " line ", line!(), " column ", column!())
    };
}
and then using anyhow::Result.
Solves 99% of problems in error handling.
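A guess at how such a macro might be used, written against the standard library only so that it stands alone. `Box<dyn Error>` stands in for `anyhow::Result` here, and `read_config` is a made-up function; the original comment doesn't show its call sites.

```rust
use std::error::Error;
use std::fs;

// Same idea as the macro above: a compile-time string naming the call site.
macro_rules! h {
    () => {
        concat!("at ", file!(), " line ", line!(), " column ", column!())
    };
}

pub fn read_config(path: &str) -> Result<String, Box<dyn Error>> {
    // Append the location of this line to whatever the OS reported.
    fs::read_to_string(path)
        .map_err(|e| Box::<dyn Error>::from(format!("{e} {}", h!())))
}
```

The location is baked in at compile time, so it costs nothing at runtime and needs no debug info. It only records the line where the macro was written, though, not the callers above it.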