> Its `SQLITE_INTERNAL` error code is atypical in my experience. In my experienc...

burntsushi · 2025-03-08T04:56:24 1741409784

I'd have to look more closely at those examples, but I find it hard to believe that every runtime invariant violation manifests as one of those error codes. It certainly isn't true for PCRE2.

> Rust brings safety by forcing (safe) code to use dynamic checks when a safety property cannot be statically guaranteed, which addresses (1). But there's still a degree of freedom for whether failures are reported as panics or as recoverable errors to the caller.

Sure, you can propagate an error. I just don't really see a compelling reason to do so. Like, maybe there are niche scenarios where maybe it's worthwhile, but I do not see how it would be compelling to suggest it as general practice.

You might point to C libraries doing the same, but I'd have to investigate what exactly those error codes are actually being used for and _why_ the C library maintainers added them. And the trade-offs in C land are totally different than in Rust. Those error codes might not exist if they had a panicking mechanism available to them.

> I wrote down some of my thinking in this recent blog entry, which actually quotes your excellent summary of when panics are appropriate: https://blog.reverberate.org/2025/02/03/no-panic-rust.html

Yes, I've read that. It's a nice blog, but I don't think it's broadly applicable. Like, I don't see why I would write no-panic-Rust outside of extremely niche scenarios. My blog on unwraps is meant to be more broadly applicable: https://burntsushi.net/unwrap/ (It even covers this case of trying to turn runtime invariant violations into error codes.)

hyc_symas · 2025-03-14T13:12:44 1741957964

> LMDB has MDB_PANIC, documented as "Update of meta page failed or environment had fatal error".

Yes. That doesn't mean there was anything bad in the program logic. It most likely means your storage device had a fatal I/O error. It means there's something physically wrong with your system. Not that there was any bug in any code.

burntsushi · 2025-03-08T13:31:50 1741440710

Now that I've slept, I decided to take a look at LMDB. It uses MDB_PANIC in exactly two places:

https://github.com/LMDB/lmdb/blob/f20e41de09d97e4461946b7e26...

I would say this overall does not even come close to qualifying as an example of a library that "returns errors for invariant violations instead of committing UB."

You don't have to look far to see something that would normally be a panicking branch in Rust be a UB branch in C: https://github.com/LMDB/lmdb/blob/f20e41de09d97e4461946b7e26...

    if (err >= MDB_KEYEXIST && err <= MDB_LAST_ERRCODE) {
      i = err - MDB_KEYEXIST;
      return mdb_errstr[i];
    }

That `mdb_errstr[i]` will have UB if `i` is out of bounds. And `i` could be out of bounds if this code gets out of sync with the defined error constants and `mdb_errstr`. Moreover, it seems quite unlikely that this particular part of the code benefits perf-wise from omitting bounds checks. In other words, if this were Rust code and someone used `unsafe` to opt out of bounds checks here (assuming they weren't already elided automatically), that would be a gross error in judgment IMO.
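For comparison, here is a minimal Rust sketch of the same lookup. The constants, table contents, and function names (`KEYEXIST`, `LAST_ERRCODE`, `ERRSTR`, `strerror_indexed`, `strerror_checked`) are hypothetical and are not LMDB's actual definitions. Plain indexing panics if the table and the range check ever drift apart, while `get` turns that case into a value the caller can inspect.

```rust
// Hypothetical error-code range and message table (not LMDB's real values).
const KEYEXIST: i32 = -100;
const LAST_ERRCODE: i32 = -98;
static ERRSTR: [&str; 3] = ["key exists", "not found", "page not found"];

/// Indexing version: if ERRSTR were shorter than the range implies,
/// `ERRSTR[i]` would panic instead of reading out of bounds.
pub fn strerror_indexed(err: i32) -> &'static str {
    if (KEYEXIST..=LAST_ERRCODE).contains(&err) {
        let i = (err - KEYEXIST) as usize;
        return ERRSTR[i];
    }
    "unknown error"
}

/// Checked version: an index outside the table becomes `None`.
pub fn strerror_checked(err: i32) -> Option<&'static str> {
    // Widen to i64 so the subtraction cannot overflow.
    let offset = i64::from(err) - i64::from(KEYEXIST);
    if offset < 0 {
        return None;
    }
    ERRSTR.get(offset as usize).copied()
}
```

Neither version needs `unsafe`; the only difference is how an out-of-sync table would surface.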

The kind of examples I'm asking for would be C libraries that catch these sorts of runtime invariants and propagate them up as errors.

Instead, at least for LMDB, MDB_PANIC isn't really used for this purpose.

Now looking at zlib, from what I can tell, Z_STREAM_ERROR is used to validate input arguments. It's not actually being used to detect runtime invariants. zlib is just like most any other C library as far as I can tell. There are UB branches everywhere. I'm sure some of those are important for perf, but I've spent 10 years working on optimizing low level libraries in Rust, and I can say for certain that the vast majority of them are not.

libavcodec is more of the same. There are a ton of runtime invariants everywhere that are just UB if they are broken. Again, this is not an example of a library eagerly checking for invariant violations and percolating up errors. From what I can see, AVERROR_BUG is used at various boundaries to detect some kinds of inconsistencies in the data.

IMO, your examples are a total misrepresentation of how C libraries typically work. From my review, my prior was totally confirmed: C libraries will happily do UB when runtime invariants are broken, whereas Rust code tends to panic. Rust code can opt into "UB when runtime invariants are broken," but that is far, far more limited.

And this further demonstrates why "unsafe by default" is so bad.

haberman · 2025-03-08T14:33:11 1741444391

I think this is moving the goalposts.

My claim was not "these C libraries perfectly avoid UB by dynamically checking every invariant that could lead to UB if broken." Clearly they do not, as you have demonstrated. (Neither does unsafe Rust).

My claim was that in cases where a (low-level, high quality) C library does check an invariant in a release build, it will generally report failure of that invariant as an explicit error code rather than by crashing the process.

To falsify that, you would need to find places where these libraries call abort() or exit() in response to an internal inconsistency, in a release build. I think you are unlikely to find examples of that in these libraries. (After a bit of searching, I see that libavcodec has a few abort()s, but uses AVERROR_BUG an order of magnitude more often).

I agree with you that Rust's "safe by default" is important. I am advocating that Rust can be a powerful tool to provide C-like behavior (no crash on inconsistency) with greater safety (checking all relevant inconsistencies by default). In cases where C-like behavior is desired, that's a really appealing proposition.

Upthread it seemed like you were objecting to the idea of ever reporting internal inconsistencies as recoverable errors. You argued that creating and documenting error codes for this is not common or practical:

> I'd love to see the API docs for them. "This error value is impossible and this library will never return it. If it does, then there is a bug in the library. Since there are no known bugs related to this invariant violation, this cannot happen."

That is exactly what SQLITE_INTERNAL and AVERROR_BUG are.

burntsushi · 2025-03-08T15:03:54 1741446234

> My claim was that in cases where a (low-level, high quality) C library does check an invariant in a release build, it will generally report failure of that invariant as an explicit error code rather than by crashing the process.

That just seems very uninteresting though? And it kinda misses the whole point of where this conversation started. It's true that Rust code is going to check more things because of `unwrap()`, but that's a good thing! Because the alternative is clearly what C libraries practice: they'll just have UB. So you give up the possibility of an RCE for the possibility of a DoS. Sounds like a good trade to me.

>> I'd love to see the API docs for them. "This error value is impossible and this library will never return it. If it does, then there is a bug in the library. Since there are no known bugs related to this invariant violation, this cannot happen."
>
> That is exactly what SQLITE_INTERNAL and AVERROR_BUG are.

I meant that it should reflect the philosophy of handling broken runtime invariants generally in the library. Just because there's one error code for some restricted subset of cases doesn't mean that's how they deal with broken runtime invariants. In all of your examples so far, the vast majority of broken runtime invariants, from what I can see, lead to UB, not error codes.

This is what I meant because this is what makes Rust and its panicking materially different from C. And it's relevant especially in contexts where people say, "well just return an error instead of panicking." But C libraries generally don't do that either! They don't even bother checking most runtime invariants anyway, even when it doesn't matter for perf.

This is a big knot to untangle and I'm sure my wording could have been more precise. This is why I wanted to focus on examples, because we can look at real world things. And from my perspective, the examples you've given do not embody the original advice that I was replying to:

> That behavior is up to the user. The library should only report the error.

Instead, while there is limited support for "this error is a bug," the C libraries you've linked overwhelmingly prefer UB. That's the relevant point of comparison. I'm not interested in trying to find C libraries that abort. I'm interested in a holistic comparison of actual practice and using that to contextualize the blanket suggestions given in this thread.

haberman · 2025-03-08T15:31:03 1741447863

> It's true that Rust code is going to check more things because of `unwrap()`, but that's a good thing! Because the alternative is clearly what C libraries practice: they'll just have UB.

I have been consistently advocating for a third alternative that I happen to like more than either of these.

My alternative is: write libraries in No-Panic Rust. That means we have all of the safety, but none of the crashes. It is consistent with the position articulated upthread:

> That behavior is up to the user. The library should only report the error.

No-Panic Rust means always using "?" instead of unwrap(). This doesn't give up any safety! It just reports errors in a different way. Unfortunately it does mean eschewing the standard library, which isn't generally programmed like this.
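A minimal sketch of what this style could look like, assuming a hypothetical `Reader` type and a hypothetical `DecodeError` enum with an `Internal` variant playing the role of SQLITE_INTERNAL or AVERROR_BUG. It only uses slice and integer methods that return `Option` (`get`, `split_first`, `checked_add`) rather than indexing or `unwrap()`.

```rust
/// Hypothetical error type: one variant for bad input, one reserved for library bugs.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    /// The caller's input ended early.
    Truncated,
    /// An internal invariant was violated; this indicates a bug in the library.
    Internal(&'static str),
}

/// Hypothetical byte reader. Invariant: `cursor <= data.len()`.
pub struct Reader<'a> {
    data: &'a [u8],
    cursor: usize,
}

impl<'a> Reader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Reader { data, cursor: 0 }
    }

    pub fn read_u8(&mut self) -> Result<u8, DecodeError> {
        // A panicking style would write `&self.data[self.cursor..]` here.
        let rest = self
            .data
            .get(self.cursor..)
            .ok_or(DecodeError::Internal("cursor past end of buffer"))?;
        let (&byte, _) = rest.split_first().ok_or(DecodeError::Truncated)?;
        self.cursor = self
            .cursor
            .checked_add(1)
            .ok_or(DecodeError::Internal("cursor overflow"))?;
        Ok(byte)
    }
}
```

Every caller of `read_u8` now sees a `Result`, including the `Internal` case that should never occur if the library is correct.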

I won't argue that every library should use this strategy. It is undoubtedly much more work. But in some cases, that extra work might be justified. Isn't it nice that this possibility exists?

burntsushi · 2025-03-08T16:19:34 1741450774

We're back to square one: show me some real Rust libraries in widespread use actually adhering to this philosophy. And then I want to see some applications built with this philosophy. Then we can look at what the actual user experience difference is when a bug occurs. In one case, you get a panic with a stack trace. In the other, you get an error value that the application does... what with? Prints it as an unactionable error to end users and aborts? If it continues on, does your library make any guarantees about the consistency of its internal state when a runtime invariant is broken?

Panicking branches are everywhere in Rust. And even in your blog, you needed to use `unsafe` to avoid some of them. So I don't really get why you claim it is safer.

Users of my libraries would 100% be super annoyed by this. Imagine if `Regex::find` returned a `Result` purely because a bug might happen.
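To make the API impact concrete, here is a small sketch using a hypothetical `Matcher` type (a plain substring search standing in for a regex engine, not the real `regex` crate API). The `InternalBug` type and the `try_` functions are assumptions for illustration only.

```rust
/// Hypothetical substring matcher standing in for a real regex engine.
pub struct Matcher {
    needle: String,
}

/// Hypothetical error meaning "the library hit an internal bug".
#[derive(Debug, PartialEq, Eq)]
pub struct InternalBug;

impl Matcher {
    pub fn new(needle: &str) -> Self {
        Matcher { needle: needle.to_string() }
    }

    /// Conventional signature: `None` means "no match"; a bug would panic.
    pub fn find(&self, haystack: &str) -> Option<usize> {
        haystack.find(self.needle.as_str())
    }

    /// No-panic signature: callers must also handle a bug report.
    pub fn try_find(&self, haystack: &str) -> Result<Option<usize>, InternalBug> {
        Ok(haystack.find(self.needle.as_str()))
    }
}

/// With `find`, callers compose with ordinary iterator adapters.
pub fn count_matching_lines(m: &Matcher, text: &str) -> usize {
    text.lines().filter(|line| m.find(line).is_some()).count()
}

/// With `try_find`, the error type spreads into every caller's signature.
pub fn try_count_matching_lines(m: &Matcher, text: &str) -> Result<usize, InternalBug> {
    let mut n = 0;
    for line in text.lines() {
        if m.try_find(line)?.is_some() {
            n += 1;
        }
    }
    Ok(n)
}
```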

> But in some cases, that extra work might be justified. Isn't it nice that this possibility exists?

What I said above:

> Sure, you can propagate an error. I just don't really see a compelling reason to do so. Like, maybe there are niche scenarios where maybe it's worthwhile, but I do not see how it would be compelling to suggest it as general practice.

Your blog is an interesting technical exercise, but you spend comparatively little time on whether doing it is actually worth the trouble. And there is effectively no space at all devoted to how this impacts library API design. To be fair, you do acknowledge this:

> I should be clear that I have not yet attempted this technique at scale, so I cannot report on how well it works in practice. For now it is an exciting future direction for upb, and one that I hope will pay off.

From your blog, you list 3 reasons to do this: binary size, unrecoverability and runtime overhead.

I find that binary size is the only legitimate reason here, and for saving 300 KB, I would absolutely call that very niche. And especially so given that you can make panics abort to remove the code size overhead.

I find unrecoverability unconvincing because we are talking about bugs here. Panics are just one very convenient manifestation of a bug. But lots of bugs are silent and just make the output incorrect in some way. I just don't see a problem at all with bugs, generally, causing an abort with a useful error message.

I find runtime overhead very unconvincing because you can opt out of them on a case-by-case basis when perf demands it.

We can go around the maypole all day on this. But I want to see real examples following your philosophy. Because then I can poke and prod at it and point to what I think you're missing. Is the `upb` port publicly available?

whytevuhuni · 2025-03-08T16:32:43 1741451563

I'd like to add another point:

Both panics, and error-values for invariants, add a lot of branches in execution, for every invariant that is checked, and every indirect caller of functions that do it.

This means basically all function calls introduce new control flow at the call site, because they may either panic, or return an error value that the programmer will almost always immediately bubble up.

Such a large amount of new control flow is going to be impossible to reason about.

But!

Panics, and specifically catching them, as they are implemented in Rust, require that the wrapped code is UnwindSafe [1]. This is a trait that is automatically implemented for objects that remain in a good state despite panics. This automatically makes sure that if something unexpected does happen, whatever state was being modified, either remains in a mostly safe shape, or becomes unreadable and needs to be nuked and rebuilt.

This is massively useful for things like webservers, because simply catching panics (or exhaustive error values) is not enough to recover from them. You need to be able to ensure that no state has been left permanently damaged by the panic, and Rust's implementation of catch_unwind requiring things to be UnwindSafe is a lot better than normal error values.

[1]: https://doc.rust-lang.org/stable/std/panic/trait.UnwindSafe....
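A minimal sketch of the `catch_unwind` pattern described above, assuming the default unwinding panic strategy (not `panic = "abort"`). The `handle`, `serve`, and `serve_with_log` functions are hypothetical.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical request handler with a bug: it panics on empty input.
fn handle(input: &[u8]) -> u8 {
    input[0]
}

/// The closure captures only a shared `&[u8]`, which is `UnwindSafe`,
/// so `catch_unwind` accepts it as-is.
pub fn serve(input: &[u8]) -> Result<u8, &'static str> {
    panic::catch_unwind(|| handle(input)).map_err(|_| "internal error")
}

/// A closure capturing `&mut` state is not `UnwindSafe`. The compiler makes
/// us acknowledge that with `AssertUnwindSafe`, and we then discard the
/// possibly half-updated state ourselves.
pub fn serve_with_log(log: &mut Vec<u8>, input: &[u8]) -> Result<(), &'static str> {
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        log.push(0xff); // marker written before the risky call
        let b = handle(input);
        log.push(b);
    }));
    if outcome.is_err() {
        log.clear(); // state may be inconsistent: nuke and rebuild
        return Err("internal error");
    }
    Ok(())
}
```

The default panic hook still prints the panic message to stderr before `catch_unwind` returns the `Err`.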

haberman · 2025-03-09T20:54:32 1741553672

I do not claim that No-Panic Rust is popular (or even used at all) in Rust libraries currently. If it was popular, I would not have had to think so hard about it and write a blog entry. I claim that this technique is widespread in C libraries, and I believe I have demonstrated that.

Our conversation was sidetracked because you claimed that panic and unwrap() were essential parts of how Rust provides safety, and that the C precedent doesn't apply because C's approach is unsafe. But I claim that No-Panic Rust is potentially a solution that gives you the best of both worlds: comparable safety without risk of a (detected) bug crashing the entire process. So I do think that the C precedent applies.

I grant that there are applications where panics are a perfectly reasonable way of handling internal errors. Your ripgrep is a perfect example: it's a short-lived process that only does one thing, and users are running it from a terminal (and are probably tech savvy) so they can easily copy and paste the crash into a bug report.

But there are lots of other applications that are not like this. Consider the Linux kernel, where a panic takes down your entire computer. Or consider a mobile (iOS or Android) application where there is no terminal to dump to, and the user experience of a crash is that the app closes unexpectedly and without explanation. Or consider a web browser where it would be very annoying for an entire tab or browser to crash just because one operation (like using the search box) ran into an internal error.

In most of these cases, you want to let the program continue if reasonably possible after an error is encountered, while also logging the error for later inspection/diagnosis and possibly telemetry. Probably you will be abandoning any internal state associated with the failing operation.

It's true that my blog uses unsafe in two cases to get rid of panics. The first is to call libc::printf(), but this is only required because the Rust stdlib does not offer any No-Panic API for printing to stdout. This is really just a symptom of the fact that No-Panic programming has little precedent in Rust. If there was a No-Panic variant of the standard library, it could offer a safe API for printing to stdout.

The second case is an optimization, where we are trying to remove a bounds check for performance reasons. This is an example of "opting out on a case-by-case basis", except what I am proposing is more principled and arguably safer than merely switching to get_unchecked(). I am asserting the underlying invariant of the data structure, and then letting the optimizer infer that the invariant implies that the bounds check is not necessary. I think this is pretty interesting, and very cool that the compiler is able to do this.
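A rough sketch of the difference between the two opt-outs, using a hypothetical `Cursor` type rather than the code from the blog. `current` states the invariant and leaves the indexing expression in safe form, which typically lets the optimizer drop the bounds check; `current_unchecked` skips the check directly. Both rely on `unsafe` and are only sound if the invariant really holds.

```rust
/// Hypothetical cursor over a non-empty buffer.
/// Invariant: `idx < data.len()`.
pub struct Cursor {
    data: Vec<u32>,
    idx: usize,
}

impl Cursor {
    /// Refuses empty buffers so the invariant holds from the start.
    pub fn new(data: Vec<u32>) -> Option<Self> {
        if data.is_empty() {
            None
        } else {
            Some(Cursor { data, idx: 0 })
        }
    }

    /// Wraps around, so `idx < data.len()` is preserved.
    pub fn advance(&mut self) {
        self.idx = (self.idx + 1) % self.data.len();
    }

    /// States the invariant, then indexes normally.
    pub fn current(&self) -> u32 {
        if self.idx >= self.data.len() {
            // SAFETY: `new` and `advance` keep `idx < data.len()`.
            unsafe { std::hint::unreachable_unchecked() }
        }
        self.data[self.idx]
    }

    /// Skips the bounds check directly.
    pub fn current_unchecked(&self) -> u32 {
        // SAFETY: same invariant as above.
        unsafe { *self.data.get_unchecked(self.idx) }
    }
}
```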

So overall I do argue that No-Panic Rust offers comparable safety to panics and unwrap().

The Rust port of upb is on the back burner currently, and nothing is open-sourced yet.

burntsushi · 2025-03-09T21:25:52 1741555552

> I do not claim that No-Panic Rust is popular (or even used at all) in Rust libraries currently.

I didn't say you did! Goodness this conversation is super frustrating. I'm not trying to get you to legitimize your opinions by pointing to popularity, but I just want to see some examples of your philosophy actually working in real world scenarios.

> I claim that this technique is widespread in C libraries, and I believe I have demonstrated that.

I have yet to see any such evidence. The C libraries you've shown me have a litany of UB branches where Rust would have panicking branches. None of the C libraries you've linked are coded in the style demonstrated in your blog. If they were, there would be a whole lot more invariant checking (like bounds checks) leading to error codes for those invariant violations.

Instead, the C code primarily just lets UB take over for internal invariant violations. Which may indeed wind up in an abort. Or someone stealing your credit card numbers. ¯\_(ツ)_/¯ That's not at all the style you advocate for in your blog.

The C libraries you link do have some error codes for something resembling internal invariant violations, but from my review, this is not practiced generally and is far more limited than the style you advocate for in your blog.

> Your ripgrep is a perfect example

I specifically did not cite ripgrep as an example. I cited my libraries. I might be best known for my work on ripgrep, but the vast majority of Rust work I've done over the last decade is in libraries. And those are used in all sorts of places.

Moreover, it isn't just my libraries that use this philosophy. It's pretty much all of them, including std.

> Consider the Linux kernel, where a panic takes down your entire computer.

The Linux kernel is one of the few places where I've seen someone argue compellingly for "prefer UB on invariant violations generally, and not panicking." I don't agree with them, but I don't have any practical experience in that specific domain to refute them. Indeed, I view the practice quite skeptically, given that I'd greatly prefer my computer to shut down than to, say, corrupt my data on disk.

> Or consider a mobile (iOS or Android) application where there is no terminal to dump to, and the user experience of a crash is that the app closes unexpectedly and without explanation. Or consider a web browser where it would be very annoying for an entire tab or browser to crash just because one operation (like using the search box) ran into an internal error.
>
> In most of these cases, you want to let the program continue if reasonably possible after an error is encountered, while also logging the error for later inspection/diagnosis and possibly telemetry. Probably you will be abandoning any internal state associated with the failing operation.

Your suggestion here amounts to asking Rust libraries to guarantee reasonable and consistent behavior when internal runtime invariants have been broken. That's the only way, "return an error for a broken invariant and otherwise continue on" actually works. I don't see how that's tractable and this is why I ask for examples.

There is nothing you can say that's going to convince me. I have to be shown. Because the fundamental component of my skepticism is seeing the practice in the real world and the kinds of effects it has that are not captured by either your analysis or mine. Indeed, my years of experience building fundamental ecosystem libraries in Rust tells me that your approach does not scale. At all.

Several comments ago, I carefully conceded that the style of Rust you advocate may be useful in niche scenarios. So my position is not, "your philosophy is never useful and it should never be used." My position is, "it is not a good idea generally, and it does not match the prevailing convention of C libraries."

I think this conversation has probably run its course. Sincerely, I would like to see examples of your practice more broadly. I want to see how it works and what the real and actual trade-offs are.