
> I don't know what one should even make from that statement.

It's just a fact: by definition of the Rust language, unsafe Rust is approximately as safe as C (technically Rust is still safer than C in its unsafe blocks, but we can ignore that).

> you didn't even bother to look at the code but still presented

Of course I did; what I saw were one-liner trait impls (the 'whole traits' from your own post) and sub-line expressions of unsafe access to bindings.





> technically Rust is still safer than C in its unsafe blocks

This is quite dubious in a practical sense, since Rust unsafe blocks must manually uphold the safety invariants that idiomatic Safe Rust relies on at all times, which includes, e.g. references pointing to valid and properly aligned data, as well as requirements on mutable references comparable to what the `restrict` qualifier (which is rarely used) involves in C. In practice, this is hard to do consistently, and may trigger unexpected UB.

Some of these safety invariants can be relaxed in simple ways (e.g. &Cell<T> being aliasable where &mut T isn't) but this isn't always idiomatic or free of boilerplate in Safe Rust.
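To make the aliasing point concrete, a minimal sketch (my own illustration, not from any report): duplicating a &mut through a raw pointer compiles inside unsafe {} but is immediate UB, while the &Cell<T> relaxation mentioned above stays entirely safe:

    use std::cell::Cell;

    // UB variant: two aliasing &mut to the same data, created via a raw pointer.
    // This compiles, but it violates the no-aliasing invariant safe Rust relies on.
    fn aliased_mut_is_ub() {
        let mut x = 0u32;
        let p = &mut x as *mut u32;
        // SAFETY: none -- shown only to illustrate the mistake.
        let (a, b) = unsafe { (&mut *p, &mut *p) };
        *a += 1;
        *b += 1; // undefined behavior
    }

    // Safe variant: &Cell<u32> may be freely aliased and mutated without unsafe.
    fn aliased_cell_is_fine() {
        let x = Cell::new(0u32);
        let (a, b) = (&x, &x);
        a.set(a.get() + 1);
        b.set(b.get() + 1);
        assert_eq!(x.get(), 2);
    }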


It's great that the Google Android team has been tracking data to answer that question for years now and their conclusion is:

-------

The primary security concern regarding Rust generally centers on the approximately 4% of code written within unsafe{} blocks. This subset of Rust has fueled significant speculation, misconceptions, and even theories that unsafe Rust might be more buggy than C. Empirical evidence shows this to be quite wrong.

Our data indicates that even a more conservative assumption, that a line of unsafe Rust is as likely to have a bug as a line of C or C++, significantly overestimates the risk of unsafe Rust. We don’t know for sure why this is the case, but there are likely several contributing factors:

    unsafe{} doesn't actually disable all or even most of Rust’s safety checks (a common misconception).
    The practice of encapsulation enables local reasoning about safety invariants.
    The additional scrutiny that unsafe{} blocks receive.
-----

From https://security.googleblog.com/2025/11/rust-in-android-move...


> The practice of encapsulation enables local reasoning about safety invariants.

> The additional scrutiny that unsafe{} blocks receive.

None of this supports an argument that "unsafe Rust is safer than C". It's just saying that with enough scrutiny on those unsafe blocks, the potential bugs will be found and addressed as part of development. That's a rather different claim.


It does, if you read the report and run a little (implied) math.

The report says that their historical data gives them an estimate of 1000 Memory Safety issues per Million Lines of Code for C/C++.

The same team currently has 5 Million lines of Rust code, of which 4% are unsafe (200 000). Assuming that unsafe Rust is on par with C/C++, this gives us an expected value of about 200 memory safety issues in the unsafe code. They have one. Either they have 199 hidden and undetected memory safety issues, or the conclusion is that even unsafe Rust is orders of magnitude better than C/C++ when it comes to memory safety.
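Spelling the implied arithmetic out:

    C/C++ baseline:        ~1000 memory safety issues per 1,000,000 LoC
    Unsafe Rust:           5,000,000 LoC x 4% = 200,000 LoC
    Expected (if on par):  200,000 x 1000 / 1,000,000 = 200 issues
    Observed:              1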

I trust them to track these numbers diligently. This is a seasoned team building foundational low level software. We can safely assume that the Android team is better than the average C/C++ programmer (and likely also than the average Rust programmer), so the numbers should generalize fairly well.

Part of the benefit of Rust is indeed that it allows local reasoning about crucial parts of the code. This does allow for higher scrutiny, which will find more bugs, but that's a result of the language design. unsafe {} was designed with that in mind - this is not a random emergent property.


They say "With roughly 5 million lines of Rust in the Android platform and one potential memory safety vulnerability found (and fixed pre-release), our estimated vulnerability density for Rust is 0.2 vuln per 1 million lines (MLOC).".

Do you honestly believe that there is 1 vulnerability per 5 MLoC?


1 memory safety vulnerability, that's a pretty important distinction.

Yes, I believe that at least the order of magnitude is correct because 4 800 000 of those lines are guaranteed to not have any by virtue of the compiler enforcing memory safety.

So it's 1 per 200 000, which is 1-2 orders of magnitude worse, but still pretty darn good. Given that not all unsafe code actually has potential for memory safety issues and that the compiler still will enforce a pretty wide set of rules, I consider this to be achievable.

This is clearly a competent team that's writing important and challenging low-level software. They published the numbers voluntarily and are staking their reputation on these reports. From personal observation of the Rust projects we work on, the results track with the trend.

There's no reason for me to disbelieve the numbers put forward in the report.


1 per 5M or 1 per 200K is pretty much unbelievable, especially in such a complex codebase, so all I can say then is to each their own.

> especially in such a complex codebase

You accidentally put your finger on the key point, emphasis mine.

When you have a memory-unsafe language, the complexity of the whole codebase impacts your ability to uphold memory-related invariants.

But unsafe blocks are, by definition, limited in scope, and assuming you design your codebase properly, they shouldn't interact with other unsafe blocks in a different module. So the complexity related to one unsafe block is in fact contained to its own module, and doesn't spread outside. And that makes everything much more tractable, since you never have to reason about the whole codebase, but only about a limited scope every time.


No, this is just an example of confirmation bias. You're given a totally unrealistic figure of 1 vuln per 200K/5M LoC and now you're hypothesizing why that could be so. Google, for anyone unbiased, lost credibility when they put this figure into the report. I wonder what was their incentive for doing so.

> But unsafe blocks are, by definition, limited in scope, and assuming you design your codebase properly, they shouldn't interact with other unsafe blocks in a different module. So the complexity related to one unsafe block is in fact contained to its own module, and doesn't spread outside. And that makes everything much more tractable, since you never have to reason about the whole codebase, but only about a limited scope every time.

Anyone who has written low-level code with substantial complexity knows that this is just wishful thinking. In such code, abstractions fall apart, and "So the complexity related to one unsafe block is in fact contained to its own module, and doesn't spread outside" is just wrong, as I explained in my other comment here - UB taking place in an unsafe section will leak into the rest of the "safe" code - UB is not "caught" or put into quarantine by some imaginary safety net at the boundary between the safe and unsafe sections.


Let's take a simple example to illustrate how unsafe {} cuts down the review effort for many operations. Take a mutable global static (a global counter, for example). Reading an ordinary immutable static is safe; any access to a static mut, including incrementing the counter, requires an unsafe {} block.

If you need to check which places mutate this global static you only need to check the unsafe parts of the code - you know that no other part of your code could mutate it, the compiler won't let you. If you have a bug that is related to mutating this static, then it might manifest anywhere in your code. But you know for certain that the root cause must be in one of your unsafe blocks - even if you don't know which one.
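A minimal sketch of that pattern (names are mine, single-threaded for simplicity; in real code you'd probably reach for an atomic instead):

    // Every access to the counter is funneled through one small function,
    // so grepping for `unsafe` finds every site that can touch it.
    static mut COUNTER: u64 = 0;

    fn bump_counter() -> u64 {
        // SAFETY: assumed single-threaded; no references to COUNTER escape.
        unsafe {
            COUNTER += 1;
            COUNTER
        }
    }

Safe code elsewhere can call bump_counter(), but it cannot touch COUNTER directly without its own unsafe {} block, which would stand out in review.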

Good programming practice will cut down that effort even more by dictating that unsafe access should be grouped in modules. For example, when binding to a C module (unsafe) you'd usually generate an unsafe wrapper with bindgen and then write a safe wrapper on top of that. Any access that tries to go around the safe wrapper would be frowned upon and likely fail review.

And again, the compiler will help you there: any access that tries to bypass the safe API would need to be unsafe {} again and automatically receive extra scrutiny in a review, making it less likely to slip through.
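A hedged sketch of that layering (the ffi module stands in for bindgen output; the function name and its bound are made up):

    mod ffi {
        // In a real project this block would be generated by bindgen.
        extern "C" {
            pub fn lib_read_slot(index: u32) -> i32;
        }
    }

    /// The safe API everyone else calls; the safety argument lives in one place.
    pub fn read_slot(index: u32) -> Option<i32> {
        if index >= 16 {
            return None; // enforce the (assumed) documented bound of the C API
        }
        // SAFETY: index is within the range the C library documents as valid.
        Some(unsafe { ffi::lib_read_slot(index) })
    }

Calling ffi::lib_read_slot directly would require another unsafe {}, which is exactly what draws attention in review.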

Compare that to a C codebase where anything goes. A static might be mutated anywhere in your codebase, even through a pointer to it - meaning you can't even reliably grep for it. It may slip through review unnoticed because no attention is drawn to it and cause bugs that are hard to trace and reason about.

If you're writing embedded code, similar considerations apply - access to registers etc. requires unsafe {}. But because access is unsafe {}, it's usually gated behind a safe API that forms the boundary between the low-level code and the higher-level business logic. Unsurprisingly, these are critical parts of the code - hence they receive extra scrutiny, and in our project we allocate substantial review capacity to them. And the compiler will enforce that no safe code can circumvent the access layer.
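For the embedded case, the same pattern might look roughly like this (the register address is made up, and the usual MMIO assumptions apply):

    use core::ptr;

    // Hypothetical memory-mapped GPIO output register.
    const GPIO_OUT: *mut u32 = 0x4000_0000 as *mut u32;

    /// Safe access layer: the only place that touches the register directly.
    pub fn set_pins(mask: u32) {
        // SAFETY: GPIO_OUT is assumed to be a valid, device-owned MMIO address.
        unsafe { ptr::write_volatile(GPIO_OUT, mask) }
    }

Business logic above this layer stays entirely in safe Rust; the compiler prevents it from poking the hardware without going through set_pins().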

The number you're tagging as an unrealistic figure is the result of dedicated and careful design of language and compiler features to achieve exactly this outcome. It's not a random fluke; very clever people sat down and thought about how to achieve this.


You sound like my teacher in school, after my friend submitted an assignment so good my teacher thought he was cheating.

> You're given a totally unrealistic figure of 1 vuln per 200K/5M LoC and now you're hypothesizing why that could be so.

You are the one claiming it's unrealistic. And you gave zero argument why besides “the codebase is complex”, which I refuted. See the definition of complexity:

> The term is generally used to characterize something with many parts where those parts interact with each other in multiple ways, culminating in a higher order of emergence greater than the sum of its parts

Each unsafe block may be “difficult” in itself, but the resulting system isn't “complex” because you don't have this compounding effect.

> I wonder what was their incentive for doing so.

And obviously it must be malice…

> Anyone who has written low-level code with substantial complexity knows that this is just wishful thinking. In such code, abstractions fall apart, and "So the complexity related to one unsafe block is in fact contained to its own module, and doesn't spread outside" is just wrong, as I explained in my other comment here - UB taking place in an unsafe section will leak into the rest of the "safe" code - UB is not "caught" or put into quarantine by some imaginary safety net at the boundary between the safe and unsafe sections.

I think you don't understand the problem as well as you think you do. Of course if the UB happens then all bets are off! Its consequences won't be limited to a part of the code, by definition. And nobody said otherwise.

But for the UB to happen, there must be some violation of a memory invariant (the most common would be using a value after free, freeing twice, accessing the same memory from multiple threads without synchronization or, and this is specific to Rust, violating reference aliasing rules).

To avoid violating these invariants, the programmer must have a mental model of ownership over all the state to which these invariants apply. For C or C++, that means having a mental model of the whole codebase, because the invariants related to one piece of code can be violated from anywhere.

In Rust this is different: you're not going to have raw pointers to one piece of data used in multiple parts of the code (well, if you really want, nobody stops you, but I'm confident the Android team didn't). And as such, you'll have to think about the invariants only at the scale of one module. Building an accurate mental model of a 350-line module is much more tractable for a human than doing the same for an entire codebase, and it's not even close.


That's the other interesting observation you can draw from that report. The numbers contained in the first parts about review times, rollback rates, etc. are broken down by change size. And the gap widens for larger changes. This indicates that Rust's language features support reasoning about complex changesets.

It's not obvious to me which features are the relevant ones, but my general observation is that lifetimes, unsafe blocks, and the borrow checker allow people to reason about code in smaller chunks. For example, knowing that there's only one place where a variable may be mutated supports understanding that, at the same time, no other code location may change it.
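A trivial sketch (non-compiling by design) of the kind of guarantee meant here: while a reference to the data is live, the compiler rejects mutation from anywhere else:

    fn main() {
        let mut v = vec![1, 2, 3];
        let first = &v[0];   // shared borrow is live...
        v.push(4);           // ...so this mutation is rejected (error E0502)
        println!("{first}");
    }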


literally from the quote:

    unsafe{} doesn't actually disable all or even most of Rust’s safety checks (a common misconception).

They also say

  The practice of encapsulation enables local reasoning about safety invariants.
which is not fully correct. Undefined behavior in unsafe blocks can and will leak into the safe Rust code, so there is no real "local reasoning" or "encapsulation" or "safety invariants" there.

This whole blog has always read to me too much like marketing material disguised with some data so that it's not so obvious. IMHO


> which is not fully correct. Undefined behavior in unsafe blocks can and will leak into the safe Rust code, so there is no real "local reasoning" or "encapsulation" or "safety invariants" there.

Strictly speaking, the fact that encapsulation enables local reasoning about safety invariants does not necessarily imply that encapsulation guarantees local reasoning about safety invariants. It's always possible to write something inadvisable, and no language is capable of preventing that.

That being said, I think you might be missing the point to some extent. The idea behind the sentence is not to say that the consequences of a mistake will not be felt elsewhere. The idea is that when reasoning about whether you're upholding invariants and/or investigating something that went wrong, the amount of code you need to look at is bounded such that you can ignore everything outside those bounds; i.e., you can look at some set of code in complete isolation. In the most conservative/general case that boundary would be the module boundary, but it's not uncommon to be able to shrink those boundaries to the function body, or potentially even further.

This general concept here isn't really new. Rust just applied it in a relatively new context.


Yes, but my point is when things blow up how exactly do you know which unsafe block you should look into? From their statement it appears as if there's such a simple correlation between "here's your segfault" and "here's your unsafe block that caused it", which I believe there isn't, and which is why I said there's no encapsulation, local reasoning, etc.

But you know that it's one of them. That cuts 96% of their codebase out, even assuming you have no idea which one it could be.

> Yes, but my point is when things blow up how exactly do you know which unsafe block you should look into?

In the most general case, you don't. But again, I think that that rather misses the point the statement was trying to get at.

Perhaps a more useful framing for you would be that in the most general case the encapsulation and local reasoning here is between modules that use unsafe and everything else. In some (many? most?) cases you can further bound how much code you need to look at if/when something goes wrong, since not all code in unsafe modules/functions/blocks depends on each other, but in any case the point is that you only need to inspect a subset of code when reasoning about safety invariants and/or debugging a crash.

> From their statement it appears as if there's such a simple correlation between "here's your segfault" and "here's your unsafe block that caused it",

I don't get that sense from the statement at all.


> in the most general case the encapsulation and local reasoning here is between modules that use unsafe and everything else

This would be the same narrative as in, let's say, C++. Wrap the difficult and low-level memory juggling stuff into "modules", harden the API, return the references and/or smart-pointers, and then just deal with the rest of the code with ease, right? Theoretically possible but practically impossible.

The first reason is that abstractions get really leaky, and they get especially leaky in code that demands the utmost performance. Anyone who has implemented their own domain- or workload-specific hash map or mutex or anything similarly foundational will understand this sentiment. Anyway, if we just have a look at the NVMe driver above, there are no "unsafe modules".

Second, as I already argued, UB in the module/library leaks into the rest of your code, so I fail to understand how dozens of unsafe sections make reasoning or debugging any simpler, when reasoning is not a function of the number of unsafe sections but of the interactions between different parts of the code that end up touching the memory in the unsafe block in a way that it was not anticipated. This is almost always the case when dealing with undefined behavior.

> I don't get that sense from the statement at all.

It is a bit of an exaggerated example on my part, but I do - their framing suggests ~exactly that, which is simply not true.


> This would be the same narrative as in, let's say, C++. Wrap the difficult and low-level memory juggling stuff into "modules", harden the API, return the references and/or smart-pointers, and then just deal with the rest of the code with ease, right? Theoretically possible but practically impossible.

The difference, of course, is in the amount of automated help/enforcement provided that makes it harder/impossible to misuse said API. Just like C++ provides new functionality compared to C that makes it hard-to-impossible to misuse APIs in certain ways (RAII, stronger type system, etc.), and how C does the same compared to assembly (providing structured control flow constructs, abstracting away low-level details like calling convention/register management, etc.), Rust provides new functionality that previous widespread languages didn't. It's those additional capabilities that make previously difficult things practical.

> so I fail to understand how dozens of unsafe sections make reasoning or debugging any simpler, when reasoning is not a function of the number of unsafe sections but of the interactions between different parts of the code that end up touching the memory in the unsafe block in a way that it was not anticipated.

...Because the way encapsulation works in practice is that only a subset of code can "touch[] the memory in the unsafe block in a way that it was not anticipated" in the first place? That's kind of the point of encapsulation!

(I say "in practice" because Rust doesn't and can't stop you from writing unsafe APIs, but that's going to be true of any language due to Rice's Theorem, the halting problem, etc.)

As a simple example, say you have a program with one singular unsafe block encapsulated in one single function which is intended to provide a safe API. If/when UB happens the effects can be felt anywhere, but you know the bug is within the encapsulation boundary - i.e., in the body of the function that wraps the unsafe block, even if the bug is not in the unsafe block itself (well, either that or a compiler bug but that's almost always not going to be the culprit). That certainly seems to me like it'd be easier to debug than having to reason about the entire codebase.
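A hedged sketch of that scenario (names are mine): the bug sits in safe code inside the wrapping function, not in the unsafe block itself, yet the wrapping function is still the only place you need to audit:

    /// Intended safe API: return the last element, or None if the slice is empty.
    pub fn last(v: &[u32]) -> Option<u32> {
        if v.is_empty() {
            return None;
        }
        let idx = v.len(); // BUG: should be v.len() - 1, and it's outside unsafe{}
        // SAFETY (claimed): idx is in bounds because v is non-empty. It isn't.
        Some(unsafe { *v.get_unchecked(idx) })
    }

Any safe caller can now trigger UB, but the fix is known to be somewhere inside this function body - the encapsulation boundary.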

This continues to scale up to multiple functions which provide either a combined or independent API to internal unsafe functionality, whole modules, etc. Sure, debugging might be more difficult than the single-function case due to the additional possibilities, but the fact remains that for most (all?) codebases the amount of code responsible for/causing UB will reside behind said boundary and is going to be a proper subset of all the code in the project.

And if you take this approach to the extreme, you end up with formal verification programs/theorem provers, which isolate all their "unsafe" code to a relatively small, contained trusted kernel. Even there, UB in the trusted kernel can affect all parts of compiled programs, but the point is that if/when something goes wrong you know the issue is going to be in the trusted kernel, even if you don't necessarily know precisely where.

> It is a bit of an exaggerated example on my part, but I do - their framing suggests ~exactly that, which is simply not true.

I do agree that the claim, under that particular interpretation of that statement, is wrong (and Rust has never offered such a correlation in the first place), but it's kind of hard to discuss beyond that if I don't interpret that sentence the same way you do :/


It actually does support it. Human attention is a finite resource. You can spend a little bit of attention on every line to scrutinize safety, or you can spend a lot of time scrutinizing the places where you can't mechanically guarantee safety.

It's safer because it spends the human attention resource more wisely.


So, an unsafe block every 70 LoC in a 1500 LoC toy example? Sure, that's a strong argument.

How is that worse than everything being unsafe?

I've seen this argument thrown around often here on HN ("$IMPROVEMENT is still not perfect! So let's keep the status quo.") and it baffles me.

C is not perfect and it still replaced ASM in 99% of its use cases.


No, I am not saying keep the status quo. I am simply challenging the idea that the kernel will enjoy the benefits that Rust is supposed to provide.

The distribution of bugs across the whole codebase doesn't follow a normal distribution but a multimodal one. Now, imagine where the highest concentration of bugs will be. And how many bugs there will be elsewhere. Easy to guess.


> Now, imagine where the highest concentration of bugs will be. And how many bugs there will be elsewhere.

You're doing it again!

Doesn't matter where the majority of bugs will be. If you avoid the minority it's still an improvement.

Also, Rust's safety guarantees are not about bugs in general. You seem to have a misunderstanding of what Rust is or what safe Rust provides.

(Also, I'd challenge the rest of your assumptions, but that's another story.)


What exactly am I doing again? I am providing my reasoning; sorry if that rubs you the wrong way. I guess you don't have to agree, but let me express my view, ok? My view is not as extremist or polarized as you think. I see the benefit of Rust, but I say the benefit is not what Internet cargo-cult programming suggests. There's always a price to be paid, and in the case of kernel development I think it outweighs the positive sides.

If I spend 90% of my time debugging freaking difficult-to-debug issues, and Rust solves the other 10% for me, then I don't see it as a good bargain. I need to learn a completely new language, surround myself with a team that is also willing to learn it, and all that under the assumption that it won't make some other aspects of development worse. And it surely will.



