I think you have a misunderstanding of how disk IO happens. The CPU core sends a...

dapperdrake · 2025-10-24T00:58:59 1761267539

(Not the person you are replying to.)

There is nothing in the sense of Python async or JS async that the OS thread or OS process in question could usefully do on the CPU until the memory is paged into physical RAM. DMA or no DMA.

The OS process scheduler can run another process or thread. But your program instance will have to wait. That’s the point. It doesn’t matter whether waiting is handled by a busy loop a.k.a. polling or by a second interrupt that wakes the OS thread up again.

That is why Linux calls it uninterruptible sleep.

EDIT: io_uring would of course change your thread from blocking syscalls to non-blocking syscalls. Page faults are not a syscall, as GP pointed out. They are, however, a context-switch to an OS interrupt handler. That is why you have an OS. It provides the software drivers for your CPU, MMU, and disks/storage. Here this is the interrupt handler for a page fault.

bcrl · 2025-10-24T01:11:48 1761268308

What everyone forgets is just how expensive context switches are on modern x86 CPUs. Those 512 bit vector registers fill up a lot of cache lines. That's why async tends to win over processes / threads for many workloads.

hyghjiyhu · 2025-10-24T12:26:36 1761308796

(I am the person you are replying to)

It could work like this. "Hey OS I would like to process these pages* are they good to go? If not could you fetch and lock them for me" and then if they are ready you process them knowing it won't fault, and if they are not you do something else and try again later.

It's a sort of hybrid of the mmap and fread paradigms in that there are both explicit read requests but the kernel can also get you data on its own initiative if there are spare resources for it.

* to amortize syscall overhead.

avianlyric · 2025-10-24T23:43:08 1761349388

What advantages does that provide over using more OS threads. Ultimately this model is based on the idea that we want our programming runtimes to become increasingly responsible for low level scheduling concerns that have traditionally been handled by the OS scheduler.

I can broadly understand why there may be a desire to go down that path. But I’m not convinced that it would produce meaningful better performance than the current abstractions. Especially if you take a step back as ask the question: is mmap is the right tool to be using in these situations, rather using other tools like io_uring?

To be clear I don’t know the answer to this question. But the complexity of the solutions being suggested to potentially improve the mmap API really make me question if they’re capable of producing meaningful improvements.

lmz · 2025-10-24T01:17:25 1761268645

It's hard to say on one hand "I use mmap because I don't want fancy APis for every read" and on the other "I want to do something useful on page fault" because you don't want to make every memory read a possible interruption point.

ori_b · 2025-10-24T00:59:29 1761267569

I think you have a misunderstanding of how the OS is signaled about disk I/O being necessary. Most of the post above was discussing that aspect of it, before the OS even sends the command to the disk.