It's going to take me a bit to generate several million files, but so far I've got a single directory with 550k files in it; it takes 30s to ls it on a very busy system running FreeBSD.
1.1M files -> 120 seconds
1.8M files -> 270 seconds (this could be related to system load being over 90 heh)
Which filesystem you use will also make a big difference here. You could imagine some filesystem that uses the getdirentries(2) binary format for dirents, and that could literally memcpy cached directory pages for a syscall. In FreeBSD, UFS gets somewhat close, but 'struct direct' differs from the ABI 'struct dirent'. And the FS attempts to validate the disk format, too.
FWIW, FreeBSD uses 4kB (x86 system page size) where glibc uses 32kB in this article[1]. To the extent libc is actually the problem (I'm not confident of that yet based on the article), this will be worse than glibc's larger buffer.
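The buffer size is observable from userspace, by the way: each getdirentries/getdents call returns at most one buffer's worth of entries, so counting syscalls over a big directory tells you how large libc's buffer is. A Linux sketch (on FreeBSD the analogues are truss and getdirentries; the directory path is just an example):

```shell
# Count the getdents64 calls ls makes on a directory. Fewer calls
# means a bigger libc buffer: glibc's 32kB buffer needs roughly 8x
# fewer trips into the kernel than a 4kB buffer would.
strace -e trace=getdents64 ls -f /usr/bin 2>&1 >/dev/null | grep -c getdents64
```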
The command I quoted works verbatim on one of my Ubuntu systems. It's ~60X faster than e.g. "for i in $(seq 1 8000000); do touch $i; done" because it creates many files per fork+exec, and fork+exec is a much heavier operation than creating an empty file.
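The quoted command isn't shown upthread; from the description it's presumably a pipeline along these lines (the count is illustrative):

```shell
# xargs batches arguments, so one touch(1) invocation creates
# thousands of files and you pay for fork+exec once per batch...
seq 1 8000000 | xargs touch

# ...versus once per file in the loop version quoted above
```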
I'm actually not sure why it failed after generating 1.3M files, no error messages or anything, it was weird.
Initially I thought maybe it was like an inode/fd issue or something but no.
ok, unloaded system, 12M files.
Using old SATA 300GB Raptor disks that I had sitting around.
Fairly old E5-2650 CPUs clocked at 1.5GHz because of power usage; this is single-core performance.
"ls" is using 2.5GB of ram, 76 seconds.
"ls -f" is using 2.4GB of ram, 18 seconds.
"ls -mf" uses like 2.4GB of ram, 20 seconds.
For those who say "cache!", no, I pre-warmed the cache and this is the result after that.
There are a few other things that could be related, since the original article was about a VM. The VM is going to be affected by SPECTRE/Meltdown patches, a known performance thief. I've got them enabled on this box but I'll disable them shortly and re-test. Also, my test box has 64GB of RAM and is running FreeBSD 13 with ZFS. I get about 150MB/sec and 1100 IOPS from the spinning rust drives.
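For anyone wanting to repeat the mitigations-on/off comparison: from memory, the relevant FreeBSD knobs are roughly the ones below. Names and accepted values vary by release and CPU, so treat these as pointers rather than a recipe, and read `sysctl -d <name>` on your own box first.

```shell
# Spectre v2 / SSB / MDS mitigations are runtime sysctls
# (check the descriptions before flipping anything):
sysctl -d hw.ibrs_disable
sysctl -d hw.ssb_disable
sysctl -d hw.mds_disable

# Meltdown page-table isolation is a boot-time tunable:
# set vm.pmap.pti=0 in /boot/loader.conf and reboot to turn it off.
```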
It's interesting enough that I'm going to run my own test now.