I was just thinking about BeOS the other day. If you recall, Be had fantastic multithreading, and could render multiple video streams at the same time. They even gave you a file on the distribution CD to test this capability:
I keep hearing about Be's mt capabilities, but I haven't read anything in detail. Can you point me to some materials that explain how it's different from e.g. Linux?
Yeah... It's been years since I thought about this, but on Linux you basically have two options: fork() and pthreads. fork() creates a new process, and IPC between the parent and child processes can be cumbersome. It's also not a great fit when you're spawning a process to do a different task, since the child starts out as a copy of the parent and carries the old task's state as overhead. Pthreads gives you shared memory (and simple mutexes to protect critical data), but that's about it.
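To make the contrast concrete, here is a minimal sketch of both routes on a POSIX system. The function names are made up for illustration, and std::thread stands in for raw pthreads (on Linux it is built on top of them). The fork() version has to ship its result back through a pipe; the thread version just touches a shared variable under a mutex.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <mutex>
#include <thread>

// Process route: the child works on a copy of the parent's memory, so the
// result has to travel back through an explicit channel (a pipe here).
// Returns -1 on any failure. (Hypothetical helper, for illustration only.)
int sum_in_child_process(int a, int b) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       // child: compute, write, exit
        close(fds[0]);
        int result = a + b;
        ssize_t n = write(fds[1], &result, sizeof result);
        _exit(n == static_cast<ssize_t>(sizeof result) ? 0 : 1);
    }

    close(fds[1]);                        // parent: read the answer back
    int result = -1;
    ssize_t n = read(fds[0], &result, sizeof result);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return n == static_cast<ssize_t>(sizeof result) ? result : -1;
}

// Thread route: both workers see the same `total`, so a mutex is the only
// plumbing needed. (Hypothetical helper, for illustration only.)
int sum_in_threads(int a, int b) {
    int total = 0;
    std::mutex m;
    auto add = [&](int v) {
        std::lock_guard<std::mutex> lock(m);
        total += v;
    };
    std::thread t1(add, a);
    std::thread t2(add, b);
    t1.join();
    t2.join();
    return total;
}
```

Both functions return the same sum; the difference is how much ceremony it takes to get one integer from the worker back to the caller.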
BeOS's app model discouraged the use of fork() (though you could use it if you wanted to) and instead had pthreads-like kernel-level support for threads, with some very nice C APIs that handle message passing and the like. Inter-thread communication was baked in too, so you could have strong IPC without pipes. BeOS also came with a lightweight mutex called the 'benaphore', which pairs an atomic counter with a semaphore so the uncontended case never has to enter the kernel.
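As I understand it, the benaphore is an atomic counter sitting in front of a semaphore: the counter handles the uncontended case, and the semaphore is only touched when a second thread shows up. Below is a portable sketch of that idea, not Be's code. On BeOS you would presumably use atomic_add() with acquire_sem()/release_sem(); here std::atomic and a small hand-rolled semaphore (SlowSemaphore, a made-up stand-in for the kernel object) play those roles, and benaphore_demo is a hypothetical helper.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a kernel semaphore, starting with zero permits.
// (Assumption: built from mutex + condvar only to keep the sketch portable.)
class SlowSemaphore {
public:
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return permits_ > 0; });
        --permits_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++permits_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int permits_ = 0;
};

class Benaphore {
public:
    void lock() {
        // fetch_add returns the previous value: > 0 means the lock is
        // already held, so this thread has to wait on the semaphore.
        if (count_.fetch_add(1) > 0) sem_.acquire();
    }
    void unlock() {
        // Previous value > 1 means at least one other thread is waiting
        // (or about to wait), so hand it exactly one permit.
        if (count_.fetch_sub(1) > 1) sem_.release();
    }

private:
    std::atomic<int> count_{0};
    SlowSemaphore sem_;
};

// Hypothetical demo: several threads bump a plain int under the benaphore.
int benaphore_demo(int threads, int iterations) {
    Benaphore lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

The payoff is in lock(): when nobody else holds the lock, the whole acquire is a single atomic increment and the semaphore is never touched.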
There were some excellent C++ wrappers for all of these too; the BLooper class, for example, gave you a message-handling loop by default.
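For anyone who never used it, the looper idea is roughly "an object that owns a thread and a message queue". Here is a stripped-down sketch in portable C++. MiniLooper and looper_demo are made-up names, messages are plain ints rather than BMessage objects, and the real BLooper does a good deal more (handlers, filters, locking), so treat this as the shape of the pattern rather than the Be API.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical looper: one worker thread drains a queue and calls the
// handler for each message, always on that same thread.
class MiniLooper {
public:
    using Handler = std::function<void(int)>;

    explicit MiniLooper(Handler handler)
        : handler_(std::move(handler)), worker_([this] { Run(); }) {}

    ~MiniLooper() { Quit(); }

    // Safe to call from any thread; the handler runs on the looper thread.
    void PostMessage(int what) {
        {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push(what);
        }
        cv_.notify_one();
    }

    // Lets already-queued messages finish, then stops the worker thread.
    void Quit() {
        {
            std::lock_guard<std::mutex> lock(m_);
            quitting_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

private:
    void Run() {
        for (;;) {
            int what;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return quitting_ || !queue_.empty(); });
                if (queue_.empty()) return;  // quitting and nothing left to do
                what = queue_.front();
                queue_.pop();
            }
            handler_(what);                  // dispatched outside the lock
        }
    }

    Handler handler_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> queue_;
    bool quitting_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};

// Hypothetical demo: post three messages and collect what the handler saw.
std::vector<int> looper_demo() {
    std::vector<int> seen;
    {
        MiniLooper looper([&seen](int what) { seen.push_back(what); });
        for (int i = 1; i <= 3; ++i) looper.PostMessage(i);
    }  // destructor drains the queue and joins the worker
    return seen;
}
```

The useful property is that the handler only ever runs on the looper's own thread, so whatever state it touches needs no extra locking, while any other thread can post to it freely.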
Finally, the thread scheduling model was pretty cool: it was a logarithmic stochastic time allocator with a range of "hard priorities" that would give you RT control (but could also freeze your machine if you had an infinite loop). Unfortunately, the kernel had a tendency to thrash threads between CPUs in a multi-CPU setup. I think this was fixed in Haiku.
I'm no expert, but you might find the BeBook, which "...details the Application Programming Interface (API) to the BeOS operating system," to be an informative starting point (though you won't find a comparison with Linux there). There are chapters on "Threads and Teams" in the two sections on the "Kernel Kit" (that's as far as I looked, maybe other parts of the book would be more informative).
I never saw the internals, but from a coder's standpoint the thread switches were really fast, with low latency. Compared with Windows NT and OS/2, it was way faster on the same hardware. I don't have a comparison with Linux, as it was just getting started back then (IIRC, Slackware was the leading distro of the time).
https://www.youtube.com/watch?v=2jMFiRvxvrc