* rawmemchr - it might be faster than memchr as it doesn't have to decrement a size_t counter, but this is only mildly relevant for lwan: many of its rawmemchr calls are simply rawmemchr(ptr, '\0'), which is exactly the same as ptr + strlen(ptr) and even less optimized.
* pthread_tryjoin_np - __linux__ is defined by the compiler (gcc), not by glibc; you should check for __GLIBC__ if you want to use glibc-specific functions.
* underscore-prefixed functions - pedantic, I know, but such names are reserved for the implementation.
These are easy things to fix: feel free to issue a few pull requests. :)
Regarding rawmemchr(): both are pretty well optimized. Both are implemented in glibc using the same technique (reading a byte at a time until the pointer is aligned, then switching to multibyte reads). strlen() might be faster, yes, considering that its implementation can hardcode some magic numbers because the byte it looks for is always zero. In other words: some microbenchmarks might help decide here.
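Before benchmarking anything, it helps to confirm the two spellings really are interchangeable. Here is a minimal sketch, assuming glibc (rawmemchr() is a GNU extension, hence the _GNU_SOURCE define). The helper names end_via_rawmemchr, end_via_strlen and check_same_end are made up for illustration and are not part of Lwan.

```c
#define _GNU_SOURCE /* rawmemchr() is a GNU extension declared in <string.h> */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: pointer to the terminating NUL, found with rawmemchr(). */
static const char *end_via_rawmemchr(const char *s)
{
    return rawmemchr(s, '\0');
}

/* Hypothetical helper: the same pointer, computed with strlen(). */
static const char *end_via_strlen(const char *s)
{
    return s + strlen(s);
}

/* Aborts (via assert) if the two approaches ever disagree for s. */
static void check_same_end(const char *s)
{
    assert(end_via_rawmemchr(s) == end_via_strlen(s));
}
```

Both helpers return the address of the NUL terminator itself, so code that wants the byte after the terminator has to add 1 in either case. Which one is faster is a separate question that only a benchmark on the target machine can answer.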
Regarding __linux__ vs. __GLIBC__: Lwan works with some alternative libcs (such as uClibc), so relying on __GLIBC__ being defined for things like this doesn't seem like a good idea. In any case, since Lwan isn't portable anyway, one can just assume it is always running on Linux and get rid of these #ifdefs.
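If the #ifdefs were kept, one possible middle ground is to test for the function itself rather than for an OS or a libc. The sketch below assumes a build-system check defines HAVE_PTHREAD_TRYJOIN_NP when pthread_tryjoin_np() is available. That macro, try_join() and finish_immediately() are hypothetical names for illustration, not something Lwan provides.

```c
#define _GNU_SOURCE /* pthread_tryjoin_np() is a GNU extension */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread body used only to exercise try_join(). */
static void *finish_immediately(void *arg)
{
    return arg;
}

/*
 * Hypothetical wrapper: returns true once the thread has been joined.
 * HAVE_PTHREAD_TRYJOIN_NP is an assumed macro that a configure-time
 * check would define; nothing defines it in this snippet.
 */
static bool try_join(pthread_t thread)
{
#ifdef HAVE_PTHREAD_TRYJOIN_NP
    /* Non-blocking: fails with EBUSY while the thread is still running. */
    return pthread_tryjoin_np(thread, NULL) == 0;
#else
    /* Fallback with different semantics: blocks until the thread exits. */
    return pthread_join(thread, NULL) == 0;
#endif
}
```

Note that the fallback is not a drop-in replacement, since it blocks. Whether that is acceptable depends on the caller, which is one more argument for simply assuming Linux and dropping the conditionals.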