Well, except for Windows (which insists on UTF-16/UCS-2), as far as I can see we are at that point. Modern Linux distros have shipped with everything configured for UTF-8 for years. Fedora has been that way since Fedora Core 1.
I don't know what OSX is doing. I would assume it normally comes pre-configured with UTF-8, but somehow that was lost in his setup. I don't have an OSX machine to confirm this, though.
It is, but all POSIX operating systems are brain-damaged when it comes to anything other than interactive terminal windows – they default to LANG=C, so e.g. the same script that runs fine interactively crashes when you run it from cron or some other launcher.
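To make it concrete, here's a minimal sketch (assuming Python 3 on a POSIX box; newer interpreters may coerce the C locale to C.UTF-8, so the exact behaviour varies):

    import sys

    # Interactively this typically prints 'utf-8'; under cron with LANG unset
    # or LANG=C it usually reports an ASCII codec instead.
    print("stdout encoding:", sys.stdout.encoding)

    # Fine in a UTF-8 terminal, may raise UnicodeEncodeError under LANG=C.
    print("naïve output")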
Hmm, is OXS still actually POSIX compliant? I thought they haven't bothered with certification for the past few versions.
Anyway, if a changed LANG breaks your script, I'd primarily place the blame on the script. You should be able to shuffle UTF-8 bytes around non-interactively without paying much mind to the locale.
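Something along these lines, say (just a sketch, the filenames are placeholders) – nothing in it consults LANG at all:

    # Pass UTF-8 through as raw bytes; no dependence on the locale.
    with open("in.txt", "rb") as src, open("out.txt", "wb") as dst:
        for line in src:
            dst.write(line)

    # And when you do need text, name the encoding explicitly instead of
    # letting LANG pick one for you.
    with open("in.txt", encoding="utf-8") as f:
        text = f.read()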
The reason the Linux distributions don't set UTF-8 in the default environment is backwards (bugwards?) compatibility with legacy code: something like Python or Perl might start actually throwing exceptions once it thinks you want UTF-8 and you hand it either a different encoding or just unvalidated garbage.
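In Python 3 terms the failure mode looks roughly like this (a sketch; "legacy.txt" is a made-up file of Latin-1 bytes):

    # Simulate a legacy data file: Latin-1 bytes that are not valid UTF-8.
    with open("legacy.txt", "wb") as f:
        f.write(b"caf\xe9\n")

    # Reading it back with no explicit encoding uses the locale's encoding,
    # so this is fine under a Latin-1 locale but raises UnicodeDecodeError
    # under a UTF-8 (or C/ASCII) locale.
    with open("legacy.txt") as f:
        print(f.read())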
There's no solution to this which will make everyone happy.
I prefer to set UTF-8 for everything so I can find the things that break and fix them, but a lot of legacy shops choose not to spend the time fixing things which are “working”.
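When I do flip everything over, the first thing I check in the non-interactive environments is what the process actually ended up with – roughly this diagnostic (Python 3, just a sketch to drop into a cron job or service):

    import locale, sys

    # Report which encodings the process really got before hunting for
    # the code that breaks.
    print("locale         :", locale.setlocale(locale.LC_ALL, ""))
    print("preferred enc  :", locale.getpreferredencoding())
    print("stdout enc     :", sys.stdout.encoding)
    print("filesystem enc :", sys.getfilesystemencoding())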