I think here is a good argument for not using case-insensitive filesystems - bec...

thedufer · on Dec 19, 2014

> Unicode characters that are visually identical

This was actually a further bug, reported as part of the same CVE - you could also overwrite .git/config by adding any of a number of zero-width Unicode characters that many filesystems ignore when checking for filename equality (but string comparison doesn't, of course).
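
To make the mismatch concrete, here is a minimal sketch of how a plain string comparison and a filesystem that drops zero-width characters can disagree. The `IGNORABLE` set is only an illustrative subset assumed for this example, the helper names are hypothetical, and `lower()` is a crude stand-in for case-insensitivity; none of this is Git's actual check.

```python
# Illustrative subset of zero-width / formatting code points that some
# filesystems are reported to ignore in names (assumption: not a full list).
IGNORABLE = {"\u200c", "\u200d", "\u200e", "\u200f", "\ufeff"}


def naive_is_dot_git(name: str) -> bool:
    """Plain string comparison: every code point counts."""
    return name == ".git"


def fs_view(name: str) -> str:
    """Approximate the name as an ignoring, case-insensitive filesystem sees it."""
    visible = "".join(ch for ch in name if ch not in IGNORABLE)
    return visible.lower()  # crude stand-in for case-insensitive matching


def careful_is_dot_git(name: str) -> bool:
    """Compare after applying the same folding the filesystem would apply."""
    return fs_view(name) == ".git"


# ".g" + ZERO WIDTH NON-JOINER + "it": looks like ".git" but is a different string.
tricky = ".g\u200cit"
print(naive_is_dot_git(tricky), careful_is_dot_git(tricky))
```

The point is only that the check has to fold names the same way the target filesystem does, otherwise the two disagree about which file is being written.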

userbinator · on Dec 19, 2014

http://en.m.wikipedia.org/wiki/Unicode_equivalence

http://en.m.wikipedia.org/wiki/IDN_homograph_attack

What seems really scary about this is that even Unicode has several different ways of comparing strings, and the correct one depends on the exact situation, so the common response of "just use a library" doesn't work. For example, if a user were searching for a filename, it might make sense for full-width characters to compare equal to half-width ones, but not when opening a file, where you wouldn't want e.g. the full-width version of /etc/passwd to be equivalent to the half-width one.
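
A rough sketch of the "depends on the situation" point, using Python's `unicodedata`. The split into `search_key` and `open_key` is a hypothetical design for illustration, not something any particular library prescribes; `to_fullwidth` is a helper invented here to build the test input.

```python
import unicodedata


def to_fullwidth(text: str) -> str:
    """Map printable ASCII (U+0021..U+007E) to its full-width form (offset 0xFEE0)."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if 0x21 <= ord(ch) <= 0x7E else ch for ch in text
    )


def search_key(text: str) -> str:
    """Loose key for searching: compatibility normalization plus case folding."""
    return unicodedata.normalize("NFKC", text).casefold()


def open_key(text: str) -> str:
    """Strict key for opening a file: the exact code points, unchanged."""
    return text


wide = to_fullwidth("/etc/passwd")
narrow = "/etc/passwd"

# Searching treats the two as the same name...
print(search_key(wide) == search_key(narrow))
# ...while opening keeps them distinct.
print(open_key(wide) == open_key(narrow))
```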

raydev · on Dec 19, 2014

I don't understand. Why can't a library just compare strings at the code-point level, ignoring "canonical equivalence"?

medgno · on Dec 19, 2014

Then you run into problems with how characters are represented. For instance, é (lowercase Latin e with an acute accent) can be represented either by one Unicode code point (U+00E9, 'LATIN SMALL LETTER E WITH ACUTE'), or by two Unicode code points (U+0065 U+0301 -- LATIN SMALL LETTER E, COMBINING ACUTE ACCENT). There are normalization forms that will convert these two representations into the same representation for easier comparison.

If you don't perform canonical equivalence checking, you could search for "café" and not find a file named "café.txt" if it uses the other representation.
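
As a small sketch of the two representations and of what normalization buys you (the `find` helper and the file list are made up for the example):

```python
import unicodedata

composed = "caf\u00e9"      # é as one code point, U+00E9
decomposed = "cafe\u0301"   # e (U+0065) followed by COMBINING ACUTE ACCENT (U+0301)


def canonical_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def find(needle: str, names: list[str]) -> list[str]:
    """Return the names containing needle, comparing in NFC on both sides."""
    key = unicodedata.normalize("NFC", needle)
    return [n for n in names if key in unicodedata.normalize("NFC", n)]


names = [decomposed + ".txt", "tea.txt"]

# Code-point comparison misses the file; normalized comparison finds it.
print([n for n in names if composed in n])
print(find(composed, names))
```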

oneeyedpigeon · on Dec 19, 2014

"for example, if a user were searching for a filename"

It's useful when I search for "café", if I also get results for "cafe" - Chrome's search does this. Not to mention searching for "don't" and getting hits including "don’t". But that should definitely be restricted to text data operations, rather than lower-level ones, including filenames.
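
For what it's worth, that kind of loose text search can be sketched in a few lines. This is not how Chrome does it (I don't know its implementation); `fold`, `loose_contains` and the `PUNCT` table are hypothetical names, and the quote mapping is an explicit assumption because Unicode normalization alone does not turn ’ into '.

```python
import unicodedata

# Explicit mapping for typographic quotes; normalization leaves these untouched.
PUNCT = str.maketrans({"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'})


def fold(text: str) -> str:
    """Strip combining marks, straighten quotes, and fold case for loose search."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.translate(PUNCT).casefold()


def loose_contains(needle: str, haystack: str) -> bool:
    """True if needle occurs in haystack once both are folded the same way."""
    return fold(needle) in fold(haystack)


print(loose_contains("caf\u00e9", "Meet me at the cafe"))
print(loose_contains("don't", "I don\u2019t know"))
```

Folding this aggressively is fine for searching page text, which is exactly why it shouldn't be what decides whether two filenames are the same file.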

userbinator · on Dec 19, 2014

> But that should definitely be restricted to text data operations, rather than lower-level ones, including filenames.

The issue arises when this "text data" includes filenames. Having café.txt and cafe.txt be equivalent when searching is useful, but the real problem is if a filesystem decides that two "equivalent" filenames are essentially identical - to contrive an example, suppose it thought /étc/passwd was referring to the same file as /etc/passwd . It makes checking for and filtering out "sensitive" filenames far more difficult. For example, just take a look at all the ways Unicode homoglyphs and "special" characters can be used to bypass forum wordfilters, and you'll see how difficult that problem is.

(I know permissions, ACLs, etc. can help here with access control, but the problem of distinguishing between filenames still stands.)

shanemhansen · on Dec 19, 2014

Linus actually has a great rant about brain-dead filesystems that mangle people's data. It's eerily prescient. http://thread.gmane.org/gmane.comp.version-control.git/70688...

userbinator · on Dec 19, 2014

He even mentioned the security aspect here:

http://article.gmane.org/gmane.comp.version-control.git/7076...

"Having programs that get different results back from what they actually wrote, that tends to be a security issue"

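One way to see whether a given filesystem hands back what you wrote is to create a file and list the directory. A minimal sketch, assuming a writable temp directory; `name_round_trips` is a hypothetical helper, and the result depends on the filesystem, so no particular output is claimed for the accented name.

```python
import os
import tempfile


def name_round_trips(name: str) -> bool:
    """Create a file called name and check the directory lists exactly that name."""
    with tempfile.TemporaryDirectory() as directory:
        # Create an empty file with the requested name.
        with open(os.path.join(directory, name), "w"):
            pass
        # A filesystem that rewrites names (e.g. normalizes them) returns
        # something other than the string we passed in.
        return os.listdir(directory) == [name]


# Precomposed é; a normalizing filesystem may store a decomposed form instead.
print(name_round_trips("caf\u00e9.txt"))
```
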
Someone · on Dec 19, 2014

I think the main argument against case-insensitivity is that every file system driver must contain a large [1] code blob that does the case-insensitive comparison. That blob cannot be shared between drives or with the OS because one must guarantee that it stays the same forever. It is almost a sure bet that case-insensitivity in NTFS is different from that in HFS+ (even disregarding their different canonicalization).

Directory-traversal attacks work just as well with ASCII or byte sequences.

[1] of course, what is large becomes less and less important over time, thanks to Moore's law. On embedded devices, this still may be quite significant, though.
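
A quick illustration of why "the" case-insensitive comparison has to be pinned down per filesystem: even within one language runtime, three plausible definitions disagree. These are Python's built-in string methods, used here only as stand-ins; they are not the NTFS or HFS+ algorithms.

```python
def eq_lower(a: str, b: str) -> bool:
    """Case-insensitive by lowercasing both sides."""
    return a.lower() == b.lower()


def eq_upper(a: str, b: str) -> bool:
    """Case-insensitive by uppercasing both sides."""
    return a.upper() == b.upper()


def eq_casefold(a: str, b: str) -> bool:
    """Case-insensitive by Unicode case folding."""
    return a.casefold() == b.casefold()


a, b = "stra\u00dfe", "STRASSE"  # "straße" vs "STRASSE"

# The three definitions do not all agree on whether these are the same name.
print(eq_lower(a, b), eq_upper(a, b), eq_casefold(a, b))
```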

rimantas · on Dec 19, 2014

I once tried to install OS X with a case-sensitive filesystem. Turns out Photoshop for OS X does not work in that case. Had to go back to case-insensitive. Not sure how many other apps have the same issue.

kdeldycke · on Dec 19, 2014

Also had issues with Steam. The solution? Install it in a disk image: https://github.com/kdeldycke/dotfiles/commit/05cef3c1de4a208...

wereHamster · on Dec 19, 2014

World of Warcraft (and any other Blizzard game) won't run off of a case sensitive filesystem.

ydant · on Dec 19, 2014

Steam also requires a case-insensitive filesystem on OS X.

hereonbusiness · on Dec 19, 2014

Steam on Linux does not. That probably means that Linux doesn't get any Steam games that require a case-insensitive fs even if they would work otherwise (unity3d, monogame, ...), at least if there isn't a special Linux version that does not require it.

I never really thought about this issue.

hjnilsson · on Dec 19, 2014

The issue here is not case-insensitive filesystems, which are a huge benefit to novice users, but that the type system does not distinguish between paths and strings. A path is distinctly different from a string, and should never be compared as one. The type system should always enforce this and never allow you to mistakenly do the comparison you propose, for exactly the reasons you state. Modern filesystem libraries (for type-safe languages) do this; the problem is (as is becoming more and more common lately) the abundance of old tools that were not designed with security in mind.

couchand · on Dec 19, 2014

As you point out, a path is different from a string. It makes sense in many cases to do a case-insensitive comparison on strings. It NEVER makes sense to do that on paths.

A path is an identifier. Just as it's infuriating and horrendous to have programming language identifiers act case-insensitively, it's a poor design choice to have path identifiers act case-insensitively.

I suspect that what you describe as a huge benefit to novice users is entirely seen in search functionality, where you are comparing a string (the search needle) against a number of paths (the haystack). Since this search should be conducted by converting the path to a string (never the other way around, that's nonsensical) it's perfectly natural to support CI operations on strings but forbid them on paths.
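
Roughly what I mean, sketched with Python's `pathlib`. `PurePosixPath` is the real class; the `search` function and the sample paths are hypothetical. The path type compares exactly, and the case-insensitive part only happens after converting the path to a string.

```python
from pathlib import PurePosixPath


def search(needle: str, paths: list[PurePosixPath]) -> list[PurePosixPath]:
    """Case-insensitive search: convert each path to a string, then compare strings."""
    key = needle.casefold()
    return [p for p in paths if key in str(p).casefold()]


paths = [
    PurePosixPath("/home/u/Report.TXT"),
    PurePosixPath("/home/u/report.txt"),
    PurePosixPath("/home/u/notes.txt"),
]

# As identifiers, the first two paths are different things.
print(paths[0] == paths[1])
# As search results for a string, both match.
print(search("REPORT", paths))
```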

spdegabrielle · on Dec 19, 2014

so true

archagon · on Dec 19, 2014

The OS should always be in service of the user. And users (in my experience) vastly prefer case insensitivity.

Sorry, but there's just no way around it.

oneeyedpigeon · on Dec 19, 2014

Can you elaborate on that experience? I would have thought users not typing filenames couldn't care less, and that users who are typing filenames would tend to be developers or 'advanced' users who aren't confused by case-sensitivity, and recognise the advantages.

userbinator · on Dec 19, 2014

That's how I see it too - inexperienced users will only be typing in filenames to name new files and likely use a GUI filechooser for selecting existing ones, while the ones typing in filenames are probably using CLIs.

The common complaint "but it's annoying to have to type in the filenames exactly" can be solved with tab-completion (I'm surprised how many CLI users don't know their shell has this feature), and using sane naming conventions like not GiVinG yOuR fIleS_stUpiD-NaMESLikEth1s.

spdegabrielle · on Dec 19, 2014

Users shouldn't have access to the filesystem. The usability issues are too significant even for competent users. A good laugh/example: http://xkcd.com/1459/

npizzolato · on Dec 20, 2014

The lengths to which I have to go to remember how to get stuff onto an iPad through iTunes always remind me that I want access to the filesystem, even if I don't normally use it.