
But that should definitely be restricted to text data operations, rather than lower-level ones, including filenames.

The issue arises when this "text data" includes filenames. Treating café.txt and cafe.txt as equivalent when searching is useful; the real problem is a filesystem deciding that two "equivalent" filenames refer to the same file. To contrive an example, suppose it thought /étc/passwd was the same file as /etc/passwd. That makes checking for and filtering out "sensitive" filenames far more difficult. Just look at all the ways Unicode homoglyphs and "special" characters are used to bypass forum wordfilters, and you'll see how hard that problem is.

(I know permissions, ACLs, etc. can help here with access control, but the problem of distinguishing between filenames still stands.)
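To make the contrived example concrete, here is a rough Python sketch. The fold() function is a hypothetical stand-in for whatever equivalence an accent-insensitive filesystem might apply (I'm assuming NFKD decomposition, dropping combining marks, then casefolding; real filesystems differ in the details). The SENSITIVE set and both check functions are made up for illustration.

```python
import unicodedata

def fold(name: str) -> str:
    """Hypothetical accent-insensitive fold (an assumption, not any real
    filesystem's rule): NFKD-decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Illustrative blocklist of paths a filter wants to reject.
SENSITIVE = {"/etc/passwd"}

def naive_is_sensitive(path: str) -> bool:
    # Exact string comparison, as a simple filter might do.
    return path in SENSITIVE

def folded_is_sensitive(path: str) -> bool:
    # Comparison under the same (assumed) folding the filesystem applies.
    return fold(path) in {fold(p) for p in SENSITIVE}

accented = "/\u00e9tc/passwd"   # "/étc/passwd" with a precomposed e-acute
homoglyph = "/\u0435tc/passwd"  # first letter is Cyrillic "е", not Latin "e"

# Useful for search: the two cafe spellings compare equal after folding.
same_for_search = fold("caf\u00e9.txt") == fold("cafe.txt")

# The exact-match filter misses the accented path, even though a folding
# filesystem would resolve it to the sensitive file.
missed_by_naive = not naive_is_sensitive(accented)
caught_by_folded = folded_is_sensitive(accented)

# Folding does nothing for homoglyphs: Cyrillic "е" has no decomposition
# to Latin "e", so it still looks identical to a human but compares different.
homoglyph_still_differs = not folded_is_sensitive(homoglyph)
```

The filter only works if it folds names exactly the way the filesystem does, and even then homoglyphs slip through, which is the wordfilter problem all over again.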


