
Inventing your own pseudo-normalization of Unicode is a worse idea than using the actual normalization forms Unicode defines.

Also, if you think you can decompose without allocating memory... well, try a code point like U+FDFA.

For reference, its decomposition is:

U+0635 U+0644 U+0649 U+0020 U+0627 U+0644 U+0644 U+0647 U+0020 U+0639 U+0644 U+064A U+0647 U+0020 U+0648 U+0633 U+0644 U+0645

(and that doesn't begin to touch any of the potential issues with variant forms, homoglyph attacks, etc.)
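That expansion is easy to check with Python's standard-library unicodedata module. Note that the 18-code-point expansion is a *compatibility* decomposition: canonical normalization (NFD) leaves U+FDFA alone, and only NFKD/NFKC expand it.

```python
import unicodedata

# NFD (canonical decomposition) does not touch U+FDFA at all.
assert unicodedata.normalize("NFD", "\uFDFA") == "\uFDFA"

# NFKD (compatibility decomposition) expands it to 18 code points,
# the largest single-character decomposition in Unicode.
expanded = unicodedata.normalize("NFKD", "\uFDFA")
print(len(expanded))  # 18
```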



There's nothing pseudo about it. Normalizing both inputs first and then comparing is equivalent to normalizing one character at a time and comparing as you go. There is a maximum number of code points in a canonical decomposition (or at least there used to be), so the per-character buffer is bounded.

This is actually implemented in ZFS (along with character-at-a-time normalization for hashing).
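A minimal Python sketch of the idea (the names `norm_chunks` and `normalized_equal` are mine, not from ZFS): buffer input until the next starter (combining class 0), so canonical reordering of combining marks never crosses a chunk boundary, then normalize and compare chunk by chunk. Memory stays bounded by the longest run of combining marks rather than the string length.

```python
import unicodedata

def norm_chunks(s, form="NFD"):
    """Yield normalized chunks of s, cutting before each starter
    (combining class 0). Canonical reordering only moves combining
    marks, which cannot cross a starter, so each chunk can be
    normalized independently with a small bounded buffer."""
    buf = ""
    for ch in s:
        if buf and unicodedata.combining(ch) == 0:
            yield unicodedata.normalize(form, buf)
            buf = ""
        buf += ch
    if buf:
        yield unicodedata.normalize(form, buf)

def normalized_equal(a, b, form="NFD"):
    # Sketch only: a production version (like ZFS's form-insensitive
    # lookups) would compare the chunk streams incrementally and
    # handle cases such as Hangul jamo, where chunk boundaries of
    # canonically equivalent strings need not line up.
    return "".join(norm_chunks(a, form)) == "".join(norm_chunks(b, form))
```

For example, precomposed U+00E9 and the sequence "e" + U+0301 compare equal without normalizing either full string up front.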

I don't see how homoglyphs enter the picture. Can you explain?



