In case anyone else is confused as to why the domain in the example provided needs to be unicode (compared to the filename which is obvious): it's because the hyphen is the shorter '‑' char, which is extended ASCII 226 not the standard '-' (which would be ASCII 45).
The first character you pasted is U+2011 (8209 in decimal). It does not appear in the document, and it cannot be ASCII because its code point is beyond 127 (0x7F). Also, U+2011 is the non-breaking hyphen, not a shorter hyphen.
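For anyone who wants to check this themselves, here is a minimal Python sketch. The `describe` helper is just a name made up for illustration, not something from the example being discussed. It prints the code point, the Unicode name, and the UTF-8 bytes of both characters. The 226 mentioned above is most likely the first byte of the UTF-8 encoding of U+2011 (E2 80 91) rather than a code point, though that is a guess about where the number came from.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as ints) for one character."""
    return ord(ch), unicodedata.name(ch), list(ch.encode("utf-8"))


# Compare the ordinary ASCII hyphen-minus with the character pasted above.
for ch in ("-", "\u2011"):
    code_point, name, utf8_bytes = describe(ch)
    is_ascii = code_point < 128  # ASCII covers code points 0..127 only
    print(f"U+{code_point:04X} ({code_point}) {name}: "
          f"UTF-8 bytes {utf8_bytes}, ASCII: {is_ascii}")
```

A string containing only the ordinary hyphen can be stored as plain ASCII; as soon as U+2011 appears in it, the string needs a Unicode-capable encoding.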