
> normalized to Unicode normalization form NFKC

I'm wondering why they chose NFKC (compatibility composed form) instead of NFC (canonical composed form).

`ª` would become `a`, losing its superscript property; `ᵤ` becomes `u`, losing its subscript property; `Ⓐ` becomes `A`, losing its circled property. As for multi-codepoint mappings, `¼` would become the three codepoints `1⁄4`, where `⁄` (U+2044) doesn't map to `/` (U+002F).
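You can inspect these mappings with Python's standard `unicodedata` module (a quick sketch to illustrate the comparison above, not anything to do with Rust's lexer itself):

```python
import unicodedata

# Compare NFC (canonical) and NFKC (compatibility) for the characters above.
for ch in "ª\u1d64Ⓐ¼":  # U+1D64 is the subscript small letter u
    nfc = unicodedata.normalize("NFC", ch)
    nfkc = unicodedata.normalize("NFKC", ch)
    print(f"{ch!r}: NFC={nfc!r}  NFKC={nfkc!r}")

# NFC leaves all four characters unchanged, while NFKC folds them:
#   'ª' -> 'a', 'ᵤ' -> 'u', 'Ⓐ' -> 'A',
#   '¼' -> '1\u20444' (three codepoints: '1', U+2044 FRACTION SLASH, '4')
```

Note that NFKC's expansion of `¼` really does use U+2044, not the ASCII `/`, so the result is still not a valid ASCII expression.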



Maybe you are referring to the reference [1], which indeed mentions NFKC. As far as I know there is no consensus on the normalization form [2], and the current implementation is not guaranteed to stay, which is why Unicode identifiers are gated behind a `#[feature]` flag.

[1] http://static.rust-lang.org/doc/0.9/rust.html#input-format

[2] https://github.com/mozilla/rust/issues/2253


Yes, I started reading the reference. The normalization form issue is different from the #[non_ascii_idents] feature, though.

Issue 2253 does address it, but the comments there discuss NFC/NFKC normalization for filesystem lookup and for program identifiers, not for the lexing stage. That issue is obviously the best place to continue any conversation about it.


(Also, as that bug suggests, we don't actually do any normalisation at all yet.)


I don't see any mention of normalisation in the release notes.



