
0 points · mumblemumble · 5y ago · 0 comments

I imagine it wouldn't have been that hard if the authors had just used ICU4C for string handling. I've had good luck with just converting all input to Normalization Form C (NFC). The bigger challenge is that once you're using ICU strings, you lose the ability to use any library designed to work with C strings. There's no way to avoid 0x00 bytes showing up in the middle of UTF-16 and UTF-32 strings, and even if you use modified UTF-8 to avoid NUL bytes, you still break the assumption that a string's length is equal to its size in bytes.
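
To make those points concrete, here is a rough sketch. It uses Python's standard library rather than ICU4C, purely for illustration, and `c_strlen` is a hypothetical helper that imitates how a C string function would see the buffer. Modified UTF-8 is not shown, since Python has no built-in codec for it.

```python
import unicodedata


def c_strlen(buf: bytes) -> int:
    """Hypothetical helper: count bytes before the first NUL, as C's strlen would."""
    nul = buf.find(b"\x00")
    return len(buf) if nul == -1 else nul


# NFC: 'e' followed by a combining acute accent composes into the single
# code point U+00E9.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

# UTF-16 and UTF-32 encode ASCII characters with embedded zero bytes,
# so a C-string routine stops after the first character.
utf16 = "AB".encode("utf-16-le")
utf32 = "AB".encode("utf-32-le")
print(len(utf16), c_strlen(utf16))
print(len(utf32), c_strlen(utf32))

# Even in UTF-8, where no NUL appears here, the number of code points
# differs from the number of bytes.
utf8 = composed.encode("utf-8")
print(len(composed), len(utf8))
```

The helper only models the "stop at the first zero byte" behaviour; real C libraries can fail in other ways on such input.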

1 comment · 1 top-level

ufo · 5y ago

Not to mention the inherent problem of adding a dependency on an external library.

I know that in the case of Lua that was one of the main reasons. ICU by itself would be larger than the rest of the interpreter.
