Kidding aside, I don't really see the issue. Do you often get upset about the in-memory representation of strings? How about when using Java or Python? Is this not exactly why there is an entire programming practice called "serialization"? Windows committed to UCS-2 before UTF-8 existed, and so the internal representation on Windows remains 16-bit code units (UTF-16 these days, so a character outside the BMP takes two of them).
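
To make the serialization point concrete, here is a rough Python sketch. The sample string and variable names are made up for illustration, and this says nothing about how any particular runtime lays out strings internally. It only shows that the byte encoding is something you pick at the boundary, and that the same text round-trips through either one.

```python
# The program works with text as a sequence of code points; the byte layout
# is only chosen when the text crosses a boundary (file, socket, API call).
text = "héllo €"  # hypothetical sample string

utf8_bytes = text.encode("utf-8")       # typical choice for files and the network
utf16_bytes = text.encode("utf-16-le")  # 16-bit code units, little-endian, no BOM

# Both serialized forms decode back to the same text.
assert utf8_bytes.decode("utf-8") == text
assert utf16_bytes.decode("utf-16-le") == text

print(len(text), len(utf8_bytes), len(utf16_bytes))  # prints: 7 10 14
```

Same seven code points, two different byte counts, and the code in between never has to care which one ends up on disk.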