The fact that they went out of their way to break Python 2 Unicode handling when running on Python 3 was just totally nuts. Especially after making such a big deal about Unicode!
I've never seen anything like it, I don't think? Maybe the new Perl that never really landed?
Imo it's infinitely worse than that.
The big deal about Unicode is its nature, as defined in the "Summary Narrative" from 1991[0]. To wit:
> The Unicode character encoding derives its name from three main goals:
> * universal (addressing the needs of world languages)
> * uniform (fixed-width codes for efficient access), and
> * unique (bit sequence has only one interpretation into character codes)
The Unicode folk realized that it would take decades to shift developers worldwide to doing that properly, so they adopted a three-stage plan for software (e.g. the string types of programming languages) to get from where things were to where they needed to be:
* Stage #1: Character = byte
* Stage #2: Character = code point
* Stage #3: Character = what a user thinks of as a character[1]
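The gap between Stage #2 and Stage #3 is easy to demonstrate in Python 3 itself. A minimal sketch (the string below is an illustrative example, not from the original comment):

```python
import unicodedata

# "é" written as two code points: "e" (U+0065) + combining acute (U+0301).
# One character to a user; two code points to a Stage #2 language.
s = "e\u0301"

# Stage #2 semantics: len() counts code points, not user-perceived characters.
print(len(s))      # 2
print(list(s))     # ['e', '\u0301']

# NFC normalization happens to compose this particular pair into one
# code point -- but that is a lucky case, not Stage #3. Many grapheme
# clusters (emoji with modifiers, flags) have no precomposed form, and
# the stdlib offers no grapheme-cluster iteration at all.
print(len(unicodedata.normalize("NFC", s)))  # 1
```

True Stage #3 behavior (counting grapheme clusters per Unicode's segmentation rules) requires a third-party library such as `regex` or ICU bindings; the standard library never grew it.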
Python 1 was a Stage #1 language -- Character = byte -- like most others of its time.
In Python 2 there were tweaks to try to move toward Stage #2 -- Character = code point -- again, like most other PLs of its time.
In Python 3, they dictated a full switch to Stage #2 -- Character = code point. That was an unnecessarily painful break relative to Python 2. But -- and this is what really matters -- they entirely ignored Stage #3, which is the whole point of Unicode in the final analysis.
For some reason this was not good enough for the Python 3 folks -- they actively broke code from folks who had SPECIFICALLY addressed Unicode in their apps.
And yes, they could have supported u"" (and a number of other things).
They went: Unicode is so critical we will break the world -- and then, for folks who had already supported Unicode well or wanted to dual-target a library, they said your u"" approach to Unicode is so bad we will break that too.
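For context, the break was concrete: on Python 3.0 through 3.2 the u"" prefix was a SyntaxError, so carefully Unicode-aware Python 2 code could not even be parsed. PEP 414 only restored the prefix in 3.3, as a no-op:

```python
# SyntaxError on Python 3.0-3.2; legal again from Python 3.3 (PEP 414),
# where u"..." is identical to "..." -- the prefix exists purely so that
# dual-target Python 2/3 code can parse on both.
s = u"héllo"
print(s == "héllo")   # True
print(type(s))        # <class 'str'>
```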
Total BS in my book.