The fact that they went out of their way to break Python 2 Unicode handling when running on Python 3 was just totally nuts. Especially after making such a big deal about Unicode!
I've never seen anything like it, I don't think? Maybe the new Perl that never really landed?
Imo it's infinitely worse than that.
The big deal about Unicode is its nature, as defined in the "Summary Narrative" from 1991[0]. To wit:
> The Unicode character encoding derives its name from three main goals:
> * universal (addressing the needs of world languages)
> * uniform (fixed-width codes for efficient access), and
> * unique (bit sequence has only one interpretation into character codes)
The Unicode folk realized that it would take decades to shift developers worldwide to doing that properly, so they adopted a three-stage plan for software (e.g. the string types of programming languages) to get from where things were to where they needed to be:
* Stage #1: Character = byte
* Stage #2: Character = code point
* Stage #3: Character = what a user thinks of as a character[1]
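The gap between Stage #2 and Stage #3 is easy to demonstrate in Python 3 itself. A minimal sketch (the string below is an illustrative example, not from the original comment):

```python
import unicodedata

# "é" written as two code points: "e" (U+0065) + combining acute (U+0301).
# One character to a user; two code points to a Stage #2 language.
s = "e\u0301"

# Stage #2 semantics: len() counts code points, not user-perceived characters.
print(len(s))      # 2
print(list(s))     # ['e', '\u0301']

# NFC normalization happens to compose this particular pair into one
# code point -- but that is a lucky case, not Stage #3. Many grapheme
# clusters (emoji with modifiers, flags) have no precomposed form, and
# the stdlib offers no grapheme-cluster iteration at all.
print(len(unicodedata.normalize("NFC", s)))  # 1
```

True Stage #3 behavior (counting grapheme clusters per Unicode's segmentation rules) requires a third-party library such as `regex` or ICU bindings; the standard library never grew it.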
Python 1 was a Stage #1 language -- Character = byte -- like most others of its time.
In Python 2 there were tweaks to try to move toward Stage #2 -- Character = code point -- again, like most other PLs of its time.
In Python 3, they dictated a full switch to Stage #2 -- Character = code point. That was an unnecessarily painful break relative to Python 2. But -- and this is what really matters -- they entirely ignored Stage #3, which is the whole point of Unicode in the final analysis.
For some reason this was not good enough for the Python 3 folks -- they actively broke code from folks who had SPECIFICALLY addressed Unicode in their apps.
And yes, they could have supported u"" (and a number of other things).
They went: Unicode is so critical we will break the world -- and then, for folks who had already supported Unicode well or wanted to dual-target a library, they said your u"" approach to Unicode is so bad we will break that too.
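For context, the break was concrete: on Python 3.0 through 3.2 the u"" prefix was a SyntaxError, so carefully Unicode-aware Python 2 code could not even be parsed. PEP 414 only restored the prefix in 3.3, as a no-op:

```python
# SyntaxError on Python 3.0-3.2; legal again from Python 3.3 (PEP 414),
# where u"..." is identical to "..." -- the prefix exists purely so that
# dual-target Python 2/3 code can parse on both.
s = u"héllo"
print(s == "héllo")   # True
print(type(s))        # <class 'str'>
```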
Total BS in my book.