1. Unicode support was actually an anti-feature for most existing code. If you're writing a simple script you prefer 'garbage-in, garbage-out' unicode rather than scattering casts everywhere to watch it randomly explode when an invalid byte sneaks in. If you did have a big user-facing application that cared about unicode, then the conversion was incredibly painful for you because you were a real user of the old style.
2. Minor nice-to-haves like print-function, float division, and lazy ranges just hide landmines in the conversion while providing minimal benefit.
In the latest py3 versions we've finally gotten some sugar to tempt people over: asyncio, f-strings, dataclasses, and type annotations. Still not exactly compelling, but at least something to encourage the average Joe to put in all the effort.
Actually that's the behavior of Python 2: it works fine until you send it invalid characters, then it blows up.
In python 3 it always blows up when you mix bytes with text so you can catch the issue early on.
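A minimal illustration of that early failure (hypothetical snippet, Python 3):

```python
# Python 3 refuses to mix bytes and str implicitly, so the bug surfaces
# at the point of mixing instead of somewhere far downstream.
try:
    greeting = "hello, " + b"world"   # str + bytes
except TypeError as exc:
    print(exc)   # e.g. can only concatenate str (not "bytes") to str

# Python 2 would have silently coerced here and only failed later,
# once a non-ASCII byte actually showed up.
```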
> In the latest py3 versions we've finally gotten some sugar to tempt people over: asyncio, f-strings, dataclasses, and type annotations. Still not exactly compelling, but at least something to encourage the average Joe to put in all the effort.
That's because until 2015 all Python 2.7 features came from Python 3. Python 2.7 was basically Python 3 without the incompatible changes. Once they stopped backporting features in 2015, suddenly Python 3 started looking more attractive.
> In python 3 it always blows up when you mix bytes with text so you can catch the issue early on.
Sometimes you don't care about weird characters being printed as weird things. In Python 2 it works fine: you receive garbage, you pass garbage. In Python 3 it shuts down your application with a backtrace.
Dealing with this was one of my first Python experiences and it was very frustrating, because I realized that simply using #!/usr/bin/python2 would solve my problem but people wanted python3 just because it was fancier. So we played a lot of whack-a-mole to make it not explode regardless of the input. And the documentation was particularly horrible regarding that, not even the experienced pythoners knew how to deal with it properly.
This is definitely the case. I've been wrestling with bytes and strings all the time during the port of a Django application to Python 3 for a customer. I can see myself encoding and decoding response bodies and JSON for the time being. For reasons I didn't investigate, I don't have to do that with projects in Ruby and Elixir. It seems everything is a string there, and yet they work.
Not that I've seen.
Example of where Python 3 has rained shit on my parade: I wrote a program that backs up files on Linux. It works fine in Python 2, but in Python 3 you rapidly learn you must treat filenames as bytes, otherwise your backup program blows up on valid Linux filenames. It's not just decoding errors; it's worse, because not every byte string corresponds to a unique Unicode string, so the round trip (binary -> string -> binary) is not guaranteed to give you back the same binary. If you make the mistake of using that route (which Python 3 does by default), then one day Python 3 will tell you it can't open a file you os.listdir()'d microseconds ago and can clearly see is still there.
Later, you get some sort of error when handling one of those filenames, so you sys.stderr.write('%s: this file has an error' % (filename,)). That worked just fine in Python 2, but in Python 3 it generates crappy-looking error messages even for good filenames. You can't just decode the filename to a string, because that might raise a coding error. Writing bytes directly works (sys.stderr.buffer.write(b'%s: this file has an error' % (filename,))), but then you find you've interpolated other strings into your error messages, and soon the only "sane" thing to do is to convert every string in your program to bytes. Other solutions, like sys.stderr.write('%s: this file has an error' % (filename.decode(errors='ignore'),)), corrupt the filename the user sees and are verbose; worst of all, if you forget one, it isn't caught by unit tests but will still blow up your program in rare instances.
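For what it's worth, the mechanism behind this is the surrogateescape error handler from PEP 383 (what os.fsdecode/os.fsencode use under the hood). A small sketch of the lossless round trip and the formatting problem, with made-up filename bytes:

```python
import sys

raw = b"caf\xe9.txt"   # a valid Linux filename that is not valid UTF-8

# Python 3 decodes filenames with errors='surrogateescape', so the
# bytes round-trip losslessly through str:
name = raw.decode("utf-8", errors="surrogateescape")   # 'caf\udce9.txt'
assert name.encode("utf-8", errors="surrogateescape") == raw

# ...but the lone surrogate means the string can't be encoded normally,
# which is the failure mode described above when formatting messages:
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    pass   # blows up on this filename, fine on ASCII ones

# One compromise: show escape sequences instead of corrupting or crashing.
printable = raw.decode("utf-8", errors="backslashreplace")
sys.stderr.write("%s: this file has an error\n" % printable)
```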
I realise that for people who live in a land of clearly delineated text and binary, such as the Django user posting here, these issues never arise, and the clear delineation between text and bytes is a bonus. But people who use Python 2 as a better bash scripting language than bash don't live in that world. For them Python 2 was a better scripting language than bash, but it is being deprecated in favour of Python 3, which is actually more fragile than bash for their use case. (That's a pretty impressive "accomplishment".) Perhaps they will go back to Perl or something, because as it stands Python 3 isn't a good replacement.
Not always. As far as I can tell, writing garbage bytes to various APIs works fine unless they explicitly try to handle encoding issues. The first time I noticed encoding issues in my code was when writing an XML structure failed on Windows, all because of an umlaut in an error message I couldn't care less about. The solution was to simply kill any non-ASCII character in the string; not a nice or clean solution, but the issue wasn't worth more effort.
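The blunt fix described is a one-liner (the message string here is invented for illustration):

```python
msg = "Fehler: ungültiger Wert"   # error message containing an umlaut

# Kill any non-ASCII character before handing the string to a fussy API.
# Not nice, not clean, but it stops the encoding crashes.
ascii_only = msg.encode("ascii", errors="ignore").decode("ascii")
print(ascii_only)   # Fehler: ungltiger Wert
```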
> In python 3 it always blows up when you mix bytes with text so you can catch the issue early on.
That is nice if your job involves dealing with unicode issues. My job doesn't, any time I have to deal with it despite that is time wasted.
We're talking about simple scripts, the solution is to not send in invalid characters.
Personally, asyncio and type annotations are a big turnoff. I know this is a bit contrarian, but I've always favored the greenlet/gevent approach to doing cooperative multitasking. Asyncio (née Twisted) had a large number of detractors, but now that the red/blue approach has been blessed, it seems like many are just swallowing their bile and using it.
Type annotations really chafe because they seem so unpythonic. I like using Python for its dynamicity and for the clean, simple code. Type annotations feel like an alien invader and make code much more tedious to read. If I want static typing, I'll use a statically typed language.
No one wants to spend energy re-programming to stay in place.
Especially APIs.
- run 2to3
- spend 2h max fixing any failing tests
- shake out any remaining issues in a few days of beta testing, like you'd do for any new release
Now no doubt Python 2.7 is an excellent and solid release and will remain so for as long as anyone keeps the bitrot in check, but to keep using it because porting is 'hard' is patent bs.

https://www.mercurial-scm.org/repo/hg/log?rev=py3&revcount=2...
They've been porting hg to Python 3 for the last 10 years and are only now nearing completion.
I've written a bit more about this in Lobsters:
https://lobste.rs/s/3vkmm8/why_i_can_t_remove_python_2_from_...
The only real killer feature of Python 3 is the async programming model. Unfortunately, the standard library version is numbingly complex. (Curio is far easier to follow, but doesn't appear to have a future.)
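For contrast, the happy path of the stdlib model is short; the complexity complaint is about the full API surface (event loops, futures, transports, protocols), not a minimal case like this sketch:

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)   # stand-in for real I/O
    return f"{name} done"

async def main():
    # Run two coroutines concurrently and collect their results.
    return await asyncio.gather(fetch("a", 0.01), fetch("b", 0.01))

print(asyncio.run(main()))   # ['a done', 'b done']
```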
On the down side, switching to Unicode strings is a major hurdle. It mostly "just works", but when it doesn't, it can be difficult to see what's going on. Probably most programmers don't really understand all of the ins and outs. And on top of that, you get weird bugs like this one, which apparently is simply never going to be fixed.
The model is similar to Golang in many ways, e.g. communication using channels [2] and cancellation [3] reminiscent of context.WithTimeout, except that in Golang you need to reify the context passing.
The author has written some insightful commentary on designing async runtimes [4] and is actively developing the library, so I'm optimistic about its future. There were plans to use it for requests v3 until the fundraiser fiasco [5].
[0] https://github.com/python-trio/trio
[1] https://vorpus.org/blog/announcing-trio/
[2] https://trio.readthedocs.io/en/stable/reference-core.html#us...
[3] https://trio.readthedocs.io/en/latest/reference-core.html#ca...
[4] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
[5] https://vorpus.org/blog/why-im-not-collaborating-with-kennet...
The link to support requests (which is a great piece of software) is here:
https://cash.app/$KennethReitz
Note: This is NOT a charitable donation, it is a gift to an individual. These are not tax deductible under US law.
Njs has a long attacking blog post saying this needs to go through the PSF (huh?) and that they should be getting most of this money, not the person the funds were directed towards (it's not clear how much they've actually contributed to requests over time). This supposedly may also trigger folks who have suffered from "gaslighting".
Supporting the developer of a piece of software does not, as far as I know, require that they sign up to handle it on a charitable basis. A big to-do is made about the "large" amount raised. The amount is $33K. To be frank, that is almost zero in tech land, at least in the Bay Area, and requests is a very highly used project. I was literally expecting something like $300K or even $1M; silly Kickstarter projects raise far more and deliver nothing. Requests has already delivered a lot of utility.
Just a bit of perspective from someone who wasn't familiar with this "fiasco".
Dropbox invested three years of work, actually hired Python's creator, and are still not done. What are they getting out of it that they wouldn't have gotten if Python2 simply had been maintained?
Who wants to break old SQL? Nobody.
Yes, it's expensive to upgrade from Python 2 to Python 3, but it's also expensive for the Python project to maintain two versions of Python indefinitely. If someone other than the core Python team wants to step up and maintain Python 2, they are free to do so; it's open source. But failing that, expecting the Python team to support the older, less functional version of the code indefinitely is unrealistic. Corporate-owned languages have even shorter lifecycles for exactly this reason.
I understand that this is one of the major features, but I personally never saw the appeal, given that gevent exists and in my experience works well most of the time. It also allows me to multiplex IO operations and doesn't rely on new syntax. I'm probably missing something?
- mandatory keyword arguments
- multi-dict splatting
- nicer yield semantics for generators
- Fixing system-specific encoding ambiguities
- dataclasses
- inline type annotations
- better metaclass support
- more introspection tooling
- pathlib (for nicer path handling)
- mocking pulled into the standard library in a cleaner way
- stable ABIs for extensions
- secrets handling
- ellipsis instead of pass (yeah who cares but I care)
- lots of standard lib API cleanup
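A few of the items above in one illustrative snippet (Python 3.7+; the names are invented for the example):

```python
from dataclasses import dataclass
from pathlib import Path

# Mandatory keyword arguments: callers must spell out 'retries'.
def request(url, *, retries=3):
    return (url, retries)

assert request("http://example", retries=1) == ("http://example", 1)

# Multi-dict splatting: later dicts win on key conflicts.
defaults = {"host": "localhost", "port": 80}
config = {**defaults, **{"port": 8080}}
assert config == {"host": "localhost", "port": 8080}

# Dataclasses: __init__, __repr__, and __eq__ for free.
@dataclass
class Point:
    x: int
    y: int

assert Point(1, 2) == Point(1, 2)

# pathlib: composable, readable path handling.
p = Path("/var/backups") / "2019.tar.gz"
assert p.suffix == ".gz"
```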
All of this is very helpful for making clean applications. But I would say it's _very_ helpful for making good libraries as well. This stuff is about having a strong language foundation to avoid plain weirdness like the click issue.
Obviously it doesn't kill all of them, but there used to be even more of that kind of thing all the time. Library issues would basically get exported to their users, all essentially due to language problems.
pd.read_excel(filepath) will read an entire dataset even if it contains unicode characters.
pd.ExcelFile() silently drops(!!) unicode rows. The resulting object will simply skip unicode-containing rows (in ANY column) without even a warning.
For example, if you had an excel file:
word
---
"hello"
"hello"
你早
你早
"hello"
then pd.read_excel() would give you a dataframe with 5 rows. ExcelFile() on the other hand would return (silently!) a dataframe with only the first two and the last row.
Maybe this is a pandas issue, not a python issue, but it was really horrendous to debug for such a long time only to realize this was the issue.
I understand why it's the way it is, but when it comes to the typical unixy things I need to do (shuffling files around, tar'ing stuff, etc.), it definitely trips me up more than I'd wish.
Migrating from Python 2 to Python 3 is way worse than that -- code changes are required, and because Python is a dynamic language you may not notice bugs until you actually run the code (or even worse, until after you release it to production and some code branch that is rarely invoked somehow gets called...). In other words, the tooling and the type system are not confidence-inspiring and it's really hard to verify that you migrated without breaking stuff.
At a certain point, this sort of compatibility/forward motion of a codebase through big language revisions has to be designed into the language itself. Either the work can be broken into small enough chunks to chew through in pieces (updating one submodule to the new language version without affecting anything else), or the change is completely transparent to the code being run (as happens with C compilers across different standards), or there is a version-to-version automated rewriting mechanism so reliable that its output is never in question (tools like Go's gofix). Python, in my opinion, has only partial solutions to all of these, so it turns into a lot of hand work.
So while there are other languages that may do other things better, there is still a class of programs that are very effective to write in Python, and that's plenty enough reason to keep it around. Do not forget that Python 2 was released in 2000 and Python 3 almost a decade later. On that time scale, many people don't worry about the next release at all; those who do start considering other languages, because it's important to them.
Besides Java and Python already discussed, another big mess of a transition was from Qt 4 to Qt 5, where all the strings became unicode.
Early Python 3 was hell for conversion. The syntax was changed for no good reason. u'word' became illegal. (That later went back in.) The "2 to 3 converter" was a joke. I didn't have the "print statement problem" because my code called a logging function for all debug output.
Many of the P3 libraries didn't work. (The all-Python MySQL connector failed the first time I tried to do a bulk load bigger than a megabyte, indicating that nobody was using it.) It took years before the libraries were cleaned up.
Python 3 got some really weird features, such as type declarations that don't do anything. I can see having type declarations, especially for parameters, but they need to be used both for checking and optimization. CPython boxes everything, which is terrible for numerics and is why most serious math has to be done in C libraries. My comment on that was "Stop him before he kills again."
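To be concrete about "don't do anything": CPython stores annotations but never enforces or optimizes on them at runtime; checking is left to external tools like mypy. A tiny demonstration:

```python
def double(x: int) -> int:
    return x * 2

# The annotations are recorded...
assert double.__annotations__ == {"x": int, "return": int}

# ...but nothing checks them: passing a str "works" by repetition.
assert double("ab") == "abab"
```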
It did, but in a way that chainsaws support sculpting just fine. Technically possible. Very advanced people will know how to handle it. Everybody else is just going to injure themselves randomly.
Most people writing py2 got their text/binary processing working by accident. Things appear to work until you throw actual Unicode into the parameters, and then nobody knows what happens. There are a number of "what does this decoding exception mean" questions on Stack Overflow every day. They're often actual bugs people could ignore before. Now they're surfaced immediately, and I believe that's better.
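The canonical shape of those Stack Overflow questions, for reference (the bytes here are invented for illustration):

```python
data = b"caf\xe9"   # latin-1 bytes handed to code that assumes UTF-8

try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    # The exception means the bytes were never UTF-8 to begin with;
    # in py2 this mismatch could pass through unnoticed for years.
    print(exc)

# Decoding with the codec the data actually uses works fine:
assert data.decode("latin-1") == "café"
```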
It didn't help that Py2's IDLE had (has? I don't recall them actually resolving this, just closing the issue) a major bug [1] where even if you explicitly use u-literals (a = u'日本語'), the text will still be encoded in your locale (shift_JIS [2] in the Japanese case) instead of unicode/utf-8. You can imagine how confused people got when they tested py2's unicode support in IDLE and saw this.
Or so you wish, it's not necessarily true though. It's just as likely to pass through gibberish without blowing up.
I have a tiny relay service written in Django that lets me pass messages between my phone and home computer, that I recently upgraded both the python and Django version. The service is only two views of about 3 lines each - and a unicode conversion bug crept in such that it stored "b'text'" in the database instead of "text". No warnings, no errors.
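That particular bug is easy to reproduce: str() on bytes gives you the repr rather than decoding, so a stray conversion quietly pollutes stored data (a sketch, not the actual Django code):

```python
body = b"text"   # e.g. a raw request body

# Calling str() on bytes does NOT decode; it produces the repr,
# which is exactly how "b'text'" ends up in a database column.
assert str(body) == "b'text'"

# Decoding explicitly gives the value you actually wanted to store.
assert body.decode("utf-8") == "text"
```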
Some are quite good and finely detailed, in my opinion. It's really nothing like what you'd expect after hearing "chainsaw". There actually are small chainsaws, maybe even one-handed ones, for doing exactly that.
Case in point: I worked on a project using Ruby. When we migrated from Ruby 2.4.0 to 2.4.6 (yeah, a patch-level upgrade), it broke spectacularly. Trying multiple Ruby versions, the change had actually been introduced in Ruby 2.4.1. After some investigation, a change in the stdlib Net::HTTP library had broken a dependency of a dependency. The fix was just one line of code (we only needed to change the adapter used for HTTP communication), but it was two days of work for a minor upgrade.
My current job tried to migrate from Java 8 to Java 11. It also broke multiple services. This one is still in progress, months later.
Python 2 to Python 3 is bigger than both of those version changes (however it is equivalent to Ruby 1.8 to 1.9 changes), so yeah, it does take more time. And like some projects that are forever running Ruby 1.8 or Java 8 (or even worse, Java 6), we will have projects forever running Python 2 too.
According to my highly unscientific survey of the packages in Gentoo's package repo, there are roughly:
- 2500 packages that work with Python 2 or 3
- 1350 packages that work with Python 2 only
- 350 that work with Python 3 only
My methodology:
- 3122 Python 3 only
- 88 Dual support
- 8 Py2 leaf (standalone packages; may be dropped)
- 77 Not ported (will be dropped unless ported)
- 100 Blocked (require 1 or more "not ported" packages)
- 18 Legacy (will be dropped)
Note that py3only/dual-support only reflects how it is packaged in Fedora, not what upstream provides.
For the same reason why migration to IPv6 is taking so long.
Neither technology solves an immediate problem that end users are facing. Instead, they solve 'nice to fix' problems that few people care about.
I work in an industry where there is basically one 800lb gorilla of a vendor. They update rarely, because their product is a mission-critical, life-or-death sort of thing. Their current product is heavily, heavily integrated with x.y.z version of software from a different vendor in a different segment, but also weighing in at 800lb. Yes, they specify x.y.z, not just x or even x.y. That software comes bundled with a Python 2.7.5 distribution.
Imagine my woes trying to get pip running, which unhelpfully suggests I upgrade Python. Cannot seem to find any other path to even get pip going because of what I call the "lol just upgrade n00b" factor. Perhaps that information once existed but I cannot find it.
So, I am stuck on this version because of some pretty tight integration, at a couple of removes. I think the vendor-linkage can cause some "drag" that folks who work in a greenfield environment might not be thinking about. It can be unfortunate but there it is.
If it can help you: the trick I use is to install a normal Python 2.7 interpreter with pip. Then you can use it to install software into any directory, including the one belonging to the other application. There are flags to specify what to install, from where to where, internet or not; something like:

pip install packagename --target=/to/app/lib

Considering all the stuff that is written in Py2, I really don't see it being out-and-out abandoned. That wouldn't really make any sense. With computer languages, stuff never goes away.
$ python3
Python 3.7.4 (default, Sep 7 2019, 18:27:02)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>>
[2]+ Stopped python3
$ lsof -c Python | sed -n "/fortran/s/$USER/<redacted>/gp"
Python 35190 <redacted> txt REG 1,4 1550456 12887664541 /Users/<redacted>/Library/Python/3.7/lib/python/site-packages/numpy/.dylibs/libgfortran.3.dylib

It was good in its time, and great things were done in it that are still around... but let's move on to F90 already.
I say this in part because comedy, but also because it was anticipated to be a long project. It was originally called "Python 3000".
And you can't do it gradually, so it's all-or-nothing. (yes, "six" exists, but you still execute one way or another)
And you'll have to change the versions of all your libraries, which is not usually a smooth experience in the Python ecosystem. (this is another place where it's "all or nothing", since six can't help you if your dependencies don't all use it + use it correctly)
---
It's a huge risk with huge cost for already-working, running code. For new stuff, sure, write it in 3, but 2.7 works fine and has the added benefit of being very well understood by this point.
I still haven't forgiven them for killing the print statement, which could have peacefully coexisted with a print() function.
The migration is financially negative in the short term, and very clearly so. It might be financially positive over the long term (due to easier maintenance and higher performance), but that is a definite maybe. Especially for an app that is otherwise very stable.
If you have a hole it's hard to dig yourself out of it. This is why I prefer modular apps instead of monolithic codebases. You can upgrade piece by piece. Otherwise it's all or nothing and dangerous
I think it is true (as of pretty recently) that Red Hat is the only company employing a Python core dev to work on Python core dev stuff full time (see https://discuss.python.org/t/official-list-of-core-developer...). But the core dev team is focused on Python 3, so that isn't a sign of Red Hat's Python 2 commitment either.
That means they'll patch Python 2 should vulnerabilities be found on their OS.
https://en.wikipedia.org/wiki/Red_Hat_Enterprise_Linux#Versi...
RHEL 6/7 and Centos 6/7 will support Python 2 until at least mid-2024.
However, barring speed improvements, there isn't much to offer apart from unicode, f-strings, and annotations.
If python 3 had proper multithreading, that might have been worth breaking backwards compatibility for.
I have a lot of Python 2.7 code that I wrote years ago which has been running smoothly and my team is generally going to rewrite rather than "convert" because I really don't trust conversions. I'd rather see all bugs upfront rather than hidden in the fog.
A lot of my code is performance critical, and, for example, I'm still salty about dictionary operations taking O(log(n)). But the proliferation of active minor versions makes it very difficult to write portable, performant code.
It's become a sticky wicket. I want to migrate to Python 3 (and, by and large, I have in most of my projects). But what version do I target? Will my dependencies make the same choice? Or does "migration" turn into a sisyphean task? It's becoming burdensome enough that I'm contemplating abandoning the language for something more stable.
Current version is 3.7. If you expect your migration work to take a year, you should consider going for 3.7 and above only, because the previous minor versions will be dropped by the time you're done.
And fwiw "3.7 is the current version" doesn't help my users.
Migration in interpreted languages that implement major breaking changes is really tedious.
That’s the reason I am so upset with today’s JavaScript ecosystem - things move so fast that good technology is being deprecated and changed constantly which breaks all kinds of things in other places.
How can we expect Python 3 to become the default if Python 2 still asserts such dominance?
In my archlinux installation, python resolves to 3, and I have to use python2 if I want 2
I've been meaning to dig into Maya, Houdini, Nuke's Python 3 transition plans. I know Houdini will offer a Python 3 option with Houdini 18 (shipping in the next month or so).
I don't think the reason was because of downstream users. Python 3 was an inevitable change. Previously, they swapped out PyQt for PySide which wasn't a forced change, but required everyone to update their Python scripts.
So much effort wasted doing this in a large codebase. And what do you get for it? It’s just not worth it. Nobody actually needs Python 3, it was foisted on them by the developers. What everyone really wanted was Python 2.8.
I think many people underestimate the challenge that the 2 to 3 migration presents for large enterprises. The core issue is that even though the migration for any given module is normally really easy, the total effort required to migrate is still essentially O(n) in module count/file count, because even with current tooling you still need to have an engineer look at every module to do the change safely. Even if it only takes ~5 minutes per module to make the changes and validate that it works correctly, this becomes a giant undertaking when you have tens of thousands of files to migrate.
The fact that it takes a long time also creates other problems. Your business isn't going to hit "pause" on other development, so there will be changes constantly introduced into modules you've already "swept". It's going to be hard to make sure 100% of your engineers and code reviewers are knowledgeable about the specific requirements to make sure the code works in both 2 and 3, so you would really like some automated safeguards to make sure they don't introduce anything that won't work in 3. Pylint helps with this, but won't catch everything. Unit tests are obviously essential, but:
1. Even a well-tested project won't have tests that cover 100% of code paths and behavior.
2. You're stuck running the tests on both python2 and python3 for the duration of the migration, which doubles the resource (compute, memory, etc.) cost of your Python CI and regression testing infrastructure for the duration of the migration.
Most big companies have passionate Python advocates who really want to be on Python 3, but the scale of the problem and the lack of tooling to tackle it with a sub-O(n) amount of effort make the overall project risky and expensive for the business.
The unicode switch is a nightmare in terms of having to go through and double/triple check everything and still get it wrong half the time. Particularly when it comes to moving data over the network.
The big selling point for Python3 finally came with the built-in async support, but we've been using Twisted for a decade, which works nearly identically, so even that wasn't a huge draw for us.
Further, many of our dependencies were python2-only up until the last year or two.
Really the only reason we're going through the effort right now is that Python2 is rapidly approaching End of life.
This doesn't require parallel testing. These all improve the quality of 2.x code even if you never make the leap to 3.x.
Once this is done you can use 2to3 to mechanically fix the remaining differences. Anything else that remains broken can be special-cased in the 2.7 code until 2to3 works without intervention.
That's why six and manual changes are always needed...
It's a great comment otherwise.
The problem we have where I work is some very clever 2.7 code that isn't easy to redo in Python 3. For any new project I do, I use Python 3.
No, it could not. Python itself is a C executable, which makes the distinction moot.
For some reason, a lot of people seem to be laboring under the impression that Python 2 code is just going to stop working in 2020. The only thing stopping is the Python core team's bug-fix releases. Python 2 itself will continue to exist. Existing installations will keep working. Linux distributions _can_ choose to keep Python 2 in their repositories and maintain it separately going forward, although they are not likely to. Ubuntu, Red Hat, and other OS providers all have operating systems which include Python 2 that they are contractually obligated to support and patch for years in the future. And of course, the source code for Python 2 will never just up and disappear within our lifetimes unless human civilization does as well.
As for businesses, if your application is mission-critical and you want to keep it going, then you get to decide whether to invest in keeping your application current with the state of the art, or invest in keeping the application's environment static. This means having a reliable source of the required hardware, archived copies of the OS, all dependencies and libraries, and the application itself. And presumably you still need someone knowledgeable enough to fix bugs in the stack from time to time.
EDIT: Personally, while I find older Windows interfaces ugly, for example, they were very consistent and functional. In modern designs I sometimes can hardly find what is clickable/actionable. That's not the interface working for me, but the other way around.
No, users most definitely do not care about Material Design. They only care about being able to quickly do the task the app or web site claims to allow them to do.