Now with Python 3 it's the other way around -- Windows is fine but Linux has issues. However, the problems on Linux are less severe: often you can get away with assuming that everything is UTF-8. And you can still work with bytes if you absolutely need to.
No, they're not. Windows can't magically send your program Unicode. It sends your program strings of bytes, which your program interprets as Unicode with the UTF-16 encoding. The actual raw data your program is being sent by Windows is still strings of bytes.
> you can still work with bytes if you absolutely need to
In your own code, yes, you can, but you can't tell the Standard Library to treat sys.std{in|out|err} as bytes, or fix their encodings (at least, not until Python 3.7, when you can do the latter), when it incorrectly detects the encoding of whatever Unicode the system is sending/receiving to/from them.
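For reference, the 3.7 fix I mean is `TextIOWrapper.reconfigure()`. A minimal sketch, using an in-memory stream as a stand-in for a stdout whose encoding was detected as ASCII (the `raw` and `stream` names are just for illustration; in a real program the call would be `sys.stdout.reconfigure(encoding="utf-8")`):

```python
import io

# Stand-in for a stdout that Python decided was ASCII-only.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="ascii")

# Python 3.7+: change the encoding of an already-open text stream.
stream.reconfigure(encoding="utf-8")

# Under the original ASCII encoding this write would raise UnicodeEncodeError.
stream.write("naïve café")
stream.flush()
```

Before 3.7 the usual workaround was to wrap the stream's underlying buffer in a new `io.TextIOWrapper` yourself.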
> AFAIK neither sys.argv nor sys.environ had a unicode-supporting alternative in Python 2)
That's because none was needed. You got strings of bytes and you could decode them to whatever you wanted, if you knew the encoding and wanted to work with them as Unicode. That's exactly what a language/library should do when it can't rely on a particular encoding or on detecting the encoding--work with the lowest common denominator, which is strings of bytes.
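The closest Python 3 equivalent of "decode it yourself" that I know of is to turn the already-decoded string back into bytes with `os.fsencode()` and decode again with the encoding you actually want. A rough sketch, assuming a POSIX system where Python decodes arguments with the `surrogateescape` error handler; `redecode` is a made-up helper name, and the Latin-1 input is simulated rather than read from a real `sys.argv`:

```python
import os


def redecode(arg: str, encoding: str) -> str:
    """Recover the original bytes behind `arg` and decode them explicitly."""
    return os.fsencode(arg).decode(encoding)


# Simulate an argument that arrived as Latin-1 bytes.
original_bytes = "café".encode("latin-1")

# What Python 3 would hand you in sys.argv on POSIX (assumption: the
# filesystem encoding uses surrogateescape, so undecodable bytes survive).
as_seen_by_python = os.fsdecode(original_bytes)

fixed = redecode(as_seen_by_python, "latin-1")
```

It works, but you only get there by undoing a decoding step you never asked for.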
Actually, you can: use sys.std{in,out,err}.buffer, which is binary [1].
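A minimal sketch of what that looks like. `write_raw` is a hypothetical helper name, and the in-memory stream is only there to make the example checkable; with the default argument it writes to the real `sys.stdout`:

```python
import io
import sys


def write_raw(data: bytes, stream=None) -> None:
    """Write bytes to a text stream's underlying binary buffer."""
    stream = sys.stdout if stream is None else stream
    stream.flush()             # flush pending text so output order is preserved
    stream.buffer.write(data)  # bypasses the text layer and its encoding
    stream.buffer.flush()


# Stand-in for stdout with an ASCII text layer.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="ascii")

# These bytes are not valid ASCII, but the binary buffer accepts them as-is.
write_raw(b"\xff\xfe raw bytes\n", fake_stdout)
```

One caveat: if something has replaced `sys.stdout` with an object that has no `.buffer` (an `io.StringIO`, for example), this raises `AttributeError`.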
> or fix their encodings (at least, not until Python 3.7, when you can do the latter), when it incorrectly detects the encoding of whatever Unicode the system is sending/receiving to/from them.
I'm assuming you're talking about the scenario where LANG/LC_* was not defined, in which case Python assumed US-ASCII encoding. I think in 3.7 they changed the default to UTF-8.
(IIRC it used to do that in 3.0, but they backtracked very quickly - and 3.0 was effectively treated as a beta by the community at large, anyway.)