* NUL-terminated strings (and now, non-UTF-8 encoded strings on input/output)
* Using LF, CR, or CRLF as line terminators, and pipe- or comma-delimited fields, when other unambiguous ASCII characters (e.g., GS, FS, RS) could have been used. That would have made encoding/decoding of line termination an I/O concern and kept HT/VT/CR/LF/FF as purely print-related codes.
UTF-8 on stdin/stdout works perfectly fine (unless you are on Windows, of course, which is stuck in the early 90s when it comes to international text encoding).
> Using LF or CR or CRLF as line terminators
This is also an operating system convention, and it would be better if programming languages didn't try to "guess" the correct line endings, since this causes more problems than it solves. But again, this is mostly a Windows-specific problem, and it's Microsoft's job to finally bring Windows into the current century.
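For a concrete picture of the "guessing" being described, here is a minimal Python sketch. It uses the standard `io.TextIOWrapper` with its `newline` parameter; the sample bytes and the `read_text` helper are made up for illustration and are not from anyone's real code.

```python
import io

# Hypothetical sample mixing all three conventions: CRLF, CR, and LF.
raw = "one\r\ntwo\rthree\n".encode("utf-8")

def read_text(data, newline):
    """Decode bytes as UTF-8 text using the given newline-handling mode."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=newline)
    return wrapper.read()

# newline=None is the default "universal newlines" mode: CR and CRLF become LF.
translated = read_text(raw, None)

# newline="" turns translation off, so the original terminators survive.
untouched = read_text(raw, "")

print(repr(translated))
print(repr(untouched))
```

With translation on, the program can no longer tell which convention the input used; with it off, that decision stays with the caller.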
Unix used LF, Apple used CR, Microsoft used CRLF.
They are all ASCII carriage-movement codes, meant for driving the paper feed and print head of an ASR-33 or equivalent.
So they all made the "wrong" decision about what to store in a file.
They just chose different wrong characters.
Apple hasn't been using CR since the release of OS X (26 years ago). Microsoft could have made the switch at any time too (just as they could have switched to UTF-8 as the universal text encoding on Windows); they just chose not to.
In the end it's not the job of programming languages to clean up Microsoft's mess ;)
Why is it Microsoft's fault? They just stayed on their legacy implementation, while Linux and Apple chose to move from the legacy implementation to another legacy implementation. That seems dumb.
Unix followed Multics. Multics chose right. ASCII/ECMA-6/ISO 646 drafts discussed this at least as early as 1963¹: “For equipment which uses a single combination (called New Line) [...] NL will be coded at FE₂ [Format Effector 2 = 0x0A].”
¹ doi/10.1093/comjnl/7.3.197
I suppose you could document that it's unsupported, and just drop or reject such values, but then the system couldn't be used to handle test data for such systems, for example.
Last time I had to handle CSV files in bash, I converted them internally to RS and FS.
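A rough sketch of that kind of conversion, in Python rather than bash. The comment doesn't say which byte played which role, so this assumes RS (0x1E) terminates each record and FS (0x1C) separates fields. The function names and sample data are hypothetical.

```python
import csv
import io

RS = "\x1e"  # ASCII record separator, assumed to end each record
FS = "\x1c"  # ASCII file separator, assumed here to delimit fields

def csv_to_separated(text):
    """Parse CSV text and re-encode it with FS between fields and RS after each record."""
    rows = csv.reader(io.StringIO(text, newline=""))
    return "".join(FS.join(row) + RS for row in rows)

def separated_to_rows(blob):
    """Split the RS/FS encoding back into rows; no quoting or escaping is needed."""
    return [record.split(FS) for record in blob.split(RS)[:-1]]

# Hypothetical input with an embedded comma and an embedded newline.
sample = 'name,note\r\n"Smith, J","line one\nline two"\r\n'
encoded = csv_to_separated(sample)
rows = separated_to_rows(encoded)
print(rows)
```

Once converted, embedded commas and newlines are ordinary data. The catch is the one raised above: a field that itself contains RS or FS would break this sketch, so real input would have to be checked or rejected first.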
NL Next line (from EBCDIC?)
LS Line separator (invented by Unicode)
PS Paragraph separator (same)
The Unicode standard says that in addition to CR, LF, CRLF and the above, vertical tabs and form feeds should also be treated as line separators.
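As an illustration, Python's built-in `str.splitlines` recognizes every separator mentioned here. In the sketch below the `SEPARATORS` table is just a label-to-character mapping made up for the demo, with "NEL" standing for the "NL" (next line, U+0085) listed above.

```python
# Label -> separator, for each line boundary mentioned in the thread.
SEPARATORS = {
    "LF": "\n",
    "CR": "\r",
    "CRLF": "\r\n",
    "VT": "\x0b",
    "FF": "\x0c",
    "NEL": "\x85",
    "LS": "\u2028",
    "PS": "\u2029",
}

def split_each():
    """Split the string 'a<sep>b' for every separator and collect the results."""
    return {name: ("a" + sep + "b").splitlines() for name, sep in SEPARATORS.items()}

results = split_each()
for name, parts in results.items():
    print(name, parts)
```

Note that `splitlines` also breaks on the ASCII FS, GS, and RS characters, which goes beyond the list above. A plain `split("\n")` recognizes only LF.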
I would just use UTF-8 everywhere.