comment: https://news.ycombinator.com/item?id=26305052
comment: https://news.ycombinator.com/item?id=39679662
"ASCII Delimited Text – Not CSV or Tab Delimited Text" post [2014]: https://news.ycombinator.com/item?id=7474600
same post [2024]: https://news.ycombinator.com/item?id=42100499
comment: https://news.ycombinator.com/item?id=15440801
(...and many more.) "This comes up every single time someone mentions CSV. Without fail." - top reply from burntsushi in that last link, and it remains as true today as in 2017 :D
You're not wrong though, we just need some major text editor to get the ball rolling and start making some attempts to understand these characters, and the rest will follow suit. We're kinda stuck at a local optimum which is clearly not ideal but also not troublesome enough to easily drum up wide support for ADSV (ASCII Delimiter Separated Values).
Many text editors offer extensions APIs, including Vim, Emacs, Notepad++. But the ideal behavior would be to auto-align record separators and treat unit separators as a special kind of newline. That would allow the file to actually look like a table within the text editor. Input record separator as shift+space and unit separator as shift+enter.
1. the field separator to be shown as a special character
2. the row separator to be (optionally) be interpreted as a linefeed
IIRC 1) is true for Notepad++, but not 2).
The characters you mention could be used in a custom delimiter variant of the format, but at that point it's back to a binary machine format.
This, of course, assumes that your input doesn’t include tabs or newlines… because then you’re still stuck with the same problem, just with a different delimiter.
Except the usages of that character will be rare and so potentially way more scary. At least with quotes and commas, the breakages are everywhere so you confront them sooner rather than later.
/s in case :)
You need 4 new keyboard shortcuts. Use ctrl+, ctrl+. ctrl+[ ctrl+] You need 4 new character symbols. You need a bit of new formatting rules. Pretty much page breaks decorated with the new symbols. It's really not that hard.
But, like many problems in tech, the popular advice is "Everyone recognizes the problem and the solution. But, the problematic way is already widely used and the solution is not. Therefore everyone doing anything new should invest in continuing to support the problem forever."
But when looking for a picture to back up my (likely flawed) memory, Google helpfully told me that you can get a record separator character by hitting Ctrl-^ (caret). Who knew?
...There would be datasets that include those characters, and so they wouldn't be as useful for record separators. Look into your heart and know it to be true.
It requires bespoke tools for edition, and while CSV is absolute garbage it can be ingested and produced by most spreadsheet software, as well as databases.