Yeah, and ideally you want backward compatibility, so we don't have to recompile the world or patch things like cat.
But yeah, the root of the problem is that a) a TUI-like application that manually drives the terminal, with cursor movement/line folding/redrawing etc., needs to know, at every single moment, the precise location of the cursor and the coordinates of every single character it has output (to e.g. properly handle backspacing, including over line folds: \b explicitly doesn't move to the previous line by default, which is a very widespread and old convention), and b) getting that coordinate info from the terminal in a quick, reliable, and side-effect-free manner is impossible, so you have to guess.
It could just be the path to a Unix domain socket in an environment variable, where that socket speaks some kind of RPC protocol
If getting the current cursor position in the terminal were as easy (and as fast) as calling GetConsoleScreenBufferInfo, instead of sending "\e[6n" to the terminal and then reading input from it and looking for "\e[(\d+);(\d+)R" inside and then somehow unreading the input before that device report, yeah, that'd be nice and would allow solving a lot of problems with mere brute force. Sadly, it is not, and so most reimplementations of e.g. readline/linenoise functionality in shells and prompts (Erlang's shell went through 2 internal reimplementations just in my time using it, for example) are flying blind, hoping that their implementation of wcwidth(3) matches the terminal's one.
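For what it's worth, the "look for the report inside the input and unread the rest" part can be sketched in a few lines of Python. This is only a rough sketch: `split_cursor_report` is a made-up helper name, and it assumes the whole reply has already arrived in one buffer, which a real terminal does not guarantee.

```python
import re

# Cursor Position Report: ESC [ row ; col R, the reply to ESC [ 6 n.
CPR = re.compile(rb"\x1b\[(\d+);(\d+)R")

def split_cursor_report(buf: bytes):
    """Split a raw input buffer into (other_input, (row, col)).

    Hypothetical helper: returns (buf, None) when no report is present.
    The bytes around the report are ordinary user input that the caller
    still has to process ("unread") in order.
    """
    m = CPR.search(buf)
    if m is None:
        return buf, None
    other = buf[:m.start()] + buf[m.end():]
    return other, (int(m.group(1)), int(m.group(2)))
```

Even in this toy form you can see the cost: a full round trip to the terminal, plus whatever the user typed in the meantime gets mixed into the same byte stream as the answer.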
[0] https://learn.microsoft.com/en-us/windows/console/console-fu...
EDIT: It goes without saying that I think this is worthy and cool. I am just curious about the costs and benefits of such a tool.
If you speak the languages that use those scripts? Then all the time, I imagine. Support for double-width character cells in terminals started to appear all the way back in the late seventies because Japan, you know, existed and kinda mattered.
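To make the width problem concrete, here is a rough Python sketch of the kind of guess a line editor ends up making. `guess_width` is a hypothetical helper, not wcwidth(3) itself; it leans on the standard `unicodedata` tables, ignores control characters, and nothing guarantees that any given terminal agrees with it.

```python
import unicodedata

def guess_width(text: str) -> int:
    """Guess how many terminal columns `text` occupies.

    Assumptions of this sketch: East Asian Wide/Fullwidth characters take
    2 columns, combining marks and format characters take 0, and
    everything else takes 1.
    """
    cols = 0
    for ch in text:
        # Zero-width: combining marks (Mn, Me) and format characters (Cf).
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols
```

With these rules "abc" is 3 columns, "日本語" is 6, and "e" followed by a combining acute accent is 1. The Devanagari syllable "कि" comes out as 2 (a letter plus a spacing vowel sign), and whether a particular terminal draws it in one cell or two is exactly the thing the application cannot see.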
All the languages I am literate in use an alphabet, and I have never encountered anything other than alphabetic scripts in the terminal, and never anything not in English for serious work.
I would think we would probably have far fewer characters with hard-to-determine widths being printed in the terminal (before LLMs), as most of it would be rendered in the GUI, which state-of-the-art terminal emulators somewhat rely on anyway.
My guess is that LLMs made translation for these sorts of tools much easier (just needing someone fluent in both languages to verify rather than translate from scratch) but that's why I am asking. Is it more common now than ever before?
Beyond that, the examples given were of scripts that are widely used in India, which is a country with the world's largest English-speaking population, one of the world's most spoken English dialects, and a huge IT sector.
I get that CJK has an existing double-width carve-out, which the objection linked in the article proposes to keep.
- Everybody just uses English text, right?
- Ok, sometimes there might be some weird accents or something
- Every character is about the same width
- Well, they're all integer numbers of characters wide
- No character is taller than an English I
- Everybody writes left to right
- Everyone writes horizontally
Also https://jeremyhussell.blogspot.com/2017/11/falsehoods-progra...
EDIT: How the hell do you format lists in HN comments?
It can't handle terminal window resize and the layout gets messed up
I was surprised to see that Node-based CLIs work much better with resize.
Does anyone know why?
Most terminal products these days use a widget tree or a virtual DOM.
Things like aider use prompt toolkit and lose the layout when you resize the window. With libraries like ink, or Textual in Python, the whole screen is re-rendered on each change and diffed against the previous frame, so a resize causes no issue.
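A minimal sketch of the "re-render everything, then diff" idea, in Python. This is not how ink or Textual actually implement it; `diff_frames` is a hypothetical name, and a frame here is simply a list of already laid-out lines, one per screen row.

```python
def diff_frames(old: list[str], new: list[str]) -> str:
    """Return the escape sequences that turn screen `old` into screen `new`.

    Assumes each frame is a list of lines already wrapped to the current
    terminal width. Only rows that changed are repainted.
    """
    out = []
    for row, line in enumerate(new):
        if row >= len(old) or old[row] != line:
            # CUP to (row + 1, 1), write the line, then EL to clear the rest of the row.
            out.append(f"\x1b[{row + 1};1H{line}\x1b[K")
    # Rows that existed before but are gone in the new frame get blanked.
    for row in range(len(new), len(old)):
        out.append(f"\x1b[{row + 1};1H\x1b[K")
    return "".join(out)
```

On a resize the application lays the widget tree out again for the new width and calls this with an empty `old` frame (after clearing the screen), so every row is repainted from scratch. Nothing depends on where the previous output happened to wrap, which is presumably why this style of UI survives resizing better.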