Imagine if the default wasn't bash, but something like Ruby + pipes (or some other terse language).
What is the argument for shell scripts not working on typed objects? How much time has been lost, and how many bugs have been created, because every single interaction between shell scripts has to include its own parser? How many versions of "get file created timestamp from ls" do we need?
Something Windows does get right is the clipboard. You place a thing on the clipboard, and when you paste, the receiving program can decide on the best representation. This is why copy-pasting images has worked so magically.
I could see an alternative system where such a mechanism exists for shell programs.
What if pipes worked the same way? What if we added stdout++, stderr++, and stdin++, and when you write to stdout/err++, you can say which format you're writing to, and you can write as many formats as you like. And then you can query stdin++ for which formats are available, and read whichever you like. And if stdin++ is empty, you could even automatically offer it with a single "text" format, that is just stdin(legacy).
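The stdout++/stdin++ idea can be sketched in miniature. None of these names are real APIs; `write_multi` and `read_multi` are invented stand-ins for the producer and consumer sides of the hypothetical protocol:

```python
import json

def write_multi(payload_by_format):
    """Producer side: emit every representation the program can offer,
    keyed by format name. A real system would send this down the pipe."""
    return payload_by_format

def read_multi(payload, preferred):
    """Consumer side: query which formats are available and take the
    first preferred one; fall back to plain text (legacy stdin)."""
    for fmt in preferred:
        if fmt in payload:
            return fmt, payload[fmt]
    return "text", payload["text"]

files = [{"name": "a.log", "size": 120}, {"name": "b.log", "size": 64}]
out = write_multi({
    "json": json.dumps(files),       # structured representation
    "text": "a.log 120\nb.log 64",   # legacy "text" format, always present
})

fmt, data = read_multi(out, preferred=["json"])
```

A structured consumer gets the JSON form; a consumer that only asks for formats the producer doesn't offer silently falls back to text, which is what makes the scheme backwards-compatible.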
The appeal of the Unix text-based approach is a kind of "worse is better". It is so simple and easy, compared to Powershell. The clipboard idea seems like it has a similarly low barrier to entry, and is even kind of backwards-compatible. It seems like something you could add gradually to existing tools, which would solve the marketplace-like chicken-and-egg problem.
You could even start to add new bash syntax, e.g. `structify my.log || filter ip_addr || sort response_size`. (Too bad that `||` already means something else....) Someone should write a thesis about this! :-)
Let's say you're MS writing Word for the Amiga.
You provide a datatypes description for .doc files. This gives the ability to read and write the format, and to fingerprint it (not based on extension).
Now any program, old or new, that wants to read or write .doc files can do so. It's just there.
... at which point you basically have a scripting language, so you could just as well use an existing one (e.g. Ruby).
In PowerShell, if the result of a command is just an object, the object is pretty-printed to the console, which in practice ends up looking pretty much like what a Unix command would have given you.
Compare the output of "df -h" vs the PowerShell equivalent "gdr -psprovider filesystem", for example. One provides the data in dense (easy to follow) rows, while the other spaces it out across the whole screen, leaving large gaps of empty space around some columns while also cutting off data in others. The difference is especially noticeable if you have network shares with long paths.
PowerShell is probably nice for scripting, but I wouldn't want to have it as my shell.
So... no real improvement, then?
Typed objects can make it harder to pipe commands together. How do you grep a tree when the tree is an actual data structure and grep expects a list of items as input? You would need converters: either a specific converter between tree and list, or a generic one: tree->text->list.
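The generic tree->list route can be shown concretely. This is only a sketch: the nested dict stands in for whatever tree structure a typed tool might emit, and `flatten` is an invented converter, not part of any real shell:

```python
def flatten(tree, prefix=""):
    """Generic tree -> list converter: turn nested structure into
    (path, value) rows that a list-oriented filter like grep can consume."""
    items = []
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            items.extend(flatten(value, path))   # recurse into subtrees
        else:
            items.append((path, value))          # leaf becomes one row
    return items

tree = {"etc": {"hosts": "127.0.0.1", "ssh": {"port": 22}}}
rows = flatten(tree)
# A grep-like filter now works on the flattened rows:
matches = [(p, v) for p, v in rows if "ssh" in p]
```

The converter pays the flattening cost once, at the boundary, instead of every consumer reinventing its own tree traversal.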
>Something Windows does get right is the clipboard.
It's useful, but the actual implementation is pretty bad: opaque, prone to security issues, holds only a single item, and cannot be automated.
To be fair, untyped objects also require converters, but at every boundary. That is, instead of having some pipes of the form `program -> mutually agreeable data structure -> program` and some pipes of the form `program -> unacceptable data structure -> parser -> program` (as happens with a typed language), you are guaranteed by a text-based interface always to have pipes of the form `program -> deparser -> text -> parser -> program`.
For example, if you grep the output of ls -l for a file named "1", you'll also get files with 1 in their timestamp. In text land, you have to edit the ls command to get simpler output. In structured land, you could edit your filter: `ls -l | grep name~=1`
You could imagine various structured data filter tools that could be built that wouldn't require modifying the input.
Though in this example you can easily use awk to select the column, wouldn't it be nice not to have to worry about things like escape characters when parsing text?
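The "1" false positive is easy to reproduce. The rows below are made-up ls -l style lines; the point is that a grep-style substring match sees the whole line, while a structured filter (the `name~=1` analogue) tests only the name field:

```python
lines = [
    "-rw-r--r-- 1 al al  42 Jan  1 10:00 1",          # file actually named "1"
    "-rw-r--r-- 1 al al 512 Jan 11 09:30 notes.txt",   # "1" only in the date
]

# Text land: grep-style substring match over the whole line — both match.
text_matches = [l for l in lines if "1" in l]

# Structured land: a (crude) parser extracts the name field once,
# and the filter then tests only that field.
records = [{"name": l.split()[-1]} for l in lines]
field_matches = [r for r in records if r["name"] == "1"]
```

Note that the "structured" version still needed its own parser (`split()[-1]`, which would itself break on filenames with spaces) — which is exactly the per-boundary cost the typed approach is meant to remove.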
ls -> repeated FileDescriptor files; | repeated string names;
Where FileDescriptor is whatever it needs to be, and has all the info ls -l does. You have a hierarchy of outputs: if the next program takes FileDescriptors, you give it FileDescriptors; if it doesn't, you give it strings.
What would go to stdout goes through a toString filter.
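A minimal sketch of that output hierarchy, assuming an invented `FileDescriptor` record (not any real ls structure) and a negotiation step that picks the richest form the consumer accepts:

```python
from dataclasses import dataclass

@dataclass
class FileDescriptor:
    """Invented stand-in for a typed ls record."""
    name: str
    size: int
    mtime: str

    def to_string(self) -> str:
        """The toString filter: the plain-stdout fallback representation."""
        return f"{self.size:>8} {self.mtime} {self.name}"

def emit(files, consumer_accepts):
    """Give FileDescriptors if the next program takes them, else strings."""
    if consumer_accepts == "FileDescriptor":
        return files
    return [f.to_string() for f in files]

fds = [FileDescriptor("a.log", 120, "Jan  1 10:00")]
```

A structured consumer calls `emit(fds, "FileDescriptor")` and gets the typed records; anything else gets the toString form, which is what a human at a terminal would see.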
There is one possible complication, though. The two would need to be reconciled in the same way by everyone who wants to write a shell tool. Given that even fairly simple standards (RSS, HTML, etc.) cause lots of failures to comply, what are the odds of near-universal compliance in a larger and more diverse ecosystem like shell utilities?
Aside: that's what the stat command is for. My big concern with types is how you would make sure that the output of a command will always have the right types. Otherwise you'll have runtime type errors, which would be just as bad as runtime parsing errors.
Parsing errors in shell scripts can easily go unnoticed (until you realize your data is corrupted).
none? 'ls' is for humans. 'stat' is for scripts.
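The stat-vs-ls point, in script form: ask the filesystem for the timestamp directly (here via Python's `os.stat`, the same interface the stat command wraps) instead of parsing human-oriented ls output:

```python
import os
import tempfile
import time

# Create a scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name

# os.stat returns typed fields; no text parsing anywhere.
mtime = os.stat(path).st_mtime                       # float, seconds since epoch
stamp = time.strftime("%Y-%m-%d", time.localtime(mtime))

os.unlink(path)
```

The shell equivalent is the stat utility itself; either way, the timestamp arrives as data, not as a column to be scraped out of ls -l.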