maleldil · 3y ago

That's a useless use of cat. You can use `jq . foo.json | pbcopy` or `jq < foo.json | pbcopy`.

26 comments · 5 top-level

nicky0 · 3y ago · 8 in thread

In what way do you see those alternatives as superior?

gdavisson · 3y ago

It usually doesn't matter much, but there are some situations where it can matter a lot. For one thing, you can't use seek() on a pipe, so e.g. `cat bigfile | tail` has to read through the entire file to find the end, but `tail bigfile` will read the file backward from the end, completely skipping the irrelevant beginning and middle. With `pv bigfile | whatever`, pv (which is basically a pipeline progress indicator) can tell how big the file is and tell you how far through you are as a percentage; with `cat bigfile | pv | whatever`, it has no idea (unless you add a flag to tell it). Also, `cat bigfile | head` will end up killing cat with a SIGPIPE signal after head exits; if you're using something like "Unofficial bash strict mode" [1] (which sets `set -euo pipefail`), this will cause your script to exit prematurely.
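
A minimal sketch of the SIGPIPE point, assuming bash is installed. It uses `yes` as a stand-in for `cat bigfile`, because an endless writer is guaranteed to still be writing when `head` exits. The variable names are illustrative only.

```shell
# Without pipefail, a pipeline's status is that of its last command (head).
without_pipefail=$(bash -c 'yes | head -n 1 >/dev/null; echo $?')

# With pipefail, the writer's failure after head exits becomes the pipeline's status.
with_pipefail=$(bash -c 'set -o pipefail; yes | head -n 1 >/dev/null; echo $?')

echo "without pipefail: $without_pipefail"
echo "with pipefail:    $with_pipefail"
```

The second status is non-zero (commonly 141, i.e. 128 plus SIGPIPE's signal number 13), and that non-zero status is what makes `set -e` abort the script.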

Another sometimes-important difference is that if there are multiple input files, `somecommand file1 file2 file3` can tell what data is coming from which file; with `cat file1 file2 file3 | somecommand` they're all mashed together, and the program has no idea what's coming from where.
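
As a rough illustration of the multiple-file point, here is a sketch using `grep` with two throwaway files (the file names and contents are made up for the example). Given file operands, `grep` can label each match with its source; fed through `cat`, it only sees one anonymous stream.

```shell
dir=$(mktemp -d)
printf 'alpha\n' > "$dir/a.txt"
printf 'needle here\n' > "$dir/b.txt"

# grep knows which file each line came from and prefixes the name.
with_names=$(cd "$dir" && grep needle a.txt b.txt)

# Through cat, the file boundary is gone; only the matching line remains.
mashed=$(cd "$dir" && cat a.txt b.txt | grep needle)

rm -rf "$dir"
echo "$with_names"
echo "$mashed"
```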

In general, though, I think it's mostly a matter of people's expertise level in using the shell. If you're a beginner, it makes sense to learn one very general way to do things (`cat |`), and use it everywhere. But as you gain expertise, you learn other ways of doing it, and will choose the best method for each specific situation. While `cat |` is usually an ok method to read from a file, it's almost never the best method, so expert shell users will almost never use it.

[1] http://redsymbol.net/articles/unofficial-bash-strict-mode/

derefr · 3y ago

If the command is meant to stream through something really fast by using a large buffer size, then prepending a cat(1) will limit the incoming buffer size to ~4k.

vram22 · 3y ago

Interesting.

Maybe use dd with one of its blocksize options, then?

Not at a terminal, can't check.

paulddraper · 3y ago

They avoid an unnecessary invocation of the cat executable.

Instead, they open a file descriptor and pass that.

Tiny difference but there you go.

wpm · 3y ago

I teach shell scripting. Cat invocations are cheap and help learners understand and keep clear where input is coming from, and where it is going. There are no awards or benefits to reducing the number of lines, commands invoked, or finding the shortest possible way to perform a task in a script. There are plenty of detriments to reading and understanding though when we try to obfuscate this to save 1ms of execution time on a script that is going to execute near instantaneously anyways.

In short, I straight up don't care.

adrianmonk · 3y ago

Not just that, but also all the bytes have to go through an extra pipe. Presumably they're copied an extra time because of this.

When you run "cmd < file", the command reads from stdin, which pulls directly from the file. When you do "cat file | cmd", "cat" opens the file, reads from there, and writes to a pipe. Then "cmd" reads from its stdin, which is a pipe.
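
One way to see this difference for yourself is sketched below. It assumes a system that exposes `/dev/stdin` (Linux and macOS do); `stdin_kind` is a hypothetical helper written for this example, not a standard command.

```shell
# Report what kind of object standard input is connected to.
stdin_kind() {
  if [ -p /dev/stdin ]; then
    echo pipe
  elif [ -f /dev/stdin ]; then
    echo file
  else
    echo other
  fi
}

tmp=$(mktemp)
echo hello > "$tmp"

from_redirect=$(stdin_kind < "$tmp")   # stdin is the file itself
from_cat=$(cat "$tmp" | stdin_kind)    # stdin is a pipe fed by cat

rm -f "$tmp"
echo "cmd < file      -> $from_redirect"
echo "cat file | cmd  -> $from_cat"
```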

omginternets · 3y ago

>They avoid an unnecessary invocation of the cat executable.

And ... ?

latexr · 3y ago

To add, searching for “useless use of cat” will yield several results for those interested in learning more. Other examples include “useless use of echo” and “useless use of ls *”.

jdbartee · 3y ago · 7 in thread

Speaking for myself, the first form is more natural, even if it’s a useless cat, because I’m always cat-ing files to see their structure, then progressively tacking on different transforms, and finally sending the result to whatever I want as output.

It’s so ingrained, I’m more likely than not to just write it out that way even when I know exactly what I’m doing from the outset.

jonnycomputer · 3y ago

Yes, this iterative procedure is often why "useless" cats end up in pipelines. It's a very effective way of processing regular text information.

e.g.

I need to grab some info from textfile.txt to use as arguments to a function.

    cat textfile.txt

looks like it's comma delimited.

    cat textfile.txt | cut -d, -f 2-5

ah, it's the third and fourth columns I need

    cat textfile.txt | cut -d, -f 3-4 | grep '123456'

perfect

    cat textfile.txt | cut -d, -f 3-4 | grep 123456 | tr , ' '

    myfunc $(cat textfile.txt | cut -d, -f 3-4 | grep 123456 | tr , ' ')

gumby · 3y ago

> cat textfile.txt

> looks like its comma delimited.

Interesting; why wouldn't you use `head`? Who knows how big textfile.txt is?

jamespullar · 3y ago

I've been using bat as a cat replacement for a while now. It includes paging, syntax highlighting, line numbers, and is generally very performant.

https://github.com/sharkdp/bat

fastaguy88 · 3y ago

As a scientist who cares about reproducibility, the big difference between the "useless cat" and providing the input file name on the command line is that, in the latter case, the program can capture that file name and reproduce it. That is harder when using stdin.

Many of my programs and scripts start output with the line: # cmd arg1 arg2 arg3 ...

and simply echo back lines that start with '#'. That way, I have an internal record of the program that was run and the data file that was read (as well as previous parts of the analysis chain).

And, 'R' ignores lines starting with '#', so the record is there, but does not affect later analyses.
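
A sketch of what such a self-documenting step might look like in shell. `analysis_step`, the sample data, and the upper-casing "analysis" are all invented for illustration; the commenter's actual programs are not shown in the thread.

```shell
# Hypothetical analysis step: upper-cases the data lines of the file named in $1.
analysis_step() {
  echo "# analysis_step $*"         # record this invocation, including the file name
  grep '^#' "$1" || true            # echo back provenance lines from earlier steps
  grep -v '^#' "$1" | tr '[:lower:]' '[:upper:]'
}

tmp=$(mktemp)
printf '# generate --seed 1\nacgt\n' > "$tmp"

result=$(analysis_step "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

Because the step receives the file name as an argument, it can write that name into its own output. Reading the same data from stdin would leave nothing to record.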

paulddraper · 3y ago

You could consider

    < foo.json jq | pbcopy

patrec · 3y ago

If you're using zsh, you can just replace any instance of

    $ cat somefile ...

with

    $ <somefile ...

For bash, this only works if you have at least one `|`.

ddingus · 3y ago

I did this last time I saw it come up and was surprised! Doing it makes perfect sense in hindsight. Neato!

nojs · 3y ago · 4 in thread

The “useless cat” meme needs to die. Everyone is aware that most commands accept a file argument, but looking up the arguments and their ordering is annoying and using cat for things like this is just fine.

epcoa · 3y ago

The redirect always works though - that is not a program argument, that is handled by the shell. Apparently not everyone is aware of that.
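
To make that concrete, here is a small sketch with a throwaway file. `cut` accepts a file argument, while `tr` only ever reads stdin, yet the redirection works the same for both and can sit before or after the command. The variable names are illustrative.

```shell
tmp=$(mktemp)
printf 'a,b,c\n' > "$tmp"

as_argument=$(cut -d, -f 2 "$tmp")        # file passed as a program argument
redirect_after=$(cut -d, -f 2 < "$tmp")   # shell opens the file as stdin
redirect_before=$(< "$tmp" cut -d, -f 2)  # same thing, redirection written first

# tr takes no file operand at all, so redirection (or a pipe) is the only way in.
upper=$(tr '[:lower:]' '[:upper:]' < "$tmp")

rm -f "$tmp"
echo "$as_argument $redirect_after $redirect_before $upper"
```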

burnished · 3y ago

Everyone is not aware, new people are joining all the time.

hdb2 · 3y ago

granted, it is a little snarky and maybe the snark isn't appropriate in today's tech environment. but no, things like "useless use of cat" do not need to go away, because they make me better at what I do in little ways. those little ways add up over time.

> but looking up the arguments and their ordering is annoying

you seem to be arguing for complacency. taking your idea to an extreme, why learn to do _anything_ well?

omginternets · 3y ago

This. "Useless cat" is more useful than "useless file-arg".

Someone · 3y ago · 2 in thread

Is there any shell that has cat as a built-in?

Such a shell could remove some of the more common cases.

hnlmorg · 3y ago

All of them do. Including bash. It’s just not the same syntax (ie ‘< filename’).

But I honestly think people who try to optimise away ‘cat’ are optimising the wrong thing. If one extra fork() is that detrimental then don’t use a shell scripting language.

For a lot of people, “useless” ‘cat’ enables them to write a pipeline in the order that their brain farts out the requirements for the pipeline. So they’ve optimised for human productivity. And given the human brain is slower than a few extra fork()s, I think optimising for one’s brain makes more sense here.
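
For reference, the closest thing bash itself offers is the `$(< file)` form of command substitution, which the bash manual describes as an equivalent but faster replacement for `$(cat file)`. A minimal sketch, assuming bash is installed (other shells may not support this form, so it is run through an explicit `bash -c`):

```shell
tmp=$(mktemp)
printf 'hello\nworld\n' > "$tmp"

# External cat: the shell starts a separate cat process to read the file.
via_cat=$(cat "$tmp")

# bash's $(< file): the shell reads the file itself, no cat process involved.
via_redirect=$(bash -c 'printf "%s" "$(< "$1")"' _ "$tmp")

rm -f "$tmp"
[ "$via_cat" = "$via_redirect" ] && echo "same contents"
```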

Someone · 3y ago

> All of them do. Including bash.

Are you sure? https://unix.stackexchange.com/questions/208615/is-cat-a-she... disagrees and neither https://manpages.ubuntu.com/manpages/jammy/man7/bash-builtin... nor https://zsh.sourceforge.io/Doc/Release/Shell-Builtin-Command... mention it

cratermoon · 3y ago

https://porkmail.org/era/unix/award
