Another sometimes-important difference is that if there are multiple input files, `somecommand file1 file2 file3` can tell what data is coming from which file; with `cat file1 file2 file3 | somecommand` they're all mashed together, and the program has no idea what's coming from where.
In general, though, I think it's mostly a matter of people's expertise level in using the shell. If you're a beginner, it makes sense to learn one very general way to do things (`cat |`), and use it everywhere. But as you gain expertise, you learn other ways of doing it, and will choose the best method for each specific situation. While `cat |` is usually an ok method to read from a file, it's almost never the best method, so expert shell users will almost never use it.
[1] http://redsymbol.net/articles/unofficial-bash-strict-mode/
Maybe use dd with one of its blocksize options, then?
Not at a terminal, can't check.
Instead, they open a file descriptor and pass that.
Tiny difference but there you go.
In short, I straight up don't care.
When you run "cmd < file", the command reads from stdin, which pulls directly from the file. When you do "cat file | cmd", "cat" opens the file, reads from there, and writes to a pipe. Then "cmd" reads from its stdin, which is a pipe.
And ... ?
It’s so ingrained, I’m more likely than not to just write it out that way even when I know exactly what I’m doing from the onset.
e.g.
I need to grab some info from textfile.txt to use as arguments to a function.
cat textfile.txt
looks like its comma delimited.
cat textfile.txt | cut -d, -f 2-5
ah, its the third and fourth column i need
cat textfile.txt | cut -d, -f 3-4 | grep '123456'
perfect
cat textfile.txt | cut -d, -f 3-4 | grep 123456 | tr , ' '
myfunc $(cat textfile.txt | cut -d, -f 3-4 | grep 123456 | tr , ' ')
> looks like its comma delimited.
Interesting; why wouldn't you use `head`? Who knows how big textfile.txt is?
Many of my programs and scripts start output with the line: # cmd arg1 arg2 arg3 ...
and simply echo back lines that start with '#'. That way, I have an internal record of the program that was run and the data file that was read (as well as previous parts of the analysis chain).
And, 'R' ignores lines starting with '#', so the record is there, but does not affect later analyses.
< foo.json jq | pbcopy> but looking up the arguments and their ordering is annoying
you seem to be arguing for complacency. taking your idea to an extreme, why learn to do _anything_ well?
Such a shell could remove some of the more common cases.
But I honestly think people who try to optimise away ‘cat’ are optimising the wrong thing. If one extra fork() is that detrimental then don’t use a shell scripting language.
For a lot of people, “useless” ‘cat’ enables them to write a pipeline in the order that their brain farts out the requirements for the pipeline. So they’ve optimised for human productivity. And given the human brain is slower than a few extra fork()s, I think optimising for one’s brain makes more sense here.
Are you sure? https://unix.stackexchange.com/questions/208615/is-cat-a-she... disagrees and neither https://manpages.ubuntu.com/manpages/jammy/man7/bash-builtin... nor https://zsh.sourceforge.io/Doc/Release/Shell-Builtin-Command... mention it