Note that people who still do use complex solutions built from cat, head, cut, etc, and who know what they're doing, will typically either write a shell script (which won't be structured particularly differently from the equivalent Python or whatever) or will rely heavily on awk (itself a full-featured programming language, no easier to learn than any other scripting language), or both.
One-liners which pipe text between four or five different commands are the equivalent of hand-soldered boards or bitwise arithmetic. Interesting to learn about for historical reasons but of no practical utility.
The use of things like xargs and jq in this solution, difficult to invoke Unix utilities for doing things that are trivial in any reasonable language, makes this even more clear.
That's just, like, your opinion man...
> people who still do use complex solutions built from cat, head, cut, etc, and who know what they're doing, will typically either write a shell script or use awk
No, I still use cat, head, cut etc., because it's easier to see at each step what is happening and incrementally add to that, because they're literally everywhere, because it's quick, and because I like it. Not for major projects, granted, but why would I need to write a python file for something that takes a single line of piped commands?
Why is it not valid today?
> One-liners which pipe text between four or five different commands are the equivalent of hand-soldered boards or bitwise arithmetic.
Why is that bad? Don't deploy a supercomputer to do what a hand soldered board can. Keep it simple.
I don't see how you equate piping commands to bitwise arithmetic, but bitwise arithmetic is easy anyway.
> but of no practical utility
Says you. Just because you don't find something useful doesn't mean nobody else finds it useful.
Turning a one-liner of piped commands into a program in what you might consider a "proper programming language" usually ends up turning a declarative program into something procedural. Not that that's necessarily a bad thing. Just saying.
Use the right tool for the job. In some cases (not all) the shell is indeed the right tool.
I get a feeling you don't understand the Unix philosophy. Read The Art of Unix Programming by Eric S. Raymond. Go learn bitwise arithmetic.
The uneducated play with pictures. Educated people read and write :)
Have a great day (or night, depending on your timezone — night here).
These kinds of comments aren't useful, and come off as smug to me.
Why doesn't he understand it, and what is the point of reading/learning the things you think are relevant? Why would they invest in your suggestions if you don't give any reasons or say what is missing?
Also:
> The uneducated play with pictures. Educated people read and write :)
This sounds insulting to me (name-calling), and passive-aggressively so when combined with "Have a great day".
I don't understand - isn't bash shell procedural?
I think you actually confirmed my point when you said 'no sysadmin has time or interest'. I didn't say that no one uses shell anymore. I said that there was no practical justification for doing so. Lots of people still use it and don't have any interest in finding a better solution. These people are choosing, for their own reasons, to double down on an obsolete skillset. Good for them, I suppose, but I don't think that demand for their niche is going to be around for very long.
I also pointed out that sysadmins who know what they're doing use awk. Based on your mentioning awk it seems like you agree with this.
This is false. Every line they write has potential security implications. If any pattern can become a string injection vulnerability, it will. Even most real programmers do not understand shell scripting.
This discussion is all moot because UN*X is an obsolete misconception made by programmers who are too dumb to understand the difference between and significance of AST manipulation and string concatenation. It's not hard to understand what I'm talking about. Look how much ad-hoc, non-reusable trivia you have to learn to pass stuff around between find, ls, and xargs. Whenever you would do `x = f(); g(x)` in Python, you will spend 10 minutes figuring out if some given shell script is doing the equivalent the correct, safe, secure way.
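A minimal sketch of the `x = f(); g(x)` comparison, assuming a POSIX sh or bash (zsh does not word-split unquoted variables by default). The functions f and g are hypothetical, made up for illustration; the point is only that the unquoted form silently changes what g receives.

```sh
#!/bin/sh
# f is a hypothetical producer whose output contains two spaces.
f() { printf '%s' 'abc  def'; }
# g is a hypothetical consumer that reports how many arguments it received.
g() { echo "$#"; }

x=$(f)
unquoted=$(g $x)    # word splitting: g receives "abc" and "def" separately
quoted=$(g "$x")    # g receives the single string "abc  def"
echo "unquoted: $unquoted, quoted: $quoted"
```

The quoted form is the closer equivalent of passing a value to a function in Python.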
I just realized today that every time I see some crap like x,abc%%20%%20def,y x,abc\x20\x20def,y I am actually depressed because I already know all the played out meta of such systems - it's a bunch of half working junk full of pointless vulns that only exist because between 0 and 10 of the said "experts at their craft" in the world actually bother to program this crap correctly. And I literally have to squint to figure out where the bug is (there's always the bug in such code).
If the UN*X shell was replaced by Python, sysadmins would have no trouble adapting. Python is a terrible language, (plus this discussion is convoluted by the fact that Python's gimped from bending over to work well with UN*X) but still better than UN*X shell in every way.
Why use many word when few word do trick?
Shell scripts are very compact for what they do, and generally as clear as a similarly quick-and-dirty solution in another language. Ease of learning and use comes from having used the programs you are running in your script. E. g. my small script piping stuff from nmcli to fzf so I can choose a wifi network would be much more difficult to write in Python: I’d need to find a Python library for interacting with NetworkManager, and a library for interactively fuzzy searching in a list, read the docs, spend a while setting up a venv to run it, … I don’t have time for any of that.
xargs, in particular, is not difficult to invoke once you’ve done it once or twice, and does a lot. Apart from just being a loop, it also parallelizes execution, and can ask for confirmation from the user for each invocation. Implementing either of these features yourself will take you more than the 2-4 chars you need to enable it with xargs.
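A rough sketch of the two xargs features mentioned above. It assumes an xargs that supports `-P` (GNU and BSD versions do; it is not in older POSIX specifications). The input values and the "item" label are made up for illustration.

```sh
#!/bin/sh
# Run up to 4 invocations at a time, one argument per invocation.
# Output order is not guaranteed when running in parallel, so sort it.
result=$(printf '%s\n' 1 2 3 4 | xargs -n 1 -P 4 sh -c 'echo "item $1"' _ | sort)
echo "$result"

# -p asks for confirmation before each invocation. It is interactive,
# so it is left commented out here:
# printf '%s\n' a.tmp b.tmp | xargs -n 1 -p rm
```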
I think the perfect very high level language is closer to shell than to python for example. The power of Tcl/Tk (still it has some big weaknesses) or Rebol/Red is something that I admire.
The following statement is probably controversial: shell is more akin to Lisp for human beings. I dabbled at Scheme, but it is harder for me to grasp than Shell, but in the end they are more similar than not.
I hold a candle for Oil shell for example.
The only downside of shell scripting is that it isn't trivially portable to Windows (or even macOS because of the differences between GNU and BSD tools), so it often makes sense to create big Python scripts that do more than "one thing right". If the whole world would run on UNIX, shell scripting would make much more sense.
The only downside? Let's add that no major *NIX shell that I'm aware of has any good way to modularize code while enforcing encapsulation of state.
At my previous job, we had a rule that any shell script longer than about a page of code had to be replaced with a Python script ASAP. That was a good rule, IMO, because once you've exceeded a certain size, a shell script starts getting brittle and hard to work with. I don't know if 1 page of code is the threshold size or not, but it seems like as good a cutoff as any.
Shell is portable in a way nothing else is, same reason people use Excel instead of code or PHP instead of literally anything.
Secondly the presence of all these utilities on a machine is far from guaranteed - expecting Python to be present is no more or less likely.
The practical utility has just been demonstrated in this particular article? Historical? I think that Unix shells are like crocodiles - outliving the dinosaurs and lurking in the murky water, waiting for the unsuspecting sysadmin to come close enough to fix the script that ain't broken.
How? The author is doing something which could be done much more easily and elegantly in a programming language.
Strongly disagree. The understanding of the OS, the data, and how to checkmate the problem with minimal effort is timeless.
Programming languages are relatively ephemeral compared to POSIX utilities.
Invest in knowledge of the enduring.
Some questions:
1) Where do you draw the line and move away from shell to a "real" language? If I just want to view a directory, surely I use ls, right? What about if I want to remove all leading whitespace from a text file? This can be done with a short but fairly opaque awk one-liner. It probably takes way more time to write and run the Python equivalent, but I don't know Python so well, so maybe it's also a one-liner.
2) What's the optimal "real language" for replacing shell? Python? Perl? Raku?
2a) xargs automatically parallelizes programs. How can this be done efficiently (meaning I don't have to write much additional code) in your proposed "real language"?
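For reference, the leading-whitespace case from question 1 might look like the sketch below. The function name strip_leading_ws is hypothetical, and "whitespace" is assumed here to mean spaces and tabs only.

```sh
#!/bin/sh
strip_leading_ws() {
  # Delete spaces and tabs at the start of each line, then print the line.
  # With no file arguments, awk reads standard input.
  awk '{ sub(/^[ \t]+/, ""); print }' "$@"
}

cleaned=$(printf '  two spaces\n\ttab\nnone\n' | strip_leading_ws)
echo "$cleaned"
```

Whether that counts as opaque is a matter of taste; a sed substitution such as `sed 's/^[[:space:]]*//'` is a common alternative.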
By using xargs :)
Or more likely gnu parallel. But seriously, in so many cases using GNU parallel to parallelize a process is the quickest and easiest way to approach the problem, and I use it all the time. If I need to process 10k+ images in a folder, rather than try to parallelize the process in my python/C script I'll write the fastest possible single threaded script that takes its input args as command line arguments and then uses gnu parallel to distribute the workload. The added advantage is that I can distribute this work on a cluster of machines with only a few changes to GNU Parallel's command line arguments.
Shell is indeed very old and it's time for a replacement, but it's not there yet.
Oilshell might get there eventually or at least spark interest in this area.
Solved. https://github.com/ngs-lang/ngs
I'm the author. Frustrated with exactly this situation I created Next Generation Shell. It's a "proper programming language" on one hand but domain-specific for "DevOps"y scripting on another. So sane syntax, data structures, error handling, multiple dispatch on one hand but also syntax for running external programs, pipes and redirects.
You are welcome!
You cherry-picked the one thing that shell script is somewhat better at than other languages (not even really needed for this task). Meanwhile the article uses shell for both making HTTP requests and mangling JSON data, both of which are easy in all modern languages, and extremely painful in shell.
If the article was about using ls to just list files, and I had said "actually you should use Python's os.listdir() and filter the results by whatever" you would be right.
For most simple problems it's correct to use a simple tool. For the overwhelming majority of complex problems you should use a well-understood, well-designed, common general-purpose tool.
If this isn't the case, stay away.
ls | grep '.csv$' | xargs cat | grep 'cake' | cut -d, -f2,3 > cakes.csv
That's quite a few antipatterns in one go. Unless you have a bajillion files the `xargs` is unnecessary, the `cat` and `ls` are unnecessary (and `ls` in shell scripts is a whole class of antipatterns by itself). You might want to use something like this instead:
grep cake *.csv | cut -d, -f2,3 > cakes.csv
awk 'BEGIN {FS=","} /cake/ {print $2, $3}' *.csv > cakes.csv
Unsurprisingly I disagree with the post's description of awk being an "advanced command".
I guess there's no categorization of "advanced" vs "beginner" that will satisfy every audience but I consider awk an advanced tool. About 20 years ago, I wrote some AWK tips and cheatsheet back on USENET and today I would have to refer to that post to write basic awk commands.
The thing about awk is that it's a compact programming language with variables and conditionals and that's a step-change in complexity for many users.
I think it can be pretty advanced, for me awk is one of those tools where I still feel like I need to write a paragraph of comments to explain 1 line of code.
For example: https://github.com/nickjj/invoice/blob/75660dce5a29ceb4e47a6...
Keep in mind I don't really "know" awk. I cobbled that together from a few examples. It will convert times formatted like "2h 30m", "150m" or "2:30" into 2.50. There's a bunch of examples in the test file.
NOTE: I wrote that script 2.5 years ago and I know there are questionable patterns in other areas of the script that aren't highlighted, like using a bunch of separate echo calls instead of a heredoc.
Shell scripting is really fun and efficient. I use it all the time for a variety of things.
awk -F, '/cake/ {print $2, $3}' *.csv > cakes.csv
does the same thing: -F sets the field separator.
I still remember when I started working as a sysadmin at 19, the greybeard UNIX guy taught me how to vim, and he told me awk was as important as knowing vim, pointing to some huge AWK manual he had on the shelf, one of those with the animal on the cover.
This was 15 years ago, I know vim, but awk still eludes me.
But awk is usually not a good choice for processing CSV files.
Your program will fail if any of the fields include embedded commas (which is often the case).
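A small illustration of the embedded-comma problem, using a made-up CSV record. The quoted second field contains a comma, and splitting on `,` cuts it in half.

```sh
#!/bin/sh
# A made-up record whose second field is quoted and contains a comma.
line='1,"cake, chocolate",4.50'

# Naive comma splitting treats the comma inside the quotes as a separator.
naive=$(printf '%s\n' "$line" | awk -F, '{ print $2 }')
echo "$naive"    # prints "cake (with the leading quote), not the full field
```

A CSV-aware parser is needed to get the whole quoted field back.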
$ ls | grep '.csv$' | xargs cat | grep 'cake' | cut -d, -f2,3
cake
cake
cake
$ grep cake *.csv | cut -d, -f2,3
bar.csv:cake
foo.csv:cake
quux.csv:cake
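The difference between the two outputs comes from grep prefixing each match with the file name when it is given more than one file. The sketch below reproduces that with made-up files in a temporary directory, and shows the `-h` option (supported by GNU and BSD grep) suppressing the prefix.

```sh
#!/bin/sh
dir=$(mktemp -d)
printf 'x,cake,1\n' > "$dir/foo.csv"
printf 'y,cake,2\n' > "$dir/bar.csv"

# With several files, each match is prefixed with "filename:".
with_names=$(cd "$dir" && grep cake *.csv)
# -h drops the prefix, matching what "cat | grep" would produce.
without_names=$(cd "$dir" && grep -h cake *.csv)

echo "$with_names"
echo "$without_names"
rm -rf "$dir"
```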
`grep -h` will do the trick though
This is a tool used to accomplish a thing, and Unix tools can be used to accomplish things in many different ways. This is like complaining about, I don't know, using a metric-labeled screwdriver on an imperial-measured screw that still gets the job done exactly as needed. Cut it out.
This is super ignorant. You risk stripping the screw and getting yourself into a frustrating screw extraction job.
Just because you lived, doesn't mean it was safe.
Anti-patterns are bad because they usually mean that a sample command might work in this case, but not in other environments or other use cases. Someone who is seeing these commands for the first time has no idea about that. And this post is meant for beginners.
It is not a matter of taste and it is not a matter of metric versus imperial screwdrivers. Someone will copy this code and it will end up being an attack vector where it will have consequences.
I imagine you're rolling your eyes and have flipped the bozo bit but please bear with me.
Think of the teachable moment this presents! The author of the original piece goes back and annotates their original answer along the lines of, "you might solve it this way but there are some gotchas with it - let me show you what could go wrong."
As an industry we absolutely need to circle back with improvements so that those who come after us can build on a more solid foundation.
Parsing CSV with simple text-oriented tools is as bad an idea as parsing HTML with regexps.
I understand that you don't want to be mean but this post is neither. It's like a bad gun safety video where the alleged instructor points a loaded gun at school children and then looks down the barrel while polishing the trigger...
The find would be something like
find . -not -path '*/\.*' -type f -depth 1
What advantages does that have over 'ls' for that case?
The gp was talking about issues with piping from "ls |" and your particular case of "find" being more convoluted than "ls" isn't comparing that.
Example of the topic that gp was warning about:
http://mywiki.wooledge.org/ParsingLs
https://unix.stackexchange.com/questions/128985/why-not-pars...
[Also fyi... you may have meant "-maxdepth" instead of "-depth" in your example.]
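A minimal sketch of the kind of problem those links describe, assuming a find that supports `-maxdepth` (GNU and BSD versions do). The file names are made up: one name contains a newline, so line-based parsing of `ls` output counts it twice, while a glob or find treats it as a single file.

```sh
#!/bin/sh
dir=$(mktemp -d)
# One regular file whose name contains a newline, plus one dotfile.
touch "$dir/$(printf 'a\nb').txt" "$dir/.hidden"

# Parsing ls output line by line sees two "files".
ls_count=$(ls "$dir" | wc -l | tr -d ' ')

# A glob skips dotfiles by default and keeps each name intact.
glob_count=0
for f in "$dir"/*; do
  [ -f "$f" ] && glob_count=$((glob_count + 1))
done

# find, limited to the top level, emits one marker per non-dot regular file.
find_count=$(find "$dir" -maxdepth 1 -type f ! -name '.*' -exec echo x \; | wc -l | tr -d ' ')

echo "ls: $ls_count, glob: $glob_count, find: $find_count"
rm -rf "$dir"
```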
echo ^.*
will show all non-dotfiles in the current directory.
rg 'MySearchString' **/*.js[x]#~*spec*
Voila. Note that this is for zsh, and you need to set the EXTENDED_GLOB option. But once you do you'll find yourself rarely needing to reach for `find`.
ls |less
HR needs to know this, but it shouldn't be available to random employees.
The first column consists of lines that are only in F1, the second column consists of lines that are only in F2, and the third column consists of lines that are common to both files.
The option -1 tells it to not print column 1, -2 tells it not to print column 2, and -3 does the same for column 3. These can be combined, so -12 would only print column 3 (the lines that are in both files) and -13 would only print column 2 (the lines that are in F2 but not F1).
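A short sketch of those column options, using two made-up files named F1 and F2 in a temporary directory. Note that comm expects both inputs to be sorted.

```sh
#!/bin/sh
dir=$(mktemp -d)
# comm requires sorted input; these lists are already in order.
printf 'apple\nbanana\ncherry\n' > "$dir/F1"
printf 'banana\ncherry\ndate\n' > "$dir/F2"

both=$(comm -12 "$dir/F1" "$dir/F2")      # lines in both files
only_f2=$(comm -13 "$dir/F1" "$dir/F2")   # lines only in F2
only_f1=$(comm -23 "$dir/F1" "$dir/F2")   # lines only in F1

echo "both: $both"
echo "only in F2: $only_f2"
echo "only in F1: $only_f1"
rm -rf "$dir"
```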
> 1 minute to do this
> 1 minute to do that
and 1 minute to introduce RCE vulns into company #589179283672's pipeline due to the "you don't understand the security implications of using fragile UN*X tools" problem which applies to anyone actually learning something from this article DAY OF THE SEAL SOON,
The article isn't describing such a scenario (load-bearing script).
Lectures are hosted on YouTube, they are extremely valuable and easy to follow and they give a pretty good insight on a lot of Unix topics.
Regex is one of those things I have to learn every single time I need to use it. I just can't seem to force myself to remember.
This fails for me since the jq output lines are surrounded by quotes. I had to remove the $. Did I do something different, or are we running different jq versions?