cat file | grep 'expression'
is the shell analogue of the SQL query select * from (select * from table) as q where column like '%expression%'
...except that a decent query optimizer will collapse the extraneous inner query. Bash doesn't know how to do that.
It's also more expensive:
# wc -l /var/log/secure
97845 /var/log/secure
# time cat /var/log/secure | grep root > /dev/null
real 0m1.600s
user 0m1.517s
sys 0m0.294s
# time grep root /var/log/secure > /dev/null
real 0m1.275s
user 0m1.237s
sys 0m0.036s
That's an extra .3-and-change seconds, or over 25% longer, on a 98k-line file. Scale that up to a multi-million-line nginx log file, and I'd actually say it's a worse-than-useless use of cat.
You could condense the first three stages: cat /var/log/nginx-access.log | grep "GET" | awk -F'"' '{print $6}'
down into: awk -F'"' '/GET/{print $6}' /var/log/nginx-access.log as well.
Or with the fourth stage (cut -d" " -f1): awk -F'"' '/GET/{split($6,a," ");print a[1]}' /var/log/nginx-access.log
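A note on split(): awk fills the array starting at index 1, not 0, so the first word lands in a[1]. A quick sanity check, with a made-up sample string standing in for field $6:

```shell
# split() returns the number of pieces and stores them in a[1]..a[n].
# "(0 in a)" tests for index 0 without creating it; it prints 0 when absent.
out=$(echo 'Mozilla/5.0 (X11)' | awk '{n = split($0, a, " "); print n, a[1], (0 in a)}')
echo "$out"
```

With " " as the separator, split() uses awk's default whitespace splitting, which is what makes it a stand-in for cut -d" " -f1 here.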
Add the fifth stage (grep -E "^[[:alnum:]]") by doing:
awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){print a[1]}}' /var/log/nginx-access.log
And the 6th and 7th (sort | uniq -c):
awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){b[a[1]]++}}END{for(i in b){print b[i] " " i}}' /var/log/nginx-access.log | sort -rn
Which is still just about a one-liner and a lot faster than the original. Actually you could make it shorter, even faster, and more awky by tinkering with the field separator to get rid of the split() and the if:
awk -F'[[:space:]]|"' '/GET/ && $17~/^[[:alnum:]]/{a[$17]++}END{for(i in a){print a[i] " " i}}' /var/log/nginx-access.log | sort -rn
And then replace the last sort with gawk's builtin asort() function (it's a GNU extension, not plain awk), but that's left as an exercise for the student ;)
But why learn the basics when you can do Big Data and be buzzword-compliant instead?
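For anyone who wants to check that the condensed awk versions really agree with the staged pipeline, here is a rough sketch on a throwaway file. The five log lines are invented and assume nginx's "combined" log format (which is what makes $6 and $17 the user agent); the staged pipeline is pieced together from the stages quoted above, and its trailing sort -rn is an assumption.

```shell
#!/bin/sh
# Hypothetical sample data in "combined" log format; not from a real server.
export LC_ALL=C
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "GET /a HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Windows NT 10.0)"
10.0.0.3 - - [01/Jan/2024:00:00:03 +0000] "GET /b HTTP/1.1" 200 612 "-" "curl/8.4.0"
10.0.0.4 - - [01/Jan/2024:00:00:04 +0000] "POST /c HTTP/1.1" 200 612 "-" "curl/8.4.0"
10.0.0.5 - - [01/Jan/2024:00:00:05 +0000] "GET /d HTTP/1.1" 200 612 "-" "-"
EOF

# Staged pipeline. The last awk only strips the padding uniq -c adds,
# so the output can be compared with the awk-only versions.
pipeline_out=$(cat "$log" | grep "GET" | awk -F'"' '{print $6}' | cut -d" " -f1 \
  | grep -E "^[[:alnum:]]" | sort | uniq -c | sort -rn | awk '{print $1 " " $2}')

# Single awk using split() on the sixth quote-delimited field.
awk_out=$(awk -F'"' '/GET/{split($6,a," ");if(a[1]~/^[[:alnum:]]/){b[a[1]]++}}END{for(i in b){print b[i] " " i}}' "$log" | sort -rn)

# Shorter variant: split on every whitespace character or quote, use field 17.
short_out=$(awk -F'[[:space:]]|"' '/GET/ && $17~/^[[:alnum:]]/{a[$17]++}END{for(i in a){print a[i] " " i}}' "$log" | sort -rn)

rm -f "$log"

echo "$pipeline_out"
[ "$pipeline_out" = "$awk_out" ] && [ "$awk_out" = "$short_out" ] && echo "all three agree"
```

Note that $17 only lines up when the referer contains no spaces or quotes, so the short variant is more fragile than the split() one on messy logs. Some older awk builds also lack POSIX classes like [[:alnum:]], in which case [A-Za-z0-9] would be the fallback.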
If only I didn't have exams! (Been working on it 50% of the time though.)
Please, people, if you're using Blogspot, stop. Or at the very least, avoid the craptacular "dynamic" themes like the pox they are.
Like when people shed a "new" light on seemingly simple tools.