I've tried a couple of times to get into awk, but still find the syntax arcane.
> an implicit loop
As an occasional awk user, I'd love it if you could expand on this. Maybe it will help clear things up for me. You're not referring to the fact that awk operates on every line independently, are you?
    for line in readfile()
        for block in script
            if block.match(line)
                run_block(block)
            end
        endfor
    endfor
Here, "for line in readfile()" is the "implicit loop", and the blocks are the "condition { .. }" blocks. The actual flow is a little more complex and has some exceptions (e.g. BEGIN/END), but this is the gist of it.
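For a concrete version of that loop, here is a minimal sketch in real awk, run from the shell. The input lines and the two conditions are made up for illustration; the point is only that every block is tried against every line, with BEGIN and END running outside the per-line loop.

```shell
# Hypothetical sample input: a name and a score on each line.
input='alice 42
bob 7
carol 19'

out=$(printf '%s\n' "$input" | awk '
BEGIN   { print "start" }        # runs once, before any input is read
$2 > 10 { print "big:", $1 }     # runs for each line whose second field exceeds 10
/o/     { print "has o:", $1 }   # runs for each line that contains the letter o
END     { print "lines:", NR }   # runs once, after the last line
')
printf '%s\n' "$out"
```

A line can trigger more than one block (carol matches both conditions) or none of the per-line blocks, which is exactly the inner "for block in script" loop from the pseudocode.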
I thought there was all this confusing syntax, but something like
    ... | awk '{print $2}'
or
    awk '/pattern/ {print}'
was clearer to me. In the first case, the empty pattern matches every line of the input, and the action is simply to print the second field of each line. Patterns can vary in complexity from the empty pattern to long chains of logical operators and regular expressions, such as /pattern/ in the second example. The outer quotes are just there to prevent the shell from eating your dollar signs or other special characters. In a standalone AWK script you can write it like
    /pattern/ {
        print
    }
which also makes it look more like other languages.
If you can get your hands on a copy of The AWK Programming Language, it's a pretty quick and pleasant read that helped everything make more sense to me. I do most of the data analysis for my research in AWK and really enjoy working with it.
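To illustrate the "chains of logical operators and regular expressions" end of the pattern range, here is a small sketch. The log-style input is invented for the example, and the two patterns are just one plausible combination, not anything special.

```shell
# Hypothetical log lines: level, status code, message.
input='INFO 200 ok
ERROR 500 disk full
WARN 404 not found
ERROR 503 timeout'

out=$(printf '%s\n' "$input" | awk '
# a regex on the whole line combined with a numeric test on field 2
/^ERROR/ && $2 >= 503 { print $1, $2 }
# a negated regex match against a single field
$1 !~ /ERROR|INFO/    { print "other:", $0 }
')
printf '%s\n' "$out"
```

Each pattern is still just the condition in front of a block; only the expression gets longer.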