GNU Parallel Citation Notice

(git.savannah.gnu.org)

8 points · alasarmas · 1y ago · 6 comments


ggm · 1y ago

Possibly the first nagware I ever used. Some variants of xargs can do some of what parallel does, if the nag annoys you. I am unsure how smooth the parallelism is in that model, and they differ on argument-quoting norms, so it's not necessarily a simple drop-in replacement.
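
For anyone curious how the xargs model behaves, here is a minimal sketch, assuming an xargs that supports -P (GNU and BSD builds do; it is the short form of the --max-procs flag used later in the thread). The sleep durations are made-up input chosen only so that the jobs finish at different times. Each child writes straight to stdout, so lines should usually appear in the order the jobs finish rather than the order of the input.

```shell
# Each input line is a sleep duration in seconds (made-up sample data).
# -P 3 runs up to three jobs at once; -I{} hands one input line to each job.
# The value is passed as a positional argument instead of being pasted
# into the sh -c string, which sidesteps most quoting surprises.
out=$(printf '0.3\n0.1\n0.2\n' | xargs -P 3 -I{} sh -c 'sleep "$1"; echo "$1"' _ {})

# Show the lines in the order they arrived.
printf '%s\n' "$out"
```

Whether output lines from concurrent jobs can interleave mid-line depends on how much each job writes, so treat this as a toy for short single-line results only.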

Joel_Mckay · 1y ago

There are some subtle differences, and on rare occasions xargs won't do, namely when:

1. the order of single-line output results does not need to be preserved

2. long-running parallel tasks must be non-blocking for efficiency reasons

3. you optionally need to include remote computers in a cluster

Toy example IP blacklist preparation:

cat ./banlist_ipv4.raw | parallel --ungroup --eta --jobs 24 "ipcalc {} | sed '2!d' " | grep -Ev '^(0\.|255\.|127\.)' > ./banlist_ipv4.formatted

In this toy case, the child processes may be launched hundreds of thousands of times. Because the parallel child processes exit at random times without blocking or waiting on one another, the runtime cost goes down.

It's FOSS; people shouldn't feel entitled to complain about how authors share their works. =3
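
The last stage of the toy pipeline drops results that start with 0., 255., or 127. A small sketch of just that filter, run on made-up sample lines rather than real ipcalc output, looks like this. The dots are written escaped here so that they match a literal "." (an unescaped dot in a regular expression matches any character).

```shell
# Sample CIDR results, invented for illustration only.
results='0.0.0.0/8
3.0.0.0/9
127.0.0.0/8
255.255.255.255/32
10.0.0.0/8'

# -E enables extended regular expressions, -v inverts the match, so this
# keeps only lines that do NOT start with "0.", "255.", or "127.".
kept=$(printf '%s\n' "$results" | grep -Ev '^(0\.|255\.|127\.)')

printf '%s\n' "$kept"
```

With this sample, only the 3.0.0.0/9 and 10.0.0.0/8 lines survive the filter.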

upon_drumhead · 1y ago

I've always struggled to get parallel working, so I played around with your example. I generated the raw list via:

for i in $(seq 10000); do echo "$((RANDOM % 255)).$((RANDOM % 255)).$((RANDOM % 255)).$((RANDOM % 255))" >> banlist_ipv4.raw; done

Running the example, the sed '2!d' is really not working for me. It keeps throwing "bash: !d': event not found" and I can't seem to find the right escapes for it to work.

I modified it to this

time $(parallel --ungroup --eta --jobs 24 ipcalc --nobinary --nocolor {} :::: banlist_ipv4.raw | awk '/Network/{ print $2 }' > banlist_ipv4.formatted)

Which ran in 1m3.876s

I then wrote it for xargs

time $(xargs --arg-file=banlist_ipv4.raw --max-procs=24 -I{} ipcalc --nobinary --nocolor {} | awk '/Network/{ print $2 }' > banlist_ipv4.formatted)

which runs in 0m42.346s

I'd love to understand a little better what you were trying to show in your example, and why parallel in my example takes about 50% longer (63.9 s versus 42.3 s) for the same input and task.

Removing the eta calculations from the parallel example doesn't change the runtime, nor does adding pv to get progress with the xargs example. Neither adds a meaningful amount of cost.
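
On the "event not found" error: in an interactive bash session, history expansion still applies to a "!" inside double quotes, and in the original command the single quotes around 2!d sit inside a double-quoted string, so bash tries to expand !d'. One way around it, sketched below, is a sed expression with no "!" at all: sed -n '2p' prints only the second line, which is what sed '2!d' does. The sample input is made up, and the parallel line in the trailing comment is an untested adaptation of the commands in this thread, not something from the original posts.

```shell
# Both forms print only the second line of their input.
with_delete=$(printf 'first\nsecond\nthird\n' | sed '2!d')
with_print=$(printf 'first\nsecond\nthird\n' | sed -n '2p')

echo "$with_delete"
echo "$with_print"

# Untested adaptation (assumes parallel and ipcalc are installed):
# single-quote the whole command and avoid "!" entirely.
#   parallel --ungroup --eta --jobs 24 'ipcalc {} | sed -n 2p' :::: banlist_ipv4.raw
```

Running the command from a script file, or turning history expansion off with set +H in the interactive shell, should also avoid the error.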

Joel_Mckay · 1y ago

One may need to run it in bash, and agree to the nag screen first:

~$ parallel --citation

~$ ./test.sh

#!/bin/bash

echo "note: bench-marking a long running task"

echo "BASH_VERSION=${BASH_VERSION}"

date

# --jobs 0 runs as many jobs in parallel as possible (the default is one job per CPU core)

cat "./blacklistp_p2p" | parallel --ungroup --eta --jobs 0 "ipcalc {} | sed '2!d' " | grep -Ev '^(0\.|255\.|127\.)' >> ./blacklist_p2p_converted

date

exit 0

#note the cluster network version after test run

For your example, the only difference I saw was the ">>", which shouldn't prevent the example from running.

In practice, we saw around a 30% reduction in completion time for this task, which we attributed to the random queue-blocking time xargs generates to preserve ordered output.

Thus, unless one pins all CPU cores with hundreds of processes for several minutes, the overall completion time may differ from what one expects.

Note that performance likely also depends on the Linux kernel's in-RAM page cache for your filesystem and on the variability of child-process execution time; i.e., if you are running in an emulated, VM, or WSL environment, the batching may behave differently.

On average, this toy checks and converts several hundred thousand IP ranges such as "3.0.0.0 - 3.127.255.255" into CIDR subnet notation such as "3.0.0.0/9". My example may now contain typos, as it is quite dated.

YMMV, and I hope you are able to replicate the fun =)
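
To make the range-to-CIDR step concrete, here is a rough sketch of the arithmetic for the "3.0.0.0 - 3.127.255.255" example: the range holds 2^23 addresses, so the prefix length is 32 - 23 = 9. The helper names ip_to_int and range_to_cidr are hypothetical, and the sketch only handles a range that is exactly one aligned power-of-two block; it is not how ipcalc is implemented, and ipcalc can handle more general ranges.

```shell
# Hypothetical helper: turn a dotted quad into a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split "a.b.c.d" into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: print START/PREFIX for a range that is a single
# aligned power-of-two block, or return 1 for anything else.
range_to_cidr() {
  start=$(ip_to_int "$1")
  end=$(ip_to_int "$2")
  size=$(( end - start + 1 ))
  if [ "$size" -le 0 ] || [ $(( size & (size - 1) )) -ne 0 ] || [ $(( start % size )) -ne 0 ]; then
    return 1
  fi
  # Count how many host bits are needed to cover the block.
  bits=0
  while [ $(( 1 << bits )) -lt "$size" ]; do
    bits=$(( bits + 1 ))
  done
  echo "$1/$(( 32 - bits ))"
}

range_to_cidr 3.0.0.0 3.127.255.255
```

For the range from the comment above, this prints 3.0.0.0/9.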

