I've tried to parallelize a script I'm using, but so far GNU Parallel has been very challenging.
I've got two files: one containing the hosts on which to run the command, and a second with the parameters for the command. Here is sample data:
    $ cat workers.host
    [email protected]
    [email protected]
    [email protected]
    [email protected]

    $ cat paths
    /usr/local/jar/x/y/ jarxy
    /usr/local/jar/z/y/ jarzy
    /usr/local/jar/y/y/ jaryy
    /usr/local/far/x/y/ farxy
    /usr/local/jaz/z/z/ jazzz
    /usr/local/mu/txt/ana/ acc01
    /usr/local/jbr/x/y/ accxy

And to process that, I use the following script:
    #!/bin/bash
    echo "Run this on 192.168.130.10";
    DATA=`date +%F`
    DDAY=`date +%u`
    DOMBAC='nice tar cpzf'
    readarray -t hosts < workers.host
    len=${#hosts[@]};
    processed=0;
    while read -r -a line; do
        # pick the next host round-robin
        let hostnum=processed%len;
        ssh ${hosts[$hostnum]} -i /root/.ssh/id_rsa "$DOMBAC - ${line[0]}" > "/data/backup/$DDAY/${line[1]}_${DATA}_FULL.tgz"
        let processed+=1;
    done < paths

This works well, but it processes the paths one at a time, stepping from host to host sequentially. The hosts are quite overpowered and the network isn't a bottleneck here, so I'd like to parallelize this as much as possible: for example, run 4 instances of tar on each host and pipe each one's output through ssh into a properly named local file. I am completely lost with parallel --results --sshloginfile... What I'm ultimately trying to accomplish is to have 4 jobs running on each host, each with different parameters (so that, for example, host 2 doesn't overwrite what host 1 already did). Can this be done with GNU Parallel?
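For reference, this is the closest I've gotten. If I read the man page correctly it does fan out 4 jobs per host, but --results stores each job's stdout in GNU Parallel's own directory layout rather than under the ${line[1]}_${DATA}_FULL.tgz names my script produces (the /data/backup/results directory below is just a placeholder I made up):

    # 4 simultaneous jobs per remote host, "path name" pairs read from the paths file;
    # --colsep splits each line so {1} is the path and {2} is the archive name,
    # and --results writes each job's stdout (the tarball) locally, but into
    # parallel's own per-argument directory tree instead of {2}_${DATA}_FULL.tgz
    parallel --jobs 4 \
             --sshloginfile workers.host \
             --colsep ' ' \
             --results /data/backup/results \
             nice tar cpzf - {1} \
             :::: paths

I've also seen that a line in the sshloginfile can be prefixed with a job-slot count (e.g. 4/[email protected]) instead of using --jobs, but either way I don't see how to control the local output filenames.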