Originally published November 19, 2021 @ 8:20 pm

To make a long story short, I have a list of servers where I need to execute a command and get back the output. Using a for loop to run SSH with key authentication is the usual approach, except in this case accessing one server at a time was taking too long.

In a situation like this, I would usually opt for PDSH, but for other reasons I needed to use SSH and xargs with a list of target hosts – one per line. The basic syntax would look something like this:

cat host_list.txt | \
xargs -d $'\n' -P$(grep -c proc /proc/cpuinfo) -n1 -I{} \
/usr/bin/ssh -qtT -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i "${ssh_key}" ${ssh_user}@{} \
"sudo su - root -c 'wc -l /etc/{passwd,shadow}'" 2>/dev/null

So I am passing my list of target hosts to xargs; kicking off a number of parallel SSH connections (based on the number of CPU cores); logging in with some SSH user name and key; and executing a command as root on the remote systems.
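
To see the fan-out pattern in isolation, here is a minimal sketch that swaps the ssh call for a harmless local stand-in. It assumes GNU xargs (for `-d` and `-P`); the function name `run_parallel` and the "answered" message are made up for illustration.

```shell
# Read one host per line from stdin and run a stand-in command for each,
# with at most $1 processes running at the same time (default: 2).
run_parallel() {
    max_procs="${1:-2}"
    # -d '\n'  : treat every input line as exactly one argument
    # -P N     : keep up to N child processes running in parallel
    # -I{}     : substitute the current line wherever {} appears
    xargs -d '\n' -P "${max_procs}" -I{} \
        sh -c 'sleep 1; printf "%s answered\n" "$1"' _ {}
}

# Both stand-in "connections" run side by side, so this takes about
# one second rather than two. The order of the two lines is not guaranteed.
printf '%s\n' ncc1701 ncc1711 | run_parallel 2
```

Passing the host as a positional argument (`"$1"`) instead of pasting `{}` into the `sh -c` string keeps odd characters in a host name from being interpreted as shell code.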

The issue here is this: the output of the command I am running will have three lines. Because the SSH connections are launched in parallel, the responses will arrive in a random sequence and I will have no way of knowing which line of the output came from which remote host.

 62 /etc/passwd
 61 /etc/shadow
123 total
 48 /etc/passwd
 48 /etc/shadow
 96 total

People usually take care of this problem by having the remote command print the target hostname before its output:

c="wc -l /etc/{passwd,shadow}"
cat host_list.txt | \
xargs -d $'\n' -P$(grep -c proc /proc/cpuinfo) -n1 -I{} \
/usr/bin/ssh -qtT -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i "${ssh_key}" ${ssh_user}@{} \
"sudo su - root -c 'echo "\${HOSTNAME}:"; ${c}; echo'" 2>/dev/null

Now the output is a lot easier to interpret:

ncc1701:
  62 /etc/passwd
  61 /etc/shadow
 123 total

ncc1711:
  48 /etc/passwd
  48 /etc/shadow
  96 total
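
Which machine's hostname gets printed depends on quoting. An unescaped `${HOSTNAME}` inside the double-quoted ssh argument is expanded by the local shell before ssh starts, while `\${HOSTNAME}` is passed through as literal text for the remote shell to expand. A small sketch to preview the string ssh would receive, without connecting anywhere (the variable names `expanded_locally` and `expanded_remotely` are mine):

```shell
# Build the remote command string the same way the ssh argument is built.
c="wc -l /etc/{passwd,shadow}"

# Unescaped: the local shell substitutes its own ${HOSTNAME} right here.
expanded_locally="sudo su - root -c 'echo "${HOSTNAME}:"; ${c}; echo'"

# Escaped: the literal text ${HOSTNAME} survives for the remote side.
expanded_remotely="sudo su - root -c 'echo "\${HOSTNAME}:"; ${c}; echo'"

# Print both strings to compare what would be sent over the wire.
printf 'local : %s\n' "${expanded_locally}"
printf 'remote: %s\n' "${expanded_remotely}"
```

Printing the string first is a cheap way to catch quoting mistakes before a command runs as root on a whole list of servers.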

But because xargs runs the same command on many remote hosts simultaneously, and the time it takes to generate output differs from host to host, it is entirely possible that lines from different hosts will arrive interleaved.

To illustrate this point, I will change the remote command to, first, count the number of lines in /etc/passwd, then sleep for a random number of seconds between 1 and 10, and only then count the number of lines in /etc/shadow:

c="wc -l /etc/passwd; sleep \$(( ( RANDOM % 10 ) + 1 )); wc -l /etc/shadow"
cat host_list.txt | \
xargs -d $'\n' -P$(grep -c proc /proc/cpuinfo) -n1 -I{} \
/usr/bin/ssh -qtT -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i "${ssh_key}" ${ssh_user}@{} \
"sudo su - root -c 'echo "\${HOSTNAME}:"; ${c}; echo'" 2>/dev/null

ncc1701:
62 /etc/passwd
ncc1711:
48 /etc/passwd
48 /etc/shadow

61 /etc/shadow
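
The same interleaving can be reproduced locally without any servers. This is only a simulation: `fake_host` is a hypothetical stand-in for a remote host, and the fixed delays replace the random sleep so the result is repeatable.

```shell
# Stand-in for one remote host: print a header and a first line,
# wait for the given number of seconds, then print a second line
# followed by a blank line.
fake_host() {
    name="$1"
    delay="$2"
    printf '%s:\nfirst line from %s\n' "${name}" "${name}"
    sleep "${delay}"
    printf 'second line from %s\n\n' "${name}"
}

# Start both "hosts" at once; they share the same stdout, just like
# the parallel ssh sessions started by xargs.
simulate() {
    fake_host ncc1701 2 &
    fake_host ncc1711 1 &
    wait
}

simulate
```

Because ncc1711 finishes its wait first, its second line lands between the two lines from ncc1701, and the second line from ncc1701 arrives last.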

Even with a task as simple as counting the lines in /etc/passwd and /etc/shadow, making sense of the interleaved output is impossible. One way around this is to prefix each line of the output with a random value generated once per host. This value can be used to group the output for each host:

c="wc -l /etc/passwd; sleep \$(( ( RANDOM % 10 ) + 1 )); wc -l /etc/shadow"
cat host_list.txt | \
xargs -d $'\n' -P$(grep -c proc /proc/cpuinfo) -n1 -I{} \
/usr/bin/ssh -qtT -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i "${ssh_key}" ${ssh_user}@{} \
"sudo su - root -c 'echo "\${HOSTNAME}:"; ${c}; echo' | awk -v var="\${RANDOM}" '{print var,\$0}'" 2>/dev/null | awk '{first = $1; $1 = ""; print $0; }'

Now the output is always grouped by host:

ncc1711:
48 /etc/passwd
48 /etc/shadow

ncc1701:
62 /etc/passwd
61 /etc/shadow
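
The pipeline above removes the tag without sorting on it. If lines from different hosts still arrive interleaved, a stable sort on the tag before stripping it makes the grouping explicit. This is a sketch of that variation, not the command used above: it assumes a `sort` that supports `-s`, and the helper names and tag values (17204, 3381) are hypothetical.

```shell
# Prefix every input line with the given tag, as the remote awk does.
tag_lines() {
    awk -v var="$1" '{print var, $0}'
}

# Stable sort on the first field brings lines with the same tag together
# while keeping their arrival order; cut then removes the tag.
group_by_tag() {
    sort -s -k1,1 | cut -d' ' -f2-
}

# Two hypothetical hosts whose tagged lines arrive interleaved.
{
    printf 'ncc1701:\n' | tag_lines 17204
    printf 'ncc1711:\n48 /etc/passwd\n48 /etc/shadow\n' | tag_lines 3381
    printf '62 /etc/passwd\n61 /etc/shadow\n' | tag_lines 17204
} | group_by_tag
```

Two hosts can draw the same random number, in which case their lines would be merged into one group. Tagging with the hostname instead of a random value avoids that.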