DEV Community

Alexey Melezhik


Ssh-bulk-check - super flexible checks for a group of nodes over ssh.

Sometimes we need to verify that a group of hosts is in the right state. Writing tests takes time. Existing tools like goss or inspec are good, but they lack flexibility. There is always something that is not covered by the provided API.

The new Sparrow6 plugin ssh-bulk-check lets you write check scripts that validate the state of a group of ssh hosts without such limits.

It's extremely flexible and comprehensive because it relies on the Sparrow6 Task Check DSL and plain Bash scripts, which together can cover almost any use case.

Let me show how it works.

Install Sparrow6

zef install https://github.com/melezhik/Sparrow6.git

Create Sparrow6 Repository

export SP6_REPO=file:///tmp/repo
mkdir -p /tmp/repo
s6 --repo_init /tmp/repo
git clone https://github.com/melezhik/sparrow-plugins.git
cd sparrow-plugins && find -maxdepth 2 -mindepth 2 -name sparrow.json -execdir s6 --upload \;

Ssh hosts state

Say, we have a group of nodes, where every node has to have the same state:

  • directory /var/data exists

  • size of /tmp is no more than 1 GB

  • nginx runs with no more than 2 workers.

This is a rather contrived example, but you've probably got the idea.

First, let's create a shell script that runs the test commands:

Cmd.sh

mkdir files

nano files/cmd.sh

echo '==='
echo "check /var/data dir"
ls -d /var/data && echo "/var/data is a directory"
echo "end check"
echo '==='
echo "check /tmp/ dir size"
sudo du -sh /tmp/
echo "end check"
echo '==='
echo "check if nginx is alive"
ps uax| grep nginx| grep -v grep
echo "end check"
echo '==='
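
The human-readable output of du -sh is easy on the eyes but awkward to compare against a limit, because the unit suffix changes with the size. If you are free to change cmd.sh, one option is to ask du for kilobytes and compare plain numbers. The snippet below is only a minimal sketch of that idea; the helper names size_within_limit and dir_size_kb, and the 1 GB limit written as 1048576 KB, are my own illustration and not part of the plugin.

```shell
#!/bin/bash
# Hypothetical helpers, not part of ssh-bulk-check.

# Succeed when the given size in KB is at most the limit in KB.
# The limit defaults to 1 GB expressed as 1048576 KB.
size_within_limit() {
  local size_kb=$1
  local limit_kb=${2:-1048576}
  [ "$size_kb" -le "$limit_kb" ]
}

# Print the size of a directory in KB, as reported by `du -sk`.
dir_size_kb() {
  du -sk "$1" | awk '{ print $1 }'
}

# Possible use inside cmd.sh (commented out so the block has no side effects):
# if size_within_limit "$(dir_size_kb /tmp/)"; then
#   echo "/tmp size is within limit"
# else
#   echo "/tmp size is over limit"
# fi
```

With output like this, the rule in the check file could match a fixed phrase instead of parsing a unit letter.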

Now let's define the check rules that analyze the output of the cmd.sh script. I use the Sparrow6 Task Check DSL with range expressions to tell one check from another:

State.check

between: { 'check /var/data dir' } { end \s+ check }
/var/data is a directory
end:

note: ===

between: { '/tmp/ dir size' } { end \s+ check }
regexp: ^^ \d+(\w+) \s+ '/tmp/'
generator: <<HERE
!perl
if (@{matched()}){
  my $order = capture()->[0];
  print "assert: ", ( $order eq 'G' ? 0 : 1 ), " the size of /tmp dir is less then 1 GB\n";
}
HERE
end:

note: ===

between: { 'check if nginx is alive' } { end \s+ check }
/usr/sbin/nginx -g daemon on; master_process on;
regexp: ^^ 'www-data' \s+ .* \s+ worker \s+ process $$
generator: <<HERE
!perl
if (my $cnt = @{matched()}){
  print "assert: ", ( $cnt <= 2 ? 1 : 0 ), " no more 2 nginx worker launched\n";
}
HERE
end:
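
To make the two ideas in the check file more tangible, here is a rough equivalent in plain shell: first select the lines that sit between a start marker and "end check", then count the lines that look like nginx workers. This is only an illustration of the logic, not how Sparrow6 implements Task Check, and the function names between, count_workers and workers_ok are made up for this sketch.

```shell
#!/bin/bash
# Illustration only: mimics the range and generator logic in plain shell.

# Print the lines of stdin that lie strictly between a line containing
# the start text and the next line containing "end check".
between() {
  awk -v start="$1" '
    inside && index($0, "end check") { inside = 0; next }
    inside                           { print }
    index($0, start)                 { inside = 1 }
  '
}

# Count lines of stdin that look like nginx worker processes owned by www-data.
# `|| true` keeps the exit status clean when grep finds nothing (it still prints 0).
count_workers() {
  grep -c '^www-data .* worker process$' || true
}

# Succeed when the nginx section of stdin holds no more than 2 workers.
workers_ok() {
  local n
  n=$(between 'check if nginx is alive' | count_workers)
  [ "$n" -le 2 ]
}

# Example (commented out): feed it the output of cmd.sh
# bash files/cmd.sh | workers_ok && echo "no more than 2 nginx workers"
```

The DSL version is shorter and reports every matched rule, which is the main reason to prefer it over hand-written shell like this.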

We are almost set. Now let's run all our checks against the ssh hosts. We'll use the Sparrowdo task runner in --localhost mode, because the tests are launched from localhost and reach the hosts through the ssh/sshpass utilities:

Sparrowdo scenario

sparrowfile

#!perl6

task-run "check my hosts", "ssh-bulk-check", %(
  cmd => "files/cmd.sh",
  state => "files/state.check",
  hosts => [ "192.168.0.1" ],
);

To run the test, just say:

sparrowdo --localhost

Here is the result:

20:01:46 04/29/2019 [check my hosts] check host [192.168.0.1]
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check /var/data dir
20:01:46 04/29/2019 [check my hosts] /var/data
20:01:46 04/29/2019 [check my hosts] /var/data is a directory
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check /tmp/ dir size
20:01:46 04/29/2019 [check my hosts] 40K /tmp/
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check if nginx is alive
20:01:46 04/29/2019 [check my hosts] root 1243 0.0 0.0 140628 1500 ? Ss 18:32 0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
20:01:46 04/29/2019 [check my hosts] www-data 1244 0.0 0.0 143300 6264 ? S 18:32 0:00 nginx: worker process
20:01:46 04/29/2019 [check my hosts] www-data 1245 0.0 0.0 143300 6264 ? S 18:32 0:00 nginx: worker process
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] end check host [192.168.0.1]
[task check] ====================================================
[task check] check results
[task check] ====================================================
[task check] stdout match (r) </var/data is a directory> True
[task check] ===
[task check] stdout match (r) <^^ \d+(\w+) \s+ '/tmp/'> True
[task check] <the size of /tmp dir is less then 1 GB> True
[task check] ===
[task check] stdout match (r) </usr/sbin/nginx -g daemon on; master_process on;> True
[task check] stdout match (r) <^^ 'www-data' \s+ .* \s+ worker \s+ process $$> True
[task check] <no more 2 nginx worker launched> True

For this example I use only one host, 192.168.0.1, but we can add as many hosts as we need. The plugin will check them all in one run.
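
Conceptually, checking many hosts boils down to running the same script on every host and collecting the output. The loop below is a minimal sketch of that idea in plain shell, assuming key-based ssh access to the hosts. It is not the plugin's actual implementation, and run_on_hosts and the SSH_CMD override are names I made up for the example.

```shell
#!/bin/bash
# Illustration only, not the ssh-bulk-check implementation.
# Usage: run_on_hosts <script> <host>...
# Feeds the local script to `bash -s` on every host and frames the output
# with the same "check host" markers seen in the report above.
run_on_hosts() {
  local script=$1
  shift
  local ssh_cmd=${SSH_CMD:-ssh}   # SSH_CMD is a hypothetical override, handy for dry runs
  local host failed=0
  for host in "$@"; do
    echo "check host [$host]"
    "$ssh_cmd" "$host" 'bash -s' < "$script" || failed=1
    echo "end check host [$host]"
  done
  return "$failed"
}

# Example (commented out, needs reachable hosts):
# run_on_hosts files/cmd.sh 192.168.0.1 192.168.0.2
```

The plugin takes care of this loop for you, and it also applies the check rules to the collected output, which the sketch does not do.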

It's dead easy to write almost any check with this plugin. Please share your use cases, and I'll be glad to help you apply ssh-bulk-check to them.

