Having a Jenkins Agent Verify Itself Before Reporting “READY”

As part of building a Jenkins agent AMI in Amazon Web Services, the last plays strip any private keys or authentication tokens off the image. Once it is saved and at rest, no secrets are stored on the image.

At boot, extensive scripting pulls in the secrets, puts them in place, verifies them, and logs the result. Recently a group of developers managed to hit the AWS query rate limit, which is not easy to do. At that point queries are dropped: they don’t fail, they simply get no response at all. When the query rate falls again, the responses come back. I worked through this by finding ways of testing, sleeping, and retesting until the delayed operations succeeded despite the load on the AWS infrastructure.
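The test, sleep, retest approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual boot scripts: the retry function, its arguments, the CALL_TIMEOUT variable, and the commented-out aws call with its secret name are all hypothetical. It also assumes GNU coreutils timeout is available on the agent, so that a query that never gets a response is cut off and retried rather than hanging.

```shell
#!/bin/bash
# Hypothetical helper, not taken from the real boot scripts.
# retry MAX PAUSE CMD [ARGS...]
#   Runs CMD up to MAX times, sleeping PAUSE seconds between attempts.
#   Each attempt is bounded by timeout(1) so a call that never gets a
#   response counts as a failed attempt instead of blocking forever.
#   CMD must be an external command: timeout cannot run shell functions.
retry() {
    local max=$1 pause=$2 attempt=1
    shift 2
    while (( attempt <= max )); do
        # CALL_TIMEOUT (seconds) is an assumed knob, defaulting to 30
        if timeout "${CALL_TIMEOUT:-30}" "$@"; then
            return 0
        fi
        echo "attempt ${attempt}/${max} failed: $*" >&2
        # no point sleeping after the final attempt
        if (( attempt < max )); then
            sleep "${pause}"
        fi
        attempt=$(( attempt + 1 ))
    done
    return 1
}

# Example call (the secret name is a placeholder):
# retry 20 15 aws secretsmanager get-secret-value --secret-id example/github-key
```

A fixed pause keeps the sketch simple. Lengthening the pause after each failure would ease the pressure on the rate limit further, at the cost of a slower boot.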

Those scripts worked, but they took longer to complete. I had slaves booting, reporting ready, and then unable to build because they did not yet have the actual GitHub credentials. I needed a way to make the agent wait until it had completed the at-boot configuration. I found the Slave Setup Plugin.

Here’s the section in the Manage Jenkins -> Configure System UI.

This configuration adds a copy_to_slave directory, which contains a script, check_slave.sh:

 

#! /bin/bash

# check up to MAX times, SLEEP seconds apart (100 x 12s = 20 minutes), for the github access key to be placed

COUNT=1
MAX=100
SLEEP=12

while [[ ${COUNT} -le ${MAX} ]]
do
    if [[ -f /home/ec2-user/.ssh/devopsUserApi ]]; then
        echo "found /home/ec2-user/.ssh/devopsUserApi:  slave has github key..."
        exit 0
    else
        echo "$COUNT:: SLAVE NOT READY: did not find /home/ec2-user/.ssh/devopsUserApi..."
    fi
    ((COUNT++))
    sleep ${SLEEP}
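done

# the key never arrived: exit non-zero so the timeout is reported as a failure, not a success
echo "SLAVE NOT READY: gave up after ${MAX} checks for /home/ec2-user/.ssh/devopsUserApi" >&2
exit 1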

 

In the agent configuration block we have the agent label, plus a second label “apply_setup”, which triggers the setup behavior on the agent.

This works perfectly – here’s a slave startup log:

 

<===[JENKINS REMOTING CAPACITY]===>Remoting version: 3.25
This is a Unix agent
NOTE: Relative remote path resolved to: /home/ec2-user
Evacuated stdout
just before slave ec2-t2xlarge (i-0a192e20b26a3a7c2) gets online ...
executing prepare script ...
setting up slave ec2-t2xlarge (i-0a192e20b26a3a7c2) ...
Copying setup script files from /var/lib/jenkins/copy_to_slave
Executing script '~/check_slave.sh' on ec2-t2xlarge (i-0a192e20b26a3a7c2)
[ec2-user] $ /bin/sh -xe /tmp/jenkins322852224551927571.sh
+ /home/ec2-user/check_slave.sh
1:: SLAVE NOT READY: did not find /home/ec2-user/.ssh/devopsUserApi...
2:: SLAVE NOT READY: did not find /home/ec2-user/.ssh/devopsUserApi...
3:: SLAVE NOT READY: did not find /home/ec2-user/.ssh/devopsUserApi...
found /home/ec2-user/.ssh/devopsUserApi:  slave has github key...
script executed successfully.
slave setup done.
Agent successfully connected and online...

 

— doug