Stateless Jenkins and “checkout scm” Behavior

Once you manage to make Jenkins stateless, you expose assumptions buried throughout Jenkins, all of which presume there is a past, a history, a state.

In Jenkins pipeline jobs the basic checkout statement is “checkout scm”.

In my experience this does the right thing consistently on a Jenkins controller that has been up for a while and has a commit and build history assembled over time. As soon as Jenkins is placed as one of 10 controllers behind an AutoScaling Group and an Elastic Load Balancer (ELB) in AWS, and jobs are created from a notifyCommit sent by GitHub to the address at the front of the ELB, there is no state and that checkout module falls over.

GitHub sends the branch and commit hash information as part of the commit notification.

Jenkins ignores it. If you hack around in the git plugin, you find that the hash and branch are not passed to it. Instead Jenkins examines GitHub and derives the commit hash and branch for itself. If there is no history for the job, the Jenkins git plugin determines the commit hash based on the latest commit to the alphabetically first branch of the repo. Not kidding.

In a buildfarm cloud, any Jenkins controller can disappear at any time – there are no guarantees of a history or past. The jobs are configured by GitHub sending a notifyCommit, which causes the CICD Discover plugin to configure and build a brand new pipeline job with no build history. Jenkins in this config is massively scalable. Because it has no state, peculiarities start to show up, like the initial checkout behavior.

I worked around this. The CICD Discover plugin checks the latest commit hash and, when configuring a pipeline job, restricts the first build to that commit hash. It turns out the Branch Specifier line in the pipeline config can take a commit hash.

Instead of “**”, you can put the full commit hash and restrict the build to exactly that.
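
For illustration only, here is roughly what that pinned Branch Specifier looks like when a job is generated with the Job DSL plugin. That is an assumption made for the sake of a readable example; it is not how the CICD Discover plugin does it. The job name, repo URL and commit hash are placeholders.

```groovy
// Job DSL sketch (assumes the Job DSL plugin; NOT the CICD Discover plugin's code).
// Hypothetical placeholders: job name, repo URL, commit hash.
def repoUrl = 'https://github.com/example-org/example-repo.git'
def pinnedCommit = '0123456789abcdef0123456789abcdef01234567'

pipelineJob('example-repo-pinned') {
    definition {
        cpsScm {
            scm {
                git {
                    remote {
                        url(repoUrl)
                    }
                    // Branch Specifier: a full commit hash instead of "**"
                    branch(pinnedCommit)
                }
            }
            // Where the first checkout looks for the Jenkinsfile
            scriptPath('cicd/pipeline/Jenkinsfile')
        }
    }
}
```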

This works up to a point – Jenkins pipeline jobs have TWO checkouts. The first retrieves the Jenkinsfile (in this case cicd/pipeline/Jenkinsfile from the configured repo URL). The second is the directive inside the Jenkinsfile to check out and build the repo. If the Jenkinsfile uses the built-in “checkout scm” for that, you lose the restriction and are back to building whatever is the latest commit on the alphabetically first branch found. The Branch Specifier restriction does still apply to the first checkout though, giving you a predictable Jenkinsfile – you know which Jenkinsfile will be employed.

The next step was to build a replacement checkout library to be used in the Jenkinsfile, an alternative to “checkout scm”.
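
As a rough idea of what an explicit checkout can look like, here is a minimal sketch, not the replacement library itself. It assumes the job is created with two string parameters, REPO_URL and PINNED_COMMIT (hypothetical names), and that PINNED_COMMIT holds a full commit hash. It spells out the second checkout instead of leaving Jenkins to infer it, then verifies the result.

```groovy
// Jenkinsfile sketch (scripted pipeline, git plugin installed).
// Assumption: the job defines string parameters REPO_URL and PINNED_COMMIT
// (hypothetical names) at the time it is created.
node {
    stage('Checkout') {
        // Explicit checkout: nothing is inferred from job history.
        checkout([
            $class: 'GitSCM',
            branches: [[name: params.PINNED_COMMIT]],
            userRemoteConfigs: [[url: params.REPO_URL]]
        ])
    }
    stage('Verify') {
        // Fail fast if the workspace is not at the requested commit.
        def head = sh(script: 'git rev-parse HEAD', returnStdout: true).trim()
        if (head != params.PINNED_COMMIT) {
            error("Expected ${params.PINNED_COMMIT} but checked out ${head}")
        }
    }
}
```

The verify stage is there because a silent checkout of the wrong commit is exactly the failure being worked around, so it seems worth making it loud.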

— doug