Intuitive Engineering https://dougmunsinger.com Doug Munsinger

terraform and personal websites… https://dougmunsinger.com/2024/10/19/terraform-and-personal-websites/ Sat, 19 Oct 2024 20:19:47 +0000 https://dougmunsinger.com/?p=2602

I’ve been neglecting this site since December 2019.

Around that time I went through yet another change in ownership of the company I worked for, or at least a change in the makeup of the organization.

I started at a company named Jumptap, which was almost immediately acquired by Millennial Media, which was then acquired a couple of years later by AOL, which was actually owned by Verizon. Three companies, same desk…

A year or so after that, Verizon acquired Yahoo, and attempted to combine AOL and Yahoo into a new company named Oath. Around this point, I and the team I was working with found ourselves increasingly sidelined, as Yahoo more or less consumed the AOL personnel, either assimilating them into Yahoo-oriented structures or placing them in positions where the sane thing was to leave. Many of the AOL people did leave.

Here’s the thing. AOL was one of the best run, sanest management teams I’ve ever worked for. I was surprised. But it’s true. They were East-coast (based in Dulles, Virginia) and not consumed by the Silicon Valley management bullshit.

Once Yahoo came into the picture, with 9000 headcount, AOL’s 4000 headcount was overwhelmed, and the Oath management style became Silicon Valley: self-centered, self-congratulatory, certain of their ability despite facts – income, profitability, share price – showing they were incompetent.

I stayed until I was laid off with a severance, went to a sports-oriented company, bailed to an insurance start-up, and finally wound up at Toast.

Right in time for Toast to lay off half their staff as the pandemic hit their customers, restaurants, hard.

I survived that. And it is nice to be working for a company that has definitive positive impact. I could never say that for the media/ad companies – when someone asked what my company did, I said, you know those annoying ads that pop up and follow you from device to device? Where if you search for underwear once, it follows you around for months after?

That’s who I work for.

I read after I left that Oath got renamed to Verizon Media. And after that, Verizon dumped Yahoo back on its own at a loss. And AOL is gone, except for the remnants of email addresses some people still have.

I’ve had servers at Digital Ocean for quite a while, about 8 years now. I’ve been ok with it, but I’ve been working extensively with terraform and AWS at Toast, and I determined to leave Digital Ocean and instantiate my personal websites in AWS.

That’s part of why this website/blog is now coming back to life.

It’s been moved, reconstituted, the posts imported.

single source of truth (env vars) https://dougmunsinger.com/2019/12/11/single-source-of-truth-env-vars/ Wed, 11 Dec 2019 18:40:28 +0000 http://dougmunsinger.com/?p=2590

I came into a new project recently. One of the challenges was that the CICD pieces almost worked for developers, but fell short, and because of that all of the QA was being done on the developer’s local laptop.

It works on my local… Actually and in fact.

Each developer would announce to the group new environmental variables required to run the app, and the developers present would add that env var to their personal .bashrc. Any developer not present, or missing the announcement of the change in slack or by email, would discover it on their own when their local stopped working after pulling down the latest code.

At some point this had to stop. The env var sets were used in docker-compose, to bring env vars into the container image; in gitlab cicd pipelines, to provide the env vars for tests, image builds, and deploys to s3; and in deployments to Kubernetes.

The resolution was to have a single source of truth for env vars, one for each environment, and then to create ansible templates that generate docker-compose files when building images, env var files to source in gitlab CICD pipelines, and the deployment yaml for kubernetes.

I created a common directory to hold the ansible plays and templates, repo/common. Inside that directory are the ansible plays and templates for gitlab environmental vars, docker-compose creation, and creating the deployment yaml for kubernetes.

environmental vars for gitlab-ci.yml jobs

[python]

dsm_macbook:common mm26994 $ cat annuity_create_gatsby_vars.j2
#!/bin/bash

# local env var file to source for gatsby build process
# brings in the same env var file as the kubernetes deployment

{% for environ in app_environs -%}
export {{ environ['envname'] }}='{{ environ['envvalue'] }}'
{% endfor -%}

[/python]

I love jinja templating. This looping construct takes each entry in the environmental vars single source of truth file and puts it into a .profile or .bashrc type file which can then be sourced inline in the gitlab cicd job to instantiate the env vars in the cicd shell.
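For a concrete picture, the rendered file ends up as a plain list of exports – a sketch using the first few entries from the uatjr vars file shown further down:

[python]

# rendered gatsby vars file (illustrative values from the uatjr single source of truth)
export ENV_NAME='uatjr'
export QUILT_PORT='4000'
export QUILT_ENDPOINT='/graphme'

[/python]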

The play is:
[python]


# ansible playbook to populate env vars file to source for gatsby

- hosts: localhost
  become: yes
  become_user: root

  tasks:
    # vars are scoped as variable_name
    - name: annuity create gatsby vars | pull in vars
      include_vars:
        file: '{{ ENV_VAR_FILE }}'

    # create the env vars file from template
    # template references variables from imported file
    # edits are made in that file
    - name: annuity create gatsby vars | template the gatsby envvars source file
      template:
        src: ../common/annuity_create_gatsby_vars.j2
        dest: '{{ ENVVAR_FILE_GATSBY }}'
        mode: 0755
    # local file <environment>_env_var is available
dsm_macbook:common mm26994 $

[/python]

This imports the env var file, which is passed in from the gitlab-ci.yml job script as ansible-playbook --extra-vars "vars are defined in here".
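A sketch of the calling side – the play path, vars-file path and output name here are assumptions, not the actual pipeline:

[python]

# inside a .gitlab-ci.yml job's script: block (shown as the shell commands the job runs)
ansible-playbook common/annuity_create_gatsby_vars.yml \
  --extra-vars "ENV_VAR_FILE=uatjr/uatjr_env_vars.yml ENVVAR_FILE_GATSBY=./gatsby_env_vars.sh"

# source the rendered file so the env vars exist in the CICD shell for the build
. ./gatsby_env_vars.sh

[/python]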

The env file looks like

[python]

dsm_macbook:uatjr mm26994 $ cat uatjr_env_vars.yml

app_environs:
  - envname: 'ENV_NAME'
    envvalue: 'uatjr'

  - envname: 'QUILT_PORT'
    envvalue: '4000'

  - envname: 'QUILT_ENDPOINT'
    envvalue: '/graphme'

  etc...

app_certs:
  - envname: 'PUBLIC_CERT'
    envvalue: |-
      -----BEGIN CERTIFICATE-----
      multiline key hash...
      -----END CERTIFICATE-----
  - envname: 'PRIVATE_KEY'
    envvalue: |-
      -----BEGIN RSA PRIVATE KEY-----
      multiline key hash...
      -----END RSA PRIVATE KEY-----

REPLICAS: '3'
DEPLOYMENT_NAME: 'annuity-uat-deploy'
CONTAINER_NAME: 'annuity-uat-cont'

[/python]

docker-compose

The same pattern works for docker-compose files.

The template – actually there are two different templates, one for production and one for everything else.

[python]

dsm_macbook:common mm26994 $ cat annuity_create_docker-compose.dev.j2
version: '3.0'

services:
  postgres:
    image: postgres
    restart: always
    ports:
      - '5432:5432'
    environment:
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: password
      POSTGRES_DB: annuity
    logging:
      driver: awslogs
      options:
        awslogs-group: /ecs/annuity-dev
        awslogs-region: aws_region
        awslogs-stream-prefix: annuity

  annuity:
    image: ecr repo address:${CI_COMMIT_SHORT_SHA}
    build:
      context: ../../
      dockerfile: docker/dev/Dockerfile
    restart: 'no'
    environment:
      {% for environ in app_environs -%}
      {{ environ['envname'] }}: '{{ environ['envvalue'] }}'
      {% endfor -%}
      # throwaway line to buffer indent artifacts...
    depends_on:
      - postgres
    ports:
      - '3400:3400'
      - '8080:8080'
    logging:
      driver: awslogs
      options:
        awslogs-group: /ecs/annuity-dev
        awslogs-region: us-east-2
        awslogs-stream-prefix: annuity
    entrypoint: run

[/python]

and the play…

[python]

dsm_macbook:common mm26994 $ cat annuity_create_docker-compose.yml

# ansible playbook to populate docker-compose.yml template

- hosts: localhost
  become: yes
  become_user: root

  tasks:
    # vars are scoped as variable_name
    - name: annuity create docker-compose.yml | pull in vars
      include_vars:
        file: '{{ ENV_VAR_FILE }}'

    # create the docker-compose.yml file from template
    # template references the variable_name from imported file
    # edits are made in that file
    - name: annuity create docker-compose.yml | template the deployment
      template:
        # annuity_create_docker-compose.master.j2 (master)
        # annuity_create_docker-compose.dev.j2 (dev)
        src: '{{ DOCKER_COMPOSE_TEMPLATE }}'
        # DOCKER_COMPOSE_FILE will be the relative path to drop the docker-compose.yml file,
        # e.g., ../../docker/dev/docker-compose.yml
        dest: '{{ DOCKER_COMPOSE_FILE }}'
        mode: 0644
    # this will be called in .gitlab-ci.yml, after cd /docker/[master|dev]
    # so creates a local docker-compose.yml

[/python]

There is a different template for production, as that has slightly different pathing and requirements.

kubernetes deployment

Same pattern

[python]

dsm_macbook:common mm26994 $ cat annuity_create_deployment.j2
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ DEPLOYMENT_NAME }}
spec:
  replicas: {{ REPLICAS }}
  selector:
    matchLabels:
      app: annual
  template:
    metadata:
      labels:
        app: annual
    spec:
      containers:
        - name: {{ CONTAINER_NAME }}
          image: ECR repo:{{ DEPLOYMENT_IMAGE }}
          imagePullPolicy: Always
          env:
            {% for environ in app_environs -%}
            - name: {{ environ['envname'] }}
              value: '{{ environ['envvalue'] }}'
            {% endfor -%}
            {% for cert in app_certs -%}
            - name: {{ cert['envname'] }}
              value: {{ cert['envvalue'] }}
            {% endfor -%}
            # this line buffers indent artifact...

          command: ['/bin/sh']
          args: ['-c', 'start:server']
          ports:
            - name: serverport
              containerPort: 3400
            - name: restport
              containerPort: 7070

[/python]

single source of truth

From this, devs can edit a single file and the values are distributed as needed out into gitlab, docker-compose and kubernetes.
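As a recap, the whole fan-out for one environment is three playbook runs against the same vars file – a sketch with assumed paths and extra-vars (only the play, template and variable names shown above are real; the deployment play's output path is omitted):

[python]

ENV=uatjr
VARS=${ENV}/${ENV}_env_vars.yml

# env var file to source in the gitlab CICD shell
ansible-playbook common/annuity_create_gatsby_vars.yml \
  --extra-vars "ENV_VAR_FILE=${VARS} ENVVAR_FILE_GATSBY=./${ENV}_gatsby_vars.sh"

# docker-compose.yml for image builds
ansible-playbook common/annuity_create_docker-compose.yml \
  --extra-vars "ENV_VAR_FILE=${VARS} DOCKER_COMPOSE_TEMPLATE=../common/annuity_create_docker-compose.dev.j2 DOCKER_COMPOSE_FILE=../../docker/dev/docker-compose.yml"

# kubernetes deployment yaml
ansible-playbook common/annuity_create_deployment.yml \
  --extra-vars "ENV_VAR_FILE=${VARS}"

[/python]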

— Doug

ECS Structure via Boto3 https://dougmunsinger.com/2019/06/27/ecs-structure-via-boto3/ Thu, 27 Jun 2019 16:59:15 +0000 http://dougmunsinger.com/?p=2567

This starts with an abstracted config file…

[python]
### variables (abstract these further and pull in as a separate file - then create a new file for a new environment)
region = 'aws region'
cluster_name = 'cluster'
amiid = 'ami-123456789123'
instance_type = 't2.medium'
key_name = 'aws key name'
subnet_id_1 = 'subnet-123456789123'
subnet_id_2 = 'subnet-123456789125'
sec_group = 'sg-123456789123'
user_data_script = """#!/bin/bash
CLUSTER='cluster'
echo "ECS_CLUSTER=${CLUSTER}" >> /etc/ecs/ecs.config"""
instance_profile_arn = 'arn:aws:iam::accountid:instance-profile/aws-role-name'
container_name = 'ecs container name'
task_family = 'task family'
docker_image = 'accountid.dkr.ecr.us-east-1.amazonaws.com/repo_name:git-short-commit-hash'
env_name = 'production'
task_port = 3030

# checking containers after creating...
wait_time = 22 # pause 22 seconds
check_retries = 42 # test 42 times

# internet-facing load balancer
elb_name = 'name for external elb'
elb_subnet_1 = 'subnet-123456789126'
elb_subnet_2 = 'subnet-123456789127'
elb_subnet_3 = 'subnet-123456789128'
elb_sec_group = 'sg-123456789567'

target_group_name = 'target name'
vpc_id = 'vpc-123456789890'
certificate_arn = 'arn:aws:acm:us-east-1:accountid:certificate/amazon-random-number'
service_desired_count = 2
service_name = 'name for ecs service'

# service rollover tuning
max_percent = 100
min_healthy_percent = 50
[/python]

Which then gets imported as “cfg.variable” and used in the script…

[python]

#!/bin/python

import sys, boto3, logging, json, time

# bring in config file
import prod_sso_api_ecs_config as cfg

# this comes after populating ECR with the docker image, by hand at first and then through jenkins once all this comes together...
# add id_rsa key to access bitbucket
# git clone <repo url>
# cd repo
# docker build -t catapult:<app> .
# get the git shorthash
# git rev-parse HEAD
# docker tag image
# aws ecr get-login --region us-east-1
# then enter the returned login creds - MAY have to remove "-e none" from return
# docker push image
# then add_tags <shorthash> to tag as app too
# now the initial image to work with is populated

# script to create
# cluster
# container instances
# task definition
# elb
# listener
# target group
# then service

def main():
    """
    create the cluster, instances, task definition, elb, target group, listener and service in order
    """
    result = str(create_cluster())
    print("result create_cluster: " + result)
    instance_1, instance_2 = create_instances()
    # I need to save each instance
    print("instances:")
    print(instance_1)
    print(instance_2)
    check_instances()
    task_def_arn = create_task_def()
    print("result of create_task_def: " + str(task_def_arn))
    lb_arn = create_elb()
    print("result of create_elb: " + str(lb_arn))
    target_group_arn = create_target_group()
    print("result of create_target_group: " + str(target_group_arn))
    register_response = register_targets(instance_1, instance_2, target_group_arn)
    print("result of register targets: " + str(register_response))
    listener_response = create_listener(target_group_arn, lb_arn)
    print("result of create_listener: " + str(listener_response))
    service_response = create_service(task_def_arn, target_group_arn, cfg.container_name, cfg.task_port)
    print("result of create_service: " + str(service_response))

def create_cluster():
    """
    create new cluster
    """
    client = boto3.client('ecs', region_name=cfg.region)
    response = client.create_cluster(
        clusterName=cfg.cluster_name
    )
    return response

def create_instances():
    """
    create container instances for cluster
    this requires an ECS aware amazon image
    """
    ec2 = boto3.client('ec2', region_name=cfg.region)
    instance_1 = ec2.run_instances(
        ImageId=cfg.amiid,
        MinCount=1,
        MaxCount=1,
        InstanceType=cfg.instance_type,
        KeyName=cfg.key_name,
        SubnetId=cfg.subnet_id_1,
        SecurityGroupIds=[
            cfg.sec_group,
        ],
        IamInstanceProfile={
            'Arn': cfg.instance_profile_arn,
        },
        UserData=cfg.user_data_script,
        TagSpecifications=[
            {
                'ResourceType': 'instance',
                'Tags': [
                    {
                        'Key': 'Name',
                        'Value': cfg.container_name
                    },
                ]
            },
        ]
    )
    instance_2 = ec2.run_instances(
        ImageId=cfg.amiid,
        MinCount=1,
        MaxCount=1,
        InstanceType=cfg.instance_type,
        KeyName=cfg.key_name,
        SubnetId=cfg.subnet_id_2,
        SecurityGroupIds=[
            cfg.sec_group,
        ],
        IamInstanceProfile={
            'Arn': cfg.instance_profile_arn,
        },
        UserData=cfg.user_data_script,
        TagSpecifications=[
            {
                'ResourceType': 'instance',
                'Tags': [
                    {
                        'Key': 'Name',
                        'Value': cfg.container_name
                    },
                ]
            },
        ]
    )
    id_1_raw = extract_values(instance_1, 'InstanceId')
    id_1 = id_1_raw[0]
    id_2_raw = extract_values(instance_2, 'InstanceId')
    id_2 = id_2_raw[0]
    return id_1, id_2

def check_instances():
    """
    check for instances up and registered
    """
    count = 0
    ecs_client = boto3.client('ecs', region_name=cfg.region)
    while count < cfg.check_retries:
        container_instances_response = ecs_client.list_container_instances(
            cluster=cfg.cluster_name
        )
        # isolate containerInstanceArns
        containerarns = container_instances_response['containerInstanceArns']
        if len(containerarns) == 2:
            print("found instances")
            return 0
        else:
            print(len(containerarns))
            count = count + 1
            time.sleep(cfg.wait_time)
    else:
        print("failed to find instances spun up for container use...")
        return 1

def create_task_def():
    """
    create a task definition
    """
    client = boto3.client('ecs', region_name=cfg.region)
    response = client.register_task_definition(
        family=cfg.task_family,
        containerDefinitions=[
            {
                'name': cfg.container_name,
                'image': cfg.docker_image,
                'cpu': 0,
                'memory': 1024,
                'essential': True,
                'environment': [{
                    'name': 'NODE_ENV',
                    'value': cfg.env_name
                }],
                'portMappings': [
                    {
                        'containerPort': cfg.task_port,
                        'hostPort': cfg.task_port,
                        'protocol': 'tcp'
                    },
                ]
            }
        ]
    )
    task_def_arn_list = extract_values(response, 'taskDefinitionArn')
    task_def_arn = task_def_arn_list[0]
    return task_def_arn

def create_elb():
    """
    create elbv2 load balancer
    """
    conn = boto3.client('elbv2', region_name=cfg.region)
    response = conn.create_load_balancer(
        Name=cfg.elb_name,
        Subnets=[cfg.elb_subnet_1, cfg.elb_subnet_2, cfg.elb_subnet_3],
        SecurityGroups=[cfg.elb_sec_group],
        Scheme='internet-facing'
    )
    lb_arn_raw = extract_values(response, 'LoadBalancerArn')
    lb_arn = lb_arn_raw[0]
    return lb_arn

def create_target_group():
    """
    create target group
    """
    conn = boto3.client('elbv2', region_name=cfg.region)
    response = conn.create_target_group(
        Name=cfg.target_group_name,
        Protocol='HTTP',
        Port=cfg.task_port,
        VpcId=cfg.vpc_id,
        HealthCheckProtocol='HTTP',
        HealthCheckPort=str(cfg.task_port),
        HealthCheckPath='/health',
        HealthCheckIntervalSeconds=12,
        HealthCheckTimeoutSeconds=10,
        HealthyThresholdCount=3,
        UnhealthyThresholdCount=3,
        Matcher={'HttpCode': '200'})
    target_group = response.get('TargetGroups')[0]
    # target_group_arn is returned to main
    # and reused to register targets below
    target_group_arn = target_group['TargetGroupArn']
    return target_group_arn

def register_targets(id_1, id_2, target_group_arn):
    """
    register containers as targets
    """
    conn = boto3.client('elbv2', region_name=cfg.region)
    response = conn.register_targets(
        TargetGroupArn=target_group_arn,
        Targets=[
            {
                'Id': id_1,
                'Port': cfg.task_port,
            },
            {
                'Id': id_2,
                'Port': cfg.task_port,
            },
        ])
    return response

def create_listener(target_group_arn, lb_arn):
    """
    create a listener
    """
    conn = boto3.client('elbv2', region_name=cfg.region)
    response = conn.create_listener(
        LoadBalancerArn=lb_arn,
        Protocol='HTTPS',
        Port=443,
        Certificates=[
            {'CertificateArn': cfg.certificate_arn}],
        DefaultActions=[{'Type': 'forward', 'TargetGroupArn': target_group_arn}]
    )
    return response

def create_service(task_def_arn, target_group_arn, container_name, task_port):
    """
    create the service
    """
    client = boto3.client('ecs', region_name=cfg.region)
    response = client.create_service(
        cluster=cfg.cluster_name,
        serviceName=cfg.service_name,
        taskDefinition=task_def_arn,
        desiredCount=cfg.service_desired_count,
        loadBalancers=[
            {
                'targetGroupArn': target_group_arn,
                'containerName': cfg.container_name,
                'containerPort': cfg.task_port
            },
        ],
        deploymentConfiguration={
            'maximumPercent': cfg.max_percent,
            'minimumHealthyPercent': cfg.min_healthy_percent
        },
        healthCheckGracePeriodSeconds=30,
        launchType='EC2',
    )
    return response

# extract value given key from boto3 return
def extract_values(obj, key):
    """Pull all values of specified key from nested JSON."""
    arr = []

    def extract(obj, arr, key):
        """Recursively search for values of key in JSON tree."""
        if isinstance(obj, dict):
            for k, v in obj.items():
                if isinstance(v, (dict, list)):
                    extract(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(obj, list):
            for item in obj:
                extract(item, arr, key)
        return arr

    results = extract(obj, arr, key)
    return results

if __name__ == '__main__':
    main()

[/python]

— doug
Essential Management https://dougmunsinger.com/2019/06/20/essential-management/ Thu, 20 Jun 2019 17:01:17 +0000 http://dougmunsinger.com/?p=2560

Good managers are gems.

Especially technical managers.

One of the things that happens as careers continue is competent people are offered the opportunity to manage. First at the project and team level and then as a more formal position with formal direct reports and hierarchy.

A really good technical sysadmin or programmer may or may not become a good manager.

The first thing they need to realize is they are not already competent at management despite being a technical God and a nice person. Whatever that means.

Management is a very specific skill. Technical people thrown into this fail without really grasping that this is NOT a technical problem, and that technical skills don’t help here.

The first hurdle is really getting this is not the same skill, and realizing you have a bit to learn to do it well.

The second is to let your technical expertise be secondary to your team’s expertise. Let your people do what they are good at and were hired to do. Trust your people. Let go; do not micromanage and try to steer the solutions exactly as they would have been if you did it yourself. You are not doing it yourself anymore – this is your team instead. Let them do good work as much on their own as you can.

Push barriers aside. Remove meetings, push away other managers’ distractions. Protect and nurture your people.

Be mindful of maker time vs manager time. Let your people decide if they attend any meetings scheduled after say 10 AM. And really give them the ability to say no.

Remove distractions, point the team and stand aside out of the way. And thank god for the “Individual Contributor” job path because hell for me would be management, unless I own the company.

Just sayin.

— doug
boto3 (AWS & Python) https://dougmunsinger.com/2019/06/03/boto3-aws-python/ Mon, 03 Jun 2019 14:33:25 +0000 http://dougmunsinger.com/?p=2536

I’ve worked through the Amazon AWS CLI commands to create and then to deploy an ECS cluster using ec2 container instances and an Application Load Balancer to deploy a docker nodejs app.

In bash. Because the straightest line between command line and automation with no wasted motion is bash. Once.

But if you want a supportable script that can be extended, read in six months and understood, this can’t really stay bash. The only people who write bash with respect are sysadmins. System Ops. But DevOps requires developers to be able to read, extend and support scripting at times, and the nature of bash means reusable, modular code takes a lot of hoops to emulate. Better to move to python.

AWS boto3 has, once you start crafting the commands to use it, complete access to anything available directly through the command line. The first boto3 function I needed looked for an AMI with a regex name string and a tag Status set to “Active”:

def find_ecs_ami(region):
    """
    AWSCLI:
    IMG_ID=`aws ec2 describe-images --filters "Name=name,Values=catapult-devops-ecs-container-*" \
        "Name=tag:Status,Values=Active" --region us-east-1 | jq '.Images[].ImageId' \
        | sed 's/\"//g' | tr -d '\040\011\012\015'`
    name is catapult-devops-ecs-container-*
    tag is Status:Active
    That's not going to change...
    """
    ec2 = boto3.resource('ec2', region_name=region)
    image = list(ec2.images.filter(Filters=[{'Name':'name', 'Values':['catapult-devops-ecs-container-*']},{'Name':'tag:Status', 'Values':['Active']}]).all())
    ami = str(image[0]) # cast first element of returned list (only element with filter) as string
    ami_arr = ami.split('=') # split the string on '='
    ami_id = ami_arr[1] # take the ami id with quote and ')'
    ami_id = ami_id.replace("')", '')
    ami_id = ami_id.replace("'", '')
    return(ami_id)

 

This took a bit to work through – I had to discover how to read the boto3 docs, which are complete but not immediately intuitive, and find enough examples to decipher the docs long enough to get it to work. Eventually I got

ami-09019dc02aa3167ba

The functions after that would be create_cluster(), then create_instances(), then create_task_def(), create_elb(), create_listener(), create_target_group(). I’ve got create_instances() and create_cluster() working. Once you get the boto3 docs, the implementation is very reusable, way more so than bash…

— doug

P.S.: I’ve used the Umbrella Corp logo tongue-in-cheek for devops projects because devops really is the stuff of life itself…

Points from Experience https://dougmunsinger.com/2019/05/08/points/ Wed, 08 May 2019 17:12:38 +0000 http://dougmunsinger.com/?p=2517

I just had a conversation with a college student, first year, studying computer and software engineering and looking for a broad overview of technology and the field’s past and future. Out of that conversation I’m highlighting some stable pieces of data that have held true over time.

  • automation – automate as you go. There’s a lot of up-front work that adds perhaps 66% to the tasks as you begin this but it catches up and makes everything much faster in the long run. In addition, you have complete repeatability and documentation of your environs.
    for example, this morning my boss asked me to deploy a docker container. I had just done the work to port over a slightly different docker build over the last several days. Porting this particular container over took 10 minutes.
    The first one took 3 days. So there’s that.
  • Maker Time vs. Manager Time. One solution I found at one company: meetings before 11 AM can be scheduled by any manager without consulting, but any meeting after that has to be agreed to by the Maker, and they can decline. When you are working in code you hold a mental model of the construct in your head – the quick meeting disrupts that, costing 20 minutes or even hours to rebuild before forward progress resumes. Managers miss this, constantly.
  • Language patterns – for loop, while loop, arrays, data structures, if-elif-else, dictionaries – are way more important and useful than the individual language syntax. A conceptual understanding of the structure of code, rather than the nuances of a particular language. Additionally, expect AI to change languages and IDEs and coding tremendously in the next ten years. Much of the detail of syntax and language is not shown by color emphasis in IDEs – more direct syntax handling would be completely possible with AI. And I expect an AI language, a coding of code.
  • Keep your resume up to date and use it to gauge how well your career is going – if you aren’t doing anything in your current job that adds interesting accomplishments to the resume, reach for more interesting work where you are or find another job.
  • Only about the last three years of your experience will be currently useful to you. Everything before that will have changed…
  • Document your code as you go, with detailed comments on what you were thinking and where that obscure variable derived from.
  • Infrastructure as code – the value in this is repeatability, and a written record of exactly how that server or application was constructed.
  • Humans are terrible at repetitive tasks. Most are anyway, you’ll find the occasional (really valuable) person who can be counted on to do the repetitive tasks like a metronome, but those people are seriously rare. Automated systems avoid that.
  • Managing your manager and human interactions that wrap around your job are going to be critical for you.
  • Fan that spark of excitement. When you find that in what you are doing, cherish it, make note of it, and aim to make it as much a part of your career as possible.

— doug

SSH Port Forwarding or Ad Hoc VPN https://dougmunsinger.com/2019/05/03/ssh-port-forwarding-or-ad-hoc-vpn/ Fri, 03 May 2019 17:46:31 +0000 http://dougmunsinger.com/?p=2504

ssh -i localkey -L local_port:localhost:remote_port user@ip

I had to look this up again. I haven’t had to use this in a while, maybe five years? I also wanted multiple ports forwarded, and that works like

ssh -i localkey -L local_port:localhost:remote_port -L local_port:localhost:remote_port user@ip

And… jenkins host inside remote VPC:

ssh -i catalyst -L 8080:localhost:18080 centos@bastion-host-ip

So, from my computer a connection to localhost:8080 forwards to my bastion host at 18080…

Then the bastion host forwards 18080 locally to my private addressed jenkins host at 8080…

ssh -i .ssh/catalyst -L 18080:localhost:8080 centos@private-addressing-for-jenkins-host

and…
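On a reasonably current OpenSSH (7.3 or later) the two hops can also be collapsed into a single command with -J (ProxyJump) – a sketch assuming the same bastion and jenkins hosts as above:

ssh -i .ssh/catalyst -J centos@bastion-host-ip -L 8080:localhost:8080 centos@private-addressing-for-jenkins-host

localhost:8080 on my laptop then lands directly on the jenkins host’s 8080, with no intermediate 18080 to keep track of.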

Simple. Workable. When networking and firewalls and access lists and routing, oh my, stand in the way and need sorting, there’s always a bastion host and ssh…

— doug
Blue Green w/S3, Cloudfront, Route53 https://dougmunsinger.com/2019/05/03/blue-green-w-s3-cloudfront-route53/ Fri, 03 May 2019 16:31:08 +0000 http://dougmunsinger.com/?p=2500

I tend to code and architect devops with an eye toward NOT being locked into any particular cloud or service.

Netsaint -> Nagios -> Icinga

Hudson -> Jenkins

VMWare -> Vagrant -> Docker -> Kubernetes -> ECS

Everything changes. That ideal cloud you are moving on to right now will change in five years – probably enough in ten years that part of the reason you will stay with it is simply the code base already written that works with that solution. In other words, inertia. That drops the agility of your company to compete. So, don’t.

If there is an alternative, one that doesn’t lock you into that environment, do it. Between RDS and running mysql/MariaDB on instances, unless there is a compelling case for RDS, run mysql/MariaDB. Between cloud services and running it in the cloud but under your design and control, where possible run it yourself. Unless there is a compelling reason that offsets the future pain of migration when it becomes necessary, don’t move toward lock in.

That said, I work right now for a company that is all in on AWS. (For now.) And one of the products is a javascript app deployed to a static website on s3, pushed out to Cloudfront and then made accessible through route 53. Route 53 also makes a reasonable blue-green deployment possible, in that route 53 can directly alias an AWS resource as the target of an A record. The switch in target aliases in route 53 causes an instant change within AWS, and from there DNS propagation flows out to complete the change.

You can do Lambda@Edge, which adds code and intelligence to the Cloudfront piece itself, but that’s brand new (therefore suspect) (shiny, but suspect). More code, more lock in. So – simpler, and workable is fine for this product.

It starts with a pipeline job in Jenkins which builds the javascript and then pushes it out to the inactive bucket of a blue-green pair. I have 6 of these – Dev blue-green, Prod blue-green, QA blue-green. Each bucket gets tagged, “active” (serving code and the app), “inactive” (NOT serving code, but ready to deploy as reversion), and “hold” (deployed, not yet ever active). The initial tag is “inactive”.

A deploy is made by

  • build the new code
  • push that code out to the inactive bucket
  • tag that bucket “hold”
  • remove the Alias CNAME from the blue (active) bucket Cloudfront distribution
  • add that Alias CNAME to the green (hold) Cloudfront distribution
  • point the Target Alias for the A record in route 53 from the blue Cloudfront url to the green cloudfront url
  • green bucket goes to tag “active”, blue goes to “inactive”
  • verify the deploy (a quick check like the sketch just below)
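A minimal verification sketch – the hostname is a placeholder for whatever the route 53 A record is, and the tag check just confirms the buckets flipped:

# hit the site through the A record and expect a 200 from the freshly-aliased distribution
curl -sI https://your-a-record-name | head -n 1

# confirm the code_state tags flipped - the newly active bucket should show "active"
aws s3api get-bucket-tagging --bucket dashboard-dev-blue.fqdn --output text
aws s3api get-bucket-tagging --bucket dashboard-dev-green.fqdn --output text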

All of this is done through AWS cli and bash scripts. I keep promising myself to write in something other than bash, but for this exact manipulation, especially while proving it out, bash rocks. It is native to Linux, accessible, directly runs the AWS commands exactly as they would execute on the command line. It’s ugly – but so was perl and at one point much of the world wide web was running on perl. So deal.

Pushing code out to s3…

#! /bin/bash

echo "dev buckets..."
DEVGREEN=`aws s3api get-bucket-tagging --bucket dashboard-dev-green.fqdn --output text | grep code_state | awk '{ print \$3 }'`
DEVBLUE=`aws s3api get-bucket-tagging --bucket dashboard-dev-blue.fqdn --output text | grep code_state | awk '{ print \$3 }'`
QAGREEN=`aws s3api get-bucket-tagging --bucket dashboard-qa-green.fqdn --output text | grep code_state | awk '{ print \$3 }'`
QABLUE=`aws s3api get-bucket-tagging --bucket dashboard-qa-blue.fqdn --output text | grep code_state | awk '{ print \$3 }'`
PRODGREEN=`aws s3api get-bucket-tagging --bucket dashboard-green.fqdn --output text | grep code_state | awk '{ print \$3 }'`
PRODBLUE=`aws s3api get-bucket-tagging --bucket dashboard-blue.fqdn --output text | grep code_state | awk '{ print \$3 }'`
DATE=`date +%Y%m%d%H%M%S`
if [[ ($DEVBLUE == 'inactive') || ($DEVBLUE == 'hold') ]]; then
    echo "deploying to DEVBLUE bucket..."
    aws s3 rm s3://dashboard-dev-blue.fqdn --recursive
    aws s3 cp ./build-dev s3://dashboard-dev-blue.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-dev-blue.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
elif [[ ($DEVGREEN == 'inactive') || ($DEVGREEN == 'hold') ]]; then
    echo "deploying to DEVGREEN bucket..."
    aws s3 rm s3://dashboard-dev-green.fqdn --recursive
    aws s3 cp ./build-dev s3://dashboard-dev-green.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-dev-green.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
else
    echo "neither DEVGREEN nor DEVBLUE are  deployable..."
fi
if [[ ($QABLUE == 'inactive') || ($QABLUE == 'hold') ]]; then
    echo "deploying to QABLUE bucket..."
    aws s3 rm s3://dashboard-qa-blue.fqdn --recursive
    aws s3 cp ./build-qa s3://dashboard-qa-blue.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-qa-blue.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
elif [[ ($QAGREEN == 'inactive') || ($QAGREEN == 'hold') ]]; then
    echo "deploying to QAGREEN bucket..."
    aws s3 rm s3://dashboard-qa-green.fqdn --recursive
    aws s3 cp ./build-qa s3://dashboard-qa-green.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-qa-green.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
else
    echo "neither QAGREEN nor QABLUE are  deployable..."
fi
if [[ ($PRODBLUE == 'inactive') || ($PRODBLUE == 'hold') ]]; then
    echo "deploying to PRODBLUE bucket..."
    aws s3 rm s3://dashboard-blue.fqdn --recursive
    aws s3 cp ./build-dev s3://dashboard-blue.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-blue.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
elif [[ ($PRODGREEN == 'inactive') || ($PRODGREEN == 'hold') ]]; then
    echo "deploying to PRODGREEN bucket..."
    aws s3 rm s3://dashboard-green.fqdn --recursive
    aws s3 cp ./build-dev s3://dashboard-green.fqdn --recursive
    aws s3api put-bucket-tagging --bucket dashboard-green.fqdn --tagging 'TagSet=[{Key=code_state,Value=hold}]'
else
    echo "neither PRODGREEN nor PRODBLUE are  deployable..."
fi

Deploy…

#! /bin/bash

## DEPLOY DEV BLUE
## - set green cloudfront without alias
## - set blue cloudfront with alias
## - set route53 target alias pointing to blue, thus deploy blue

DEVBLUE='Blue Cloudfront ID for Dev'
DEVGREEN='Green Cloudfront ID for Dev'

# ETags
DEVBLUEETAG=`aws cloudfront get-distribution-config --id "${DEVBLUE}" | jq -r '.ETag'`
echo "blue ETag:  ${DEVBLUEETAG}"
DEVGREENETAG=`aws cloudfront get-distribution-config --id "${DEVGREEN}" | jq -r '.ETag'`
echo "green tag:  ${DEVGREENETAG}"

# green remove alias
aws cloudfront update-distribution --id "${DEVGREEN}" --distribution-config file://cicd/aws/cloudfront/dashboard_dev_cloudfront_green_without_alias.json --if-match "${DEVGREENETAG}"

# blue, add in alias
aws cloudfront update-distribution --id "${DEVBLUE}" --distribution-config file://cicd/aws/cloudfront/dashboard_dev_cloudfront_blue_with_alias.json --if-match "${DEVBLUEETAG}"

# point alias to blue resource

# get xone id for catapultsports.info
ZONEID=`aws route53  list-hosted-zones-by-name --dns-name catapultsports.info | jq -r '.HostedZones[0].Id' | awk -F'/' '{ print $3 }'`

# point route53 alias target to blue resource
aws route53 change-resource-record-sets --hosted-zone-id ${ZONEID} --change-batch file://cicd/aws/route53/dashboard_dev_route53_point_target_to_blue.json

The cloudfront json file. This is best gathered by running "aws cloudfront get-distribution-config --id <distribution id>" against the existing distribution.
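A sketch of pulling that config down into the json files the deploy script references – the distribution id is a placeholder, and note that update-distribution wants only the DistributionConfig object, not the full get-distribution-config response:

# get-distribution-config returns { "ETag": ..., "DistributionConfig": {...} };
# strip it to just the DistributionConfig for use with update-distribution
aws cloudfront get-distribution-config --id YOUR_DISTRIBUTION_ID \
  | jq '.DistributionConfig' > cicd/aws/cloudfront/dashboard_dev_cloudfront_blue_with_alias.json

# then hand-edit the Aliases block to make the with_alias / without_alias variants

The saved config looks like: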

{
    "Comment": "",
    "CacheBehaviors": {
        "Quantity": 0
    },
    "IsIPV6Enabled": true,
    "Logging": {
        "Bucket": "",
        "Prefix": "",
        "Enabled": false,
        "IncludeCookies": false
    },
    "WebACLId": "",
    "Origins": {
        "Items": [
            {
                "S3OriginConfig": {
                    "OriginAccessIdentity": ""
                },
                "OriginPath": "",
                "CustomHeaders": {
                    "Quantity": 0
                },
                "Id": "S3-dashboard-dev-blue.fqdn",
                "DomainName": "dashboard-dev-blue.fqdn.s3.amazonaws.com"
            }
        ],
        "Quantity": 1
    },
    "DefaultRootObject": "",
    "PriceClass": "PriceClass_All",
    "Enabled": true,
    "DefaultCacheBehavior": {
        "TrustedSigners": {
            "Enabled": false,
            "Quantity": 0
        },
        "LambdaFunctionAssociations": {
            "Quantity": 0
        },
        "TargetOriginId": "S3-dashboard-dev-blue.fqdn",
        "ViewerProtocolPolicy": "allow-all",
        "ForwardedValues": {
            "Headers": {
                "Quantity": 0
            },
            "Cookies": {
                "Forward": "none"
            },
            "QueryStringCacheKeys": {
                "Quantity": 0
            },
            "QueryString": false
        },
        "MaxTTL": 31536000,
        "SmoothStreaming": false,
        "DefaultTTL": 86400,
        "AllowedMethods": {
            "Items": [
                "HEAD",
                "GET"
            ],
            "CachedMethods": {
                "Items": [
                    "HEAD",
                    "GET"
                ],
                "Quantity": 2
            },
            "Quantity": 2
        },
        "MinTTL": 0,
        "Compress": false
    },
    "CallerReference": "somenumber",
    "ViewerCertificate": {
        "SSLSupportMethod": "sni-only",
        "ACMCertificateArn": "arn:aws:acm:region:certificate-in-aws",
        "MinimumProtocolVersion": "TLSv1.1_2016",
        "Certificate": "arn:aws:acm:region:certificate-in-aws",
        "CertificateSource": "acm"
    },
    "CustomErrorResponses": {
        "Quantity": 0
    },
    "HttpVersion": "http2",
    "Restrictions": {
        "GeoRestriction": {
            "RestrictionType": "none",
            "Quantity": 0
        }
    },
    "Aliases": {
        "Items": [
            "Arecord in route 53"
        ],
        "Quantity": 1
    }
}

and Route 53 json…

{
     "Comment": "Creating Alias resource record  in Route 53",
     "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                            "Name": "your-a-record-name.",
                            "Type": "A",
                            "AliasTarget":{
                                    "HostedZoneId": "zoneid",
                                    "DNSName": "cloudfront.url.aws",
                                    "EvaluateTargetHealth": false
                              }}
                          }]
}

— doug
Reload Rather than Restart Jenkins (Updated) https://dougmunsinger.com/2019/05/03/reload-rather-than-restart-jenkins/ Fri, 03 May 2019 15:19:41 +0000 http://dougmunsinger.com/?p=2240

There is a method in the GUI for Jenkins that tells the Jenkins java process to reload its config from disk. From outside the GUI, where devops and automation live, you can do the same thing through the jenkins-cli.jar. That needed a script to pull together the other pieces required, retrieving a key from vault in order to authenticate.

#! /bin/bash

# script to use the jenkins-cli to reload the config from disk
#/etc/alternatives/java -jar /opt/jenkins-cli.jar -s http://localhost:8080 reload-configuration --username devops --password `cat /home/ec2-user/.ssh/devopsUserActual`


if [[ -f /home/ec2-user/.ssh/devopsUserActual ]]; then
  /etc/alternatives/java -jar /opt/jenkins-cli.jar -s http://localhost:8080 reload-configuration --username user --password `cat /home/ec2-user/.ssh/someUserActual`
else
  # jenkins cli requires the base actual devops password be available to restart jenkins
  if [[ `$VAULT read -field passwd secret/keys/jenkins/someUserActual` ]]; then
    echo "...successflly retrieved password string from vault..."
    DEVOPSACTUAL=`$VAULT read -field passwd secret/keys/jenkins/someUserActual`
    /etc/alternatives/java -jar /opt/jenkins-cli.jar -s http://localhost:8080 reload-configuration --username user --password ${DEVOPSACTUAL}
  else
     echo "ERROR:: Failed to get passwd from vault: secret/keys/jenkins/someUserActual, passwd field"
    echo "Exiting..."
  fi
fi

This replaced ansible calls for “restart”. Instead I use a shell call to this script. Speeded up execution tremendously.

— doug

UPDATE 20190503

I reworked this because – Jenkins.

Jenkins keeps updating (and significantly changing) their security model. LTS (long term support) is basically abandoned: each LTS immediately gets flagged by Jenkins as insecure by the next bug they fix, which demands a re-architecture of the product and disrupts the CRAP out of supporting this <insert adjective here>.

The reload had to be reworked, recently.

#! /bin/bash

#/etc/alternatives/java -jar /opt/jenkins-cli.jar -s http://localhost:8080 reload-configuration --username devops --password `cat /home/ec2-user/.ssh/devopsUserActual`

if [[ -f /home/centos/.ssh/devopsActual ]]; then
  /etc/alternatives/java -jar /opt/jenkins/jenkins-cli.jar -s http://localhost:8080 -auth devops:`cat /home/centos/.ssh/devopsActual` reload-configuration
else
    echo "ERROR:: Failed to get passwd from devopsActual"
    echo "Exiting..."
fi

The significant change is:

/etc/alternatives/java -jar /opt/jenkins/jenkins-cli.jar -s http://localhost:8080 -auth devops:`cat /home/centos/.ssh/devopsActual` reload-configuration

the previous

--username user --password ${DEVOPSACTUAL}

no longer worked, but the -auth construct still does. For now. Sigh.

Really tempted to (1) replace jenkins with a golang server listening for the notifyCommit… or (2) fork Jenkins and isolate it behind a comprehensive auth and firewall and drop the security model from inside jenkins cause it’s seriously crap, guys…

— doug

initPipeline_JenkinsPlugin, Open-Sourced https://dougmunsinger.com/2019/04/23/initpipeline_jenkinsplugin-open-sourced/ Tue, 23 Apr 2019 15:40:24 +0000 http://dougmunsinger.com/?p=2493

I wrote a jenkins plugin while I was idle between Oath and my current gig.

This plugin is a simplification of Oath’s (AOL’s) CICD Discover plugin. Re-written from scratch. Instead of crafting java code in the plugin, I walked it back to its origins, where it sends execution to an external executable.

The impetus to this plugin was laziness. Seriously. I worked alongside “Release Engineers”, back not long ago when that and “QA Engineers” were a thing. A good part of their job was to hand-hold developers and make and configure Jenkins jobs to work with their code. I wanted to avoid that by having Jenkins configure pipeline jobs for itself, with no DevOps (the replacement for “Release Engineering” and “QA Engineering”, at least in part) involvement in manual configuration.

The git-plugin (written in java) had to have the code to achieve that somewhere – the plugin takes in a /git/notifyCommit?url= from a repo server, then checks its build history and determines what to build directly from the repo, then kicks off the build.
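For reference, the notifyCommit that starts all of this is just an HTTP GET, typically fired from a post-receive hook on the repo server – a sketch with placeholder host and repo url:

# post-receive hook on the git server, aimed at the jenkins instance
curl -s "http://jenkins-host:8080/git/notifyCommit?url=ssh://git@repo-server/project.git"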

In GitStatus.java I found two blocks of code, !scmFound and !urlFound, where the plugin dropped the notifyCommit onto the floor because Jenkins did not have the job configured.

The point where git-plugin listens is extendable in Jenkins. So I stole much of the framework of GitStatus.java (actually, under an MIT license, I can’t actually misuse it, really…) and, with IntelliJ IDE assistance in finding the errors in my java syntax (I was removing a LOT of code and it kept breaking the java compile and assemble…), I crafted a POC that simply called an outside bash script to catch the dropped-on-the-floor notifyCommit and do something useful with it. Eventually most of the work was moved into the plugin and written in java.

I broke it back to a simpler version for the flexibility that allows. If you don’t like the exact process I assume in the script I call, write your own.

AOL (for whom the original plugin code was written) agreed to open-source the plugin and made some motions toward it, but as far as I can tell never released it. Oath didn’t even try to do so.

So, as a newly written work, initpipeline gets open-sourced. Under the original MIT license.

initPipeline_JenkinsPlugin.

— doug