Vault and Self-signed Certs

In a distributed AWS cloud environment, SSL certs live on the ELB instances and are signed by known CAs. Backend encryption using self-signed certs is seamless there.

As soon as Jenkins, consul+vault and the nebula utilities API are brought together onto the same box (Nebula-in-a-Box) and moved to consul service discovery, SSL naming and self-signed certs have to be untangled. The certificate signing itself has to be kept on the controller instance, and the cert propagated out to the agent instances.

On Nebula-in-a-Box, /etc/consul.conf has a couple of keys: one that encrypts internal communications (encrypt) and a second, a token, that determines who is allowed to talk to consul (acl_master_token). On top of that, vault can use SSL for any communications to and from itself. I have ansible plays that generate the token value for access and the encryption key for internal communication at boot of the instance, and additional scripting that aligns the agent instances with these values on the fly when they are brought up.
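A minimal sketch of generating those two values, written in Python rather than as the actual ansible plays (which are not shown here). The function names are hypothetical. The 32-byte key size is an assumption taken from the length of the encrypt value in the config below (44 base64 characters); check what your consul version expects. Consul also ships a keygen subcommand that does the same job for the encryption key.

```python
import base64
import secrets
import uuid


def make_gossip_key(num_bytes: int = 32) -> str:
    """Return a random base64-encoded key suitable for the "encrypt" setting.

    Assumption: 32 random bytes, matching the length of the key in the
    sample config. Older consul releases expected a shorter key.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def make_master_token() -> str:
    """Return a random UUID string in the same shape as "acl_master_token"."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print("encrypt:", make_gossip_key())
    print("acl_master_token:", make_master_token())
```

Both values are random on every boot, which is why the agents have to be aligned with the controller's values when they come up.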

I set the datacenter and acl_datacenter strings to “orion” in /etc/consul.conf.


[root@ip-10-112-4-3 ec2-user]# cat /etc/consul.conf
{
  "ui": true,
  "disable_remote_exec": true,
  "domain": "consul.",
  "data_dir": "/opt/consul/data",
  "log_level": "INFO",
  "server": true,
  "client_addr": "0.0.0.0",
  "bind_addr": "0.0.0.0",
  "datacenter": "orion",
  "encrypt": "Unq5SwSM1pQtg3o+SWhaQAyZrXNIjaad88/pxjaadBo=",
  "rejoin_after_leave": true,
  "leave_on_terminate": true,
  "acl_datacenter": "orion",
  "acl_default_policy": "allow",
  "acl_down_policy": "allow",
  "acl_master_token": "95f67472-4ad9-0000-b627-a826cb66f568",
  "ports": { "dns": 53, "https": 8543 },
  "cert_file": "/etc/pki/tls/certs/consul_cert.pem",
  "key_file": "/etc/pki/tls/private/consul_key.pem"
}

This directs consul service discovery to construct DNS names for the services registered with it in the form <service>.service.orion.consul, for example:


[root@ip-10-113-0-87 ec2-user]# nslookup vault
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:	vault.service.orion.consul
Address: 10.113.0.87

[root@ip-10-113-0-87 ec2-user]#
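The name in that lookup follows consul's DNS pattern of service name, the literal label "service", the datacenter, and the domain. A small sketch of how the name falls out of the config values; the helper name is hypothetical and only the two relevant keys are included:

```python
import json


def service_fqdn(service: str, config: dict) -> str:
    """Build the DNS name consul answers for a registered service."""
    datacenter = config["datacenter"]
    # The config writes the domain with a trailing dot ("consul.").
    domain = config.get("domain", "consul.").rstrip(".")
    return f"{service}.service.{datacenter}.{domain}"


# Only the keys that affect naming, copied from /etc/consul.conf above.
CONFIG = json.loads('{"datacenter": "orion", "domain": "consul."}')

if __name__ == "__main__":
    print(service_fqdn("vault", CONFIG))   # prints vault.service.orion.consul
    print(service_fqdn("consul", CONFIG))  # prints consul.service.orion.consul
```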

Next is the /etc/pki/tls/openssl.cnf file. This is templated in ansible to capture the instance IP address and hostname, and also to add *.service.orion.consul as a wildcard name for the cert.


[ alt_names ]
IP.1 = {{ ansible_ec2_local_ipv4 }}
IP.2 = 127.0.0.1
DNS.1 = {{ ansible_ec2_hostname }}
DNS.2 = localhost
DNS.3 = *.service.orion.consul

[ v3_ca ]
subjectAltName = @alt_names
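To see why the single wildcard entry is enough, here is a simplified sketch of the hostname matching rule TLS clients apply: a leading "*" stands in for exactly one label. This is an illustration with a hypothetical helper name, not the full matching logic a real TLS library implements.

```python
def san_matches(hostname: str, pattern: str) -> bool:
    """Return True if hostname is covered by a DNS subject alternative name.

    Simplified rule: labels are compared case-insensitively, and a "*" in
    the left-most position matches exactly one non-empty label.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pat_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pat_labels):
        return False
    if pat_labels[0] == "*":
        return bool(host_labels[0]) and host_labels[1:] == pat_labels[1:]
    return host_labels == pat_labels


if __name__ == "__main__":
    wildcard = "*.service.orion.consul"
    for name in ("vault.service.orion.consul",
                 "consul.service.orion.consul",
                 "tag.vault.service.orion.consul"):
        print(name, san_matches(name, wildcard))
```

The first two names match and the third does not, so a tag-qualified service name would need its own entry in alt_names.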

Then at boot the cert itself is generated. The cert is copied to the /var/lib/jenkins/copy_to_slave directory so it can be dropped onto new agent instances. There it is deployed by a check_slave.sh script, which reconfigures the agent to work with the Nebula-in-a-Box consul service discovery, consul cluster and DNS. This means the builds’ Jenkinsfiles can now refer to the consul service URLs, e.g., consul.service.orion.consul and vault.service.orion.consul, within the Nebula-in-a-Box builds.
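The generation command itself is not shown above, so the following is only a sketch of what such a boot step could look like. It builds an openssl req invocation that reads the templated openssl.cnf and applies the v3_ca section. The key size, validity period and subject name are assumptions; the output paths are the cert_file and key_file values from /etc/consul.conf.

```python
import shlex


def openssl_selfsigned_cmd(cert_path: str, key_path: str, common_name: str,
                           config_path: str = "/etc/pki/tls/openssl.cnf",
                           days: int = 365) -> list:
    """Build an openssl command that creates a key and a self-signed cert."""
    return [
        "openssl", "req", "-x509", "-nodes",   # self-signed, unencrypted key
        "-newkey", "rsa:2048",                  # assumption: 2048-bit RSA
        "-days", str(days),                     # assumption: one year
        "-config", config_path,                 # the ansible-templated file
        "-extensions", "v3_ca",                 # section holding subjectAltName
        "-subj", f"/CN={common_name}",
        "-keyout", key_path,
        "-out", cert_path,
    ]


if __name__ == "__main__":
    cmd = openssl_selfsigned_cmd(
        cert_path="/etc/pki/tls/certs/consul_cert.pem",
        key_path="/etc/pki/tls/private/consul_key.pem",
        common_name="nebula-in-a-box",  # hypothetical subject name
    )
    # Printed rather than executed; pass cmd to subprocess.run to run it.
    print(shlex.join(cmd))
```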

When the self-signed cert is generated, it signs itself. From that point it is known only on the controller, and then on the agents as they are brought up and the cert is deployed to them. Vault needs to be told that the cert has signed itself. At the same time we can now point the $VAULT_ADDR environment variable at vault.service.orion.consul.

On distributed boxes there is a script, /etc/profile.d/vault.sh, that establishes environment variables for using vault. The Nebula-in-a-Box builds replace that file, setting VAULT_ADDR to “https://vault.service.orion.consul:8200” and VAULT_CACERT to the path of the consul cert that was generated and used to sign itself.
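A sketch of rendering that replacement file, assuming it needs nothing beyond the two exports. The function name is hypothetical, and the cert path in the usage example is a placeholder rather than the real location.

```python
import shlex


def render_vault_profile(vault_addr: str, cacert_path: str) -> str:
    """Return the text of a profile.d script exporting the vault settings."""
    lines = [
        "# Vault client settings for Nebula-in-a-Box",
        f"export VAULT_ADDR={shlex.quote(vault_addr)}",
        f"export VAULT_CACERT={shlex.quote(cacert_path)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_vault_profile(
        "https://vault.service.orion.consul:8200",
        "/path/to/self-signed-cert.pem",  # placeholder path
    ), end="")
```

With VAULT_CACERT set, the vault client verifies the server against that one cert instead of the system CA bundle, which is what lets the self-signed cert be trusted.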

Now the consul service discovery domain name lines up with the generated self-signed certificate, which lines up with the URL vault will be addressed at, and vault has been made aware that the cert has signed itself. This makes consul service discovery useful, finally. It may also be something we move into the distributed architecture, because it makes service addressing and registration thoroughly fluid.

— doug