Nebula-in-a-Box actually works now.
The latest problems were in SSL certificates, vault, consul, DNS, internal addressing, and tuning consul to run as basically a single master instance (which it is not designed for).
The Nebula-in-a-Box orion instance comes up now with consul fully functioning as service discovery and DNS. It registers and interacts with the local vault.
The last issue came from earlier scripts that register and enable vault on the agents. I did not want to craft a special agent AMI for this; I wanted Nebula-in-a-Box to just use the off-the-rack AMIs we deploy as Jenkins agents in the distributed architecture. The difference is that the original scripts register the instance and set it to communicate with the clustered vault+consul CNAME. The overrides needed to register the instance with vault on the controller itself, use consul service discovery so the SSL certs for vault line up with the naming, then drop the appropriate files into place to continue using vault. I also needed a way to override the recursors directly in /etc/sysconfig/consul, rather than be forced to take them from DHCP.
The first pass for /etc/sysconfig/consul had an ansible template that pulled the name servers out of the AWS DHCP metadata and dropped them in. The override allows setting “orion_recursors:” to one or more IP addresses. The initial standalone role creates the orion_recursors.yml file from a template and drops it into /etc/orion; once that happens, the path for supplying recursors changes.
With that file in place, the orion_recursors values are populated into both the agent and controller /etc/sysconfig/consul files; the original plays that harvested the name servers from DHCP are still present but don’t run.
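For anyone who wants the override logic spelled out, here is a minimal sketch in Python rather than the actual ansible plays. It only illustrates the precedence described above: if orion_recursors is set, use it, otherwise fall back to the DHCP name servers. The function names are hypothetical, and rendering the result as consul's `-recursor` agent flags is an assumption about how the values could end up in /etc/sysconfig/consul, not a copy of the real template.

```python
import ipaddress
from typing import List, Optional, Sequence


def select_recursors(
    override: Optional[Sequence[str]],
    dhcp_nameservers: Sequence[str],
) -> List[str]:
    """Pick the upstream DNS servers consul should recurse to.

    `override` stands in for the orion_recursors value (hypothetical
    parameter name); `dhcp_nameservers` stands in for whatever the
    original plays harvested from DHCP.
    """
    # The override wins whenever it is present and non-empty.
    chosen = list(override) if override else list(dhcp_nameservers)
    # Fail early on anything that is not a valid IP address.
    for ip in chosen:
        ipaddress.ip_address(ip)  # raises ValueError on bad input
    return chosen


def render_recursor_flags(recursors: Sequence[str]) -> str:
    """Render the list as repeated `-recursor` flags for `consul agent`.

    Assumption: the sysconfig file passes options through to the agent;
    the real file layout is not shown in this post.
    """
    return " ".join(f"-recursor={ip}" for ip in recursors)


if __name__ == "__main__":
    # Example addresses are made up for illustration.
    print(render_recursor_flags(select_recursors(["10.0.0.2", "10.0.0.3"], ["172.16.0.2"])))
    print(render_recursor_flags(select_recursors(None, ["172.16.0.2"])))
```

The point of keeping the fallback in place is the same as in the plays: the DHCP path still exists, it just never gets used once the override is supplied.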
Nebula-in-a-Box finally built the example-cicd-directory job (our base sample job) and then proceeded to build itself (the Nebula-in-a-Box pipeline) flawlessly.
I still need to upgrade consul and vault, build docker and VirtualBox images for devs to play with, then create a testing framework and get test coverage in place so this thing is supportable. I also need to strip out the company-specific and AWS-specific stuff and make it more modular and configurable for various environments rather than just Oath’s. And open source it, which we have final approval for.
— doug