Terraform + Ansible = GCP Automation Goodness
At Arctiq we have talked many times about how well Ansible and Terraform complement each other. In this blog I am exploring a hands-on example of how Ansible and Terraform can simplify workflows for sys admins, developers and users within an increasingly complex ecosystem of multi and hybrid cloud infrastructures.
It is no secret that in 2019, organizations are looking to further diversify their infrastructure by not putting all their eggs in one basket - be it private cloud, public cloud or traditional on-premises infrastructure. Part of this drive is the migration towards an application-centric view of the world where infrastructure is purely agnostic - we care neither where or what it is nor how it got there, we just care that it is there. For this to be manageable, organizations need consistency at scale, efficiency at scale and portability, so they can manage infrastructure regardless of the ‘where’.
Enter Infrastructure as Code (IaC). Though IaC is a bit of a tangent from the purpose of this blog, the blog is in itself a demo of IaC. IaC is not a new concept; tools like Ansible, Terraform, Puppet and Chef have all been around for a while now. The core idea in this blog, however, is that looking at each tool's strengths and admitting its weaknesses can lead to really great innovation in simplifying workflows that satisfy modern IT needs.
So let’s take a look.
The use case: use automation to deploy AWX (upstream Ansible Tower) into Google Cloud Platform. This is a fairly basic use case, but I’ll come back to why the choice of software package is not important. I have selected Terraform and Ansible as the automation tools purely due to their popularity and the fact that I work with them almost daily lately.
Options to Consider
Option 1: Manually
This is completely ridiculous in 2019. Clicking through kludgy UIs to configure a VM instance and then logging into the system via SSH is just not a thing any self-respecting sys admin should be willing to do anywhere outside of a sandbox environment. This is not scalable, efficient or portable.
Option 2: Manual Provision + Bash Automation
GCP has the idea of a ‘startup-script-url’ that can be configured as metadata on a VM instance. The idea is you would still use the UI to launch a VM instance and set the metadata accordingly. On boot, the instance pulls down the bash script from the configured URL and runs the rest. This is reasonably consistent from a configuration perspective, but there is room for error on the VM spec, and in no way is this scalable, efficient or portable. Unfortunately, some traditional sys admins may feel a strong temptation to use bash, as it is a go-to for many.
Option 3: Entirely in either Ansible or Terraform
This option has legs and is likely where many would land. Both tools offer consistency at scale, efficiency at scale and are moderately portable. Both have great support for all major cloud providers and on-premises infrastructure. In my opinion, both have nuances that complicate an end-to-end flow. Personally, I find Ansible is a fantastic configuration tool, but the steps needed to get it to work with a cloud platform API are a bit annoying, which makes the result slightly less portable. Terraform, on the other hand, I find extremely quick for configuring a cloud provider and getting instances up and running, but the post-provisioning configuration kind of feels like I should just be logging in and running a bash script.
Option 4: Terraform for provisioning, Ansible for configuration
Ok, if you read the title of this blog, you knew this is where we landed, and I strategically left the best for last. One of the reasons I like this approach is that there is a very clear delineation between provisioning and configuration. This makes your configuration management completely portable: whether it is GCP, AWS, Azure or on-premises, it does not matter. You only need to ensure you have modularized your Terraform appropriately and you have a 100% portable solution. This solution quickly satisfies consistency at scale, efficiency at scale and portability.
An interesting by-product of this approach is a very simplified workflow. When using Terraform modules, your main.tf becomes a very clean, single source of user-defined variables, with the work of provisioning and configuration completely abstracted. Let’s take a look at a sample of how we will configure the workflow to deploy AWX into GCP:
Once main.tf has been configured, this example is initiated entirely through Terraform; there are no other files or scripts a user has to modify or run.
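As a rough illustration of what such a main.tf might look like (this is a sketch, not the file used in the demo), the example below calls a single module and passes it a handful of user-defined values. The module path, every variable name and all of the values are assumptions made for illustration only.

```hcl
# main.tf - hypothetical sketch, not the file from the demo.
# The module path, variable names and values are all assumptions.

provider "google" {
  project = "my-gcp-project" # assumed project ID
  region  = "us-east1"       # assumed region
  zone    = "us-east1-b"     # assumed zone
}

module "awx" {
  # Hypothetical local module that wraps the VM, the firewall rule
  # and the hand-off to Ansible.
  source = "./modules/awx"

  # User-defined inputs; the module would need to declare matching variables.
  instance_name        = "awx01"
  machine_type         = "n1-standard-2"
  image                = "debian-cloud/debian-12"
  ssh_user             = "deploy"
  ssh_private_key_file = "~/.ssh/id_rsa"
}
```

The point of the layout is that a user only ever touches the values inside the module block; everything else lives behind the module boundary.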
$ terraform init
$ terraform plan
Terraform plan tells us what is going to be built. In this case there are 2 resources being provisioned - a firewall rule attached to the default network and a Compute Engine VM named awx01. Much of the output has been redacted in this case, but the output shown is what we care about.
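For context, two resources like the ones in that plan could be declared roughly as follows. This is a minimal sketch using the Google provider's google_compute_firewall and google_compute_instance resources; only the VM name awx01 and the default network come from the demo, while the rule name, ports, tags, machine type, zone and image are assumptions.

```hcl
# Hypothetical sketch of the two planned resources.
# Only "awx01" and the "default" network come from the demo; the rest is assumed.

resource "google_compute_firewall" "awx" {
  name    = "allow-awx-web" # assumed rule name
  network = "default"

  allow {
    protocol = "tcp"
    ports    = ["80", "443"] # assumed ports for the AWX web UI
  }

  # Open to the world for illustration only; restrict this in real use.
  source_ranges = ["0.0.0.0/0"]
  target_tags   = ["awx"]
}

resource "google_compute_instance" "awx01" {
  name         = "awx01"
  machine_type = "n1-standard-2" # assumed size
  zone         = "us-east1-b"    # assumed zone
  tags         = ["awx"]         # matches the firewall rule's target_tags

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12" # assumed image
    }
  }

  network_interface {
    network = "default"

    # An empty access_config block requests an ephemeral external IP.
    access_config {}
  }
}
```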
$ terraform apply
The entire output from terraform apply is too verbose to be of value in this blog, but consider this: a single workflow completed the following tasks in 5 minutes and 35 seconds:
- VM Provisioned
- Base OS Updated and pre-requisite packages installed
- Database provisioned
- Software installed
- Software configured
- Firewall ports opened on the network
From past experience, I know that in an enterprise with a more traditional IT department structure, achieving this type of installation would take weeks, if not months. A couple of important points to note here: no matter how many times this flow is executed, the end result is the same. More importantly, however, the number of systems this applies to is irrelevant. Whether it is 1 server, 10, 100 or 1000, the workflow is going to consistently deliver an end result in approximately 5 and a half minutes.
But Wait! There’s More!
Up to this point we have talked about a simplified workflow and shown the three simple steps one needs to run through for this demo to roll out. One might have noticed there has been little to no mention of Ansible beyond the fact that I feel it is better at configuration management. Let’s take a look at that.
There are a couple of ways this can be handled. In the case of this demo, I used Terraform’s remote-exec provisioner to do a bit of bootstrapping for Ansible + git and then kick off an Ansible playbook.
Depending on your needs, you might consider using the local-exec provisioner instead. Because local-exec runs on the machine running Terraform, Ansible would run from there, removing the requirement of bootstrapping Ansible and git on the remote system.
Thinking back to my previous statements that an organization needs consistency and efficiency at scale as well as portability, let’s see how this solution of pairing Terraform and Ansible holds up:
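A remote-exec hand-off of this kind could look roughly like the sketch below. It is not the configuration from the demo: it attaches the provisioner to a null_resource (from the hashicorp/null provider) so the fragment stands alone, whereas a provisioner can equally sit inside the VM resource itself. The variable names, the package manager commands, the repository URL and the playbook name site.yml are all assumptions.

```hcl
# Hypothetical sketch of bootstrapping Ansible + git over SSH and running a playbook.
# Variable names, repository URL and playbook name are assumptions.

variable "host_ip" {
  type = string # external IP of the provisioned VM
}

variable "ssh_user" {
  type = string
}

variable "ssh_private_key_file" {
  type = string # full path to the private key (file() does not expand "~")
}

resource "null_resource" "bootstrap_awx" {
  # How Terraform reaches the remote machine.
  connection {
    type        = "ssh"
    host        = var.host_ip
    user        = var.ssh_user
    private_key = file(var.ssh_private_key_file)
  }

  # Commands run on the remote machine, in order.
  provisioner "remote-exec" {
    inline = [
      # Assumes a Debian-family image; swap for yum/dnf elsewhere.
      "sudo apt-get update -y",
      "sudo apt-get install -y ansible git",
      # Placeholder repository URL.
      "git clone https://example.com/your-org/awx-playbooks.git /tmp/awx-playbooks",
      # Run the playbook against the machine itself.
      "cd /tmp/awx-playbooks && ansible-playbook -i localhost, -c local site.yml",
    ]
  }
}
```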
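For comparison, a local-exec variant might be sketched as follows. Again this is an assumption-laden illustration rather than anything from the demo: it presumes Ansible is already installed on the machine running Terraform, that a playbook named site.yml sits in the working directory, and that the variable names shown are declared this way. Note that SSH on a freshly created VM may not be reachable immediately, so a real setup would likely need some form of wait or retry before this step.

```hcl
# Hypothetical sketch of running Ansible from the machine that runs Terraform.
# Variable names and the playbook name are assumptions.

variable "host_ip" {
  type = string # external IP of the provisioned VM
}

variable "ssh_user" {
  type = string
}

variable "ssh_private_key_file" {
  type = string
}

resource "null_resource" "configure_awx" {
  provisioner "local-exec" {
    # The trailing comma makes Ansible treat the value as an inline host list.
    command = "ansible-playbook -i '${var.host_ip},' -u ${var.ssh_user} --private-key ${var.ssh_private_key_file} site.yml"

    # Skip the interactive host key prompt for a brand-new host (illustration only).
    environment = {
      ANSIBLE_HOST_KEY_CHECKING = "False"
    }
  }
}
```

The trade-off is that the remote system stays clean, at the cost of requiring Ansible and network access to the VM from wherever Terraform runs.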
1) Is it efficient? 1, 10, 100 or 1000 servers with installed software in ~5m30 - Yes
2) Is it consistent? The only changeable pieces are project specific, the actual build remains the same - Yes
3) Is it portable? Some rework is needed on the Terraform side to ensure you are using the correct providers. This demo was specific to GCP; however, Terraform has providers for many cloud platforms and on-prem technologies.
Check out the demo in action on the Arctiq Team Youtube channel.