Author: Marc LeBlanc


Terraform + Ansible = GCP Automation Goodness

At Arctiq we have talked many times about how well Ansible and Terraform complement each other. In this blog I explore a hands-on example of how Ansible and Terraform can simplify workflows for sysadmins, developers, and users within an increasingly complex ecosystem of multi-cloud and hybrid cloud infrastructures.

It is no secret that in 2019, organizations are looking to further diversify their infrastructure by not staking everything on one basket - be it private cloud, public cloud, or traditional on-premises infrastructure. Part of this drive is the migration towards an application-centric view of the world where infrastructure is purely agnostic - we care neither where or what it is nor how it got there, we just care that it is there. For this to be manageable, organizations need consistency at scale, efficiency at scale, and portability, so they can manage regardless of the 'where'.

Enter Infrastructure as Code (IaC). Though a bit of a tangent from the purpose of this blog, the blog is itself a demo of IaC. It is not a new concept - tools like Ansible, Terraform, Puppet, and Chef have all been around for a while now. The core idea in this blog, however, is that looking at a tool's strengths and admitting its weaknesses can lead to real innovation in simplifying workflows that satisfy modern IT needs.

So let’s take a look.

The Scenario

Use automation to deploy AWX (the upstream project for Ansible Tower) into Google Cloud Platform. This is a fairly basic use case, but I'll come back to why the specific software package we selected is not important. I selected Terraform and Ansible as the automation tools purely due to their popularity and the fact that I work with them almost daily.

Options to Consider

Option 1: Manually

This is completely ridiculous in 2019. The thought of clicking through clunky UIs to configure a VM instance and then logging into the system via SSH is just not something any self-respecting sysadmin should be willing to do anywhere outside of a sandbox environment. This is not scalable, efficient, or portable.

Option 2: Manual Provision + Bash Automation

GCP has the idea of a 'startup-script-url' that can be configured as metadata on a VM instance. You would still use the UI to launch the instance and set the metadata accordingly; on boot, the instance pulls down the bash script from a repo and runs the rest. This is reasonably consistent from a configuration perspective, but there is still room for error in the VM spec, and it is in no way scalable, efficient, or portable. Unfortunately, some traditional sysadmins may feel a strong temptation to use bash, as it is a go-to for many.
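For illustration, the same metadata can also be set from the CLI rather than the UI. A minimal sketch (the bucket path and script name are hypothetical, not part of this demo):

```shell
# Launch an instance that fetches and runs a startup script on first boot.
# gs://my-bucket/install-awx.sh is a placeholder for your own script location.
gcloud compute instances create awx01 \
    --zone=northamerica-northeast1-a \
    --image-family=centos-7 \
    --image-project=centos-cloud \
    --metadata=startup-script-url=gs://my-bucket/install-awx.sh
```

It trades clicks for a command, but the fundamental limitations of Option 2 remain.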

Option 3: Entirely in either Ansible or Terraform

This option has legs and is likely where many would land. Both tools offer consistency at scale and efficiency at scale, and both are moderately portable. Both have great support for all major cloud providers and for on-premises infrastructure. In my opinion, though, both have nuances that complicate an end-to-end flow. Personally, I find Ansible is a fantastic configuration tool, but the steps needed to get it working with a cloud platform API are a bit annoying, and the rework involved makes it slightly less portable. Terraform, on the other hand, I find extremely quick for configuring a cloud provider and getting instances up and running, but the post-provisioning configuration feels like I should just be logging in and running a bash script.

Option 4: Terraform for provisioning, Ansible for configuration

OK, if you read the title of this blog, you knew this is where we landed - I strategically left the best for last. One of the reasons I like this approach is that there is a very clear delineation between provisioning and configuration. This makes your configuration management completely portable: whether it is GCP, AWS, Azure, or on-premises, it does not matter. You only need to ensure you have modularized your Terraform appropriately and you have a 100% portable solution. This approach satisfies consistency at scale, efficiency at scale, and portability.

Simplified Workflow

An interesting by-product of this approach is a very simplified workflow. When using Terraform modules, your main.tf becomes a very clean, single source of user-defined variables, with the work of provisioning and configuration completely abstracted. Let's take a look at a sample of how we will configure the workflow to deploy AWX into GCP:

main.tf

module "gcp" {
  source            = "./modules/gcp"
  awx_admin         = "admin"
  awx_admin_pass    = "supersecretpassword"
  gcp_json          = "~/projects/awx-ansible-setup/secrets/mleblanc-ce-prov.json"
  gcp_project_id    = "solar-cab-231515"
  gcp_region        = "northamerica-northeast1"
  gcp_zone          = "a"
  gcp_instance_name = "awx01"
  gcp_instance_os   = "centos-cloud/centos-7"
  ssh_key_path      = "~/.ssh/"
  ssh_key_pub       = "gcp_rsa.pub"
  ssh_key_priv      = "gcp_rsa"
  ssh_user          = "mleblanc"
}

  # source             = Module path. Do not change
  # awx_admin          = AWX admin username
  # awx_admin_pass     = AWX admin password
  # gcp_json           = GCP Service Account Key (path + filename)
  # gcp_project_id     = GCP Project ID
  # gcp_region         = GCP region for instances, e.g. northamerica-northeast1 https://cloud.google.com/compute/docs/regions-zones/
  # gcp_zone           = GCP zone within the region, i.e. a, b, c
  # gcp_instance_name  = The instance name as it will appear in GCP
  # gcp_instance_os    = The OS image to use - public images https://cloud.google.com/compute/docs/images#os-compute-support
  # ssh_key_path       = Path to the SSH key pair to use
  # ssh_key_pub        = Public key filename to be provisioned to the instance
  # ssh_key_priv       = Private key filename
  # ssh_user           = Username for SSH
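Inside the module, each of these inputs corresponds to a variable declaration. As a rough, abbreviated sketch of what the module's variables.tf might contain (the real module declares every input shown above):

```hcl
# modules/gcp/variables.tf (abbreviated sketch, not the full file)

variable "gcp_project_id" {
  description = "GCP Project ID to deploy into"
}

variable "gcp_region" {
  description = "GCP region for instances"
  default     = "northamerica-northeast1"
}

variable "gcp_instance_name" {
  description = "Instance name as it will appear in GCP"
}
```

Sensible defaults in the module keep main.tf down to only the values a user actually needs to change.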

Once main.tf has been configured, this example is driven entirely through Terraform; there are no other files or scripts a user has to modify or run.

$ terraform init

        Initializing modules...
        - module.gcp
          Getting source "./modules/gcp"
        
        Initializing provider plugins...
        - Checking for available provider plugins on https://releases.hashicorp.com...
        - Downloading plugin for provider "google" (2.1.0)...
        
        The following providers do not have any version constraints in configuration,
        so the latest version was installed.
        
        To prevent automatic upgrades to new major versions that may contain breaking
        changes, it is recommended to add version = "..." constraints to the
        corresponding provider blocks in configuration, with the constraint strings
        suggested below.
        
        * provider.google: version = "~> 2.1"
        
        Terraform has been successfully initialized!
        
        You may now begin working with Terraform. Try running "terraform plan" to see
        any changes that are required for your infrastructure. All Terraform commands
        should now work.
        
        If you ever set or change modules or backend configuration for Terraform,
        rerun this command to reinitialize your working directory. If you forget, other
        commands will detect it and remind you to do so if necessary.

$ terraform plan

        Refreshing Terraform state in-memory prior to plan...
        The refreshed state will be used to calculate this plan, but will not be
        persisted to local or remote state storage.
        
        
        ------------------------------------------------------------------------
        
        An execution plan has been generated and is shown below.
        Resource actions are indicated with the following symbols:
          + create
        
        Terraform will perform the following actions:
        
          + module.gcp.google_compute_firewall.default
          + module.gcp.google_compute_instance.awx01
        
        
        Plan: 2 to add, 0 to change, 0 to destroy.
        
        ------------------------------------------------------------------------

        
        Note: You didn't specify an "-out" parameter to save this plan, so Terraform
        can't guarantee that exactly these actions will be performed if
        "terraform apply" is subsequently run.

Terraform plan tells us what is going to be built. In this case there are two resources being provisioned: a firewall rule attached to the default network and a Compute Engine VM named awx01. Much of the output has been redacted here, but what is shown is what we care about.
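As the note in the plan output suggests, in anything beyond a demo you would typically save the plan to a file and apply exactly that plan, guaranteeing the actions you reviewed are the actions performed:

```shell
# Save the execution plan, then apply exactly that saved plan.
terraform plan -out=tfplan
terraform apply tfplan
```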

$ terraform apply

        module.gcp.google_compute_instance.awx01: Creation complete after 5m35s (ID: awx01)
        
        Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

The entire output from terraform apply is too verbose to be of value in this blog, but consider this: a single workflow completed the following tasks in 5 minutes and 35 seconds:

  • VM Provisioned
  • Base OS Updated and pre-requisite packages installed
  • Database provisioned
  • Software installed
  • Software configured
  • Firewall ports opened on the network

In another life, I worked in an enterprise with a more traditional IT department structure, and I know achieving this type of installation there would take weeks, if not months. A couple of important points here: no matter how many times this flow is executed, the end result is the same. More importantly, the number of systems it applies to is irrelevant. Whether it is 1 server, 10, 100, or 1000, the workflow will consistently deliver the end result in approximately five and a half minutes.

But Wait! There’s More!

Up to this point we have talked about a simplified workflow and shown the three simple steps one needs to run for this demo to roll out. You might have noticed there has been little to no mention of Ansible beyond the fact that I feel it is better at configuration management. Let's take a look at that.

There are a couple of ways this can be handled. In this demo, I used Terraform's remote-exec provisioner to do a bit of bootstrapping for Ansible and Git, and then to kick off an Ansible playbook.

    provisioner "remote-exec" {
      connection {
        type        = "ssh"
        user        = "${var.ssh_user}"
        timeout     = "500s"
        private_key = "${file("${var.ssh_key_path}${var.ssh_key_priv}")}"
      }

      inline = [
        "sudo yum -y install git ansible",
        "sudo ansible-playbook install-awx.yml"
      ]
    }

Depending on your needs, you might consider using the local-exec provisioner instead. This would remove the requirement of bootstrapping Ansible and Git on the remote system.
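A sketch of that alternative, assuming Ansible is installed on the machine running Terraform and that the provisioner lives inside this demo's google_compute_instance resource (the attribute path reflects that resource's public IP):

```hcl
# Run the playbook from the Terraform host against the new instance,
# instead of installing Ansible on the instance itself.
provisioner "local-exec" {
  command = "ansible-playbook -i '${self.network_interface.0.access_config.0.nat_ip},' -u ${var.ssh_user} --private-key ${var.ssh_key_path}${var.ssh_key_priv} install-awx.yml"
}
```

The trailing comma after the IP tells ansible-playbook to treat the value as an inline inventory rather than an inventory file.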

Wrapping Up

Thinking back to my earlier statement that an organization needs consistency and efficiency at scale as well as portability, let's see how this pairing of Terraform and Ansible holds up:

1) Is it efficient? 1, 10, 100, or 1000 servers with installed software in ~5m35s - Yes

2) Is it consistent? The only changeable pieces are project specific; the actual build remains the same - Yes

3) Is it portable? Some rework is needed on the Terraform side to ensure you are using the correct providers. This demo was specific to GCP, but Terraform has providers for many cloud platforms and on-prem technologies - Yes, with minor rework

Check out the demo in action on the Arctiq Team Youtube channel.
