enjoys technical writing, and a cheeky drink 🥃

Please Test Your IaC Code

I'd begin with a story.

Recently, we migrated parts of our cloud workloads from Google Cloud Platform to another provider.

This meant we had to re-provision compute instances (VMs) that matches the original Google Compute Engine (GCE) resources' allocated CPU and memory.

In the Python script used, there is a simple dictionary translating the GCE machine type to its equivalent CPU and memory. As an example:

mapping = {
  "n1-standard-2": {
    "cpu": 2,
    "memGB": 7.5
  },
  ...
}

The subsequent code to provision the VMs did this:

...
req = mapping[machine_type]
cpu = int(req["cpu"])
memory = int(req["memGB"])

provision_vm(cpu, memory)
...

Have you noticed something amiss here?
Let's take a minute or so.

You see, the VM ended up under-provisioned (7 GB).

int(7.5) == 7

Thankfully, we caught this issue early within our own test environment.

Ultimately, scripts, Kubernetes manifests, Terraform code are still software code. Here is a gentle reminder that we can and should test or validate the soundness of our IaC code.

buy Kelvin a cup of coffee ☕