Writing better and more re-usable code in Terraform

Over the years, colleagues and customers have asked me on numerous occasions whether I could help them with a bit of QA or some tips on Terraform code they have written. These questions typically come from people who have recently started using infrastructure as code, gone through the tutorials, deployed their first workloads, and are now hitting their first problems: the code is becoming harder to maintain, and automation isn't giving them the increased productivity and quality of life the marketing promised.

I usually end up re-using many of the same tips I have given to others earlier, so I thought I might as well make a post out of it.

Follow established conventions and good practices

Every language and technology has its own conventions and good practices, which should be followed unless you have good reasons to deviate from them. We seldom work alone in a vacuum where the outside world does not affect us. When we need to scale up with more people, invite outside collaborators, or hand over our code to others, it is significantly easier if our repositories feel familiar and are easy to navigate.

I usually recommend reading the following guides (even though the first two are written by teams at Google and AWS respectively, the content is just as valid if you're working with other vendors):

Start thinking in modules

The difference between solving a problem once and achieving efficiency at scale lies in refactoring our code into re-usable modules that solve generalized problems. I like to start with modules that are close to a naive solution, steering clear of the biggest lock-ins but also of premature optimization that might never be necessary. Getting good at this comes with experience, and you will probably paint yourself into a corner many times before you feel comfortable.

I always recommend following the standard module structure and reading the module development documentation, especially the "When to write a module" section. Finding the right size, where a module contains a logical group of resources without becoming too complex, unwieldy and difficult to understand, is more art than engineering.
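As a rough illustration of a small module in the standard layout, here is a minimal sketch. The AWS provider, the S3 bucket use case, and every name in it (the module path, variable names, output name) are assumptions made for the example, not something prescribed by the guides. The comments mark which file each part would normally live in.

```hcl
# versions.tf: declare which providers the module needs
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# variables.tf: every input has a type and a description
variable "name" {
  description = "Name of the bucket. Must be globally unique."
  type        = string
}

variable "tags" {
  description = "Tags applied to the resources in this module."
  type        = map(string)
  default     = {}
}

# main.tf: one logical group of resources
resource "aws_s3_bucket" "this" {
  bucket = var.name
  tags   = var.tags
}

resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id

  versioning_configuration {
    status = "Enabled"
  }
}

# outputs.tf: expose only what callers are likely to need
output "bucket_arn" {
  description = "ARN of the bucket."
  value       = aws_s3_bucket.this.arn
}
```

A caller would then only deal with the inputs and outputs (the path and values below are hypothetical):

```hcl
module "logs_bucket" {
  source = "./modules/bucket" # hypothetical local path

  name = "example-logs-bucket"
  tags = { environment = "dev" }
}
```

The module deliberately stays close to the naive solution: two inputs, one output, and no options that nobody has asked for yet.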

Use tools to automate boring tasks

Writing code also includes a lot of boring tasks. I don't like boring tasks. They do not spark joy and I want to spend my time solving problems and creating something of value.

The good thing is that we have a lot of tools to automate this. My favorites among them are:

terraform-docs - Automatic generation of documentation that can be inserted into the README file. An important caveat is that while this gives us a great overview of dependencies, created resources, input variables and module outputs, it does not explain "why" we have solved a problem a certain way. The more high-level documentation on decisions should still be captured in pull requests, discussions, wikis etc.
checkov or Trivy - Run code analysis to find misconfigurations and security vulnerabilities.
tflint - Better linting with plugins for major cloud providers and the possibility to extend with rule sets. There are a lot of errors that terraform validate or terraform plan are not able to detect before you run terraform apply. Shifting left and detecting these errors with a linter is easier than fixing a broken state.
infracost - Creates cost estimates for changes. Again, if we can detect our misconfigurations or galloping cloud bills early, we save ourselves trouble later.
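To make the tflint point a bit more concrete, here is a minimal sketch of a .tflint.hcl file. It assumes an AWS code base; the ruleset version shown is a placeholder and should be replaced with whatever release is current when you set this up.

```hcl
# .tflint.hcl

# Rules for the Terraform language itself (bundled with tflint)
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# Provider-specific rules; the version here is an assumed placeholder
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}
```

After running tflint --init to download the plugins, the kind of mistake this is meant to catch looks like the following. The AMI id is a made-up placeholder, and the instance type is a string that is syntactically fine but does not correspond to a real instance type:

```hcl
resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0" # placeholder value
  instance_type = "t1.2xlarge"            # not a valid instance type
}
```

Since the value is just a string as far as the language is concerned, this is the sort of error that tends to surface only at apply time unless a provider-aware linter flags it first.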

All of these tools can (and should) be set up to run automatically with the help of CI/CD tools in Azure DevOps, GitHub, GitLab etc. to make sure that our code is never merged to our main branch without passing checks.

I also like to use pre-commit-terraform on my local machine to give me a shorter feedback loop than having to wait for CI/CD and possibly switch context between my IDE and a browser to verify that my code is ok. The downside is that I put more burden on co-workers to have dependencies installed and maintained on their machines. Which leads us to the final tip of this post:

Make it easy for others to contribute

To avoid having to maintain documentation on dependencies for developers, I like adding configuration for GitHub Codespaces in our repositories. This makes it very easy for others to launch a development container with everything pre-configured instead of asking people to follow an install guide to get started.

To further improve on this, we also have an internal template repository for creating a Terraform module. If you want to create a new module, you simply specify the template when you create a new repository, and as if by magic your repo is set up with the standard module structure, workflows for CI/CD, the Codespaces configuration etc. It is all about creating a happy path where it is easier to do things right than to figure out your own good practices!