Scaling Terraform(IaC) across the team

HashiCorp Terraform is a popular tool for managing your cloud infrastructure as code (IaC) in a cloud-agnostic way (same tool for various cloud platforms). Instead of unifying all capabilities for different cloud platforms, the core concepts exposed to the end-user via Terraform provider concept. Terraform offers providers for all major cloud vendors and other cloud services and technologies as well, e.g. Kubernetes.  

This blog post doesn’t aim to be an introduction into Terraform concepts (official documentation is quite ok) but instead sharing an experience with using a Terraform in a distributed team, tools that come handy and all things that make life easier. Even though HashiCorp offers Terraform Enterprise this option is used quite rarely at least on “small/er” projects so we won’t be discussing this option here. I openly admit that I have zero experience with this service so I cannot objectively compare. I will solely focus on using the open-sourced part of the project and Terraform version 0.12.x and higher. 

Terraform maintains the sate of the infrastructure that manages in the state file. Format of Terraform state file is version dependant without strict rules on version compatibility (at least I wasn’t able to find one that was reliably followed and guaranteed). Managing state file poses two main challenges:
1) manage/share state across the team
2) control/align the version used across the team

The first aspect, different teams solve differently. Some commit state file alongside the configuration to version control system which is far from ideal as there might be multiple copies of such resource across the team and requires some team coordination. On top of that, state file contains sensitive information which is impossible to mask and such doesn’t belong to the source control system. A lot better approach is using Terraform remote backend, which allows true concurrent approach. Capabilities depend on the concrete implementation used. The backend can be changed from local to remote easily as is migrated automatically. The only limitation is that merging and splitting state file is allowed only for the locally managed state. 

Managing Terraform version management is centred around providing frictionless version upgrades for different Terraform configurations that align across the team with assuring that state file won’t get upgraded accidentally. To make sure that your state file won’t get upgraded accidentally put version restriction to every configuration managed e.g.

terraform {
  required_version = "0.12.20"
}

To align the team on a uniform Terraform version for every single configuration managed use tool for Terraform version management, e.g. tfenv. Put the desired version of to .terraform-version file located in the folder together with configuration. Tfenv automatically switches to appropriate version as needed, when new version encountered you need to run tfenv install to download a new version. If you want to check version available:

$ tfenv list
* 0.12.20 (set by /Users/jakub/test/terraform/.terraform-version)
  0.12.19
  0.12.18

As the number of resources or organisation grows so does the state file. Which leads to increased time for configuration synchronisation and competing for a lock on a remote state file. To increase the throughput and allow team DevOps mode(clear ownership of the solution from end to end), you might want to divide the infrastructure and associated state files into smaller chunks with clear boundaries. To keep your configuration DRY hierarchical configuration tools like Terragrunt comes to rescue and reduce repetition. 

A growing number of users poses challenges as well as benefits which are the same as on application code written in, e.g. java or any other programming language. What is the real motivation for Infrastructure as a Code (IaC). How to setup standards and best practices on the project? Terraform offers a bunch of tools embedded. To make sure that code is properly formatted according to standards use fmt utility, which is pluggable to your CI/CD pipeline or pre-commit hooks.

terraform fmt --recursive --check --diff

For your re-usable Terraform modules it is good to make sure they are valid though it doesn’t catch all the bugs as it doesn’t checks against cloud APIs, so it doesn’t replace integration tests.

terraform validate

Getting an idea what will change, diff of your current infrastructure against proposed changes can be easily achieved via the generated plan

terraform plan

Enforcing security and standards are a lot easier on IaC as you can use tools like tflint or checkov which allows writing custom policies. We conclude the tool section with awesome Terraform tools which provide a great source if you are looking for something specific.

In this blog post, we just scratched the surface of Terraform tooling and completely skipped design and testing, which are topics for separate posts. What are your favourite tools? What did you find really handy? Leave a comment, share your tips.