A long while back, a16z’s Marc Andreessen declared that “software is eating the world”. Nowadays, it feels more like GPT will be the one feasting, but until then, there’s a particular corner that software has been successfully dining on: infrastructure. Better known as Infrastructure as Code, or IaC, the programmatic infrastructure space is rife with tools. IaC has dramatically automated some core DevOps workflows and gives developers the ability to personalize flows.
IaC might sound like a simple problem with a straightforward solution: code that operates infrastructure. However, IaC is a complex space. It has many solutions with approaches that differ in interesting ways. Players operate under two very different paradigms: declarative and imperative programming. HashiCorp’s Terraform and AWS’s CloudFormation are great declarative examples; Ansible, conversely, is the household imperative IaC name.
Today, we don’t want to argue whether you should use Terraform, Ansible, CloudFormation, or any other specific player. Rather, we want to focus on the underlying philosophical differences between the platforms, particularly between Terraform and Ansible. In fact, we’ll uncover that Ansible and Terraform are so different that they actually work quite well together.
The reason for focusing on Terraform and Ansible is that they initially popularized IaC and each champions a different paradigm. Make no mistake, however; they are hardly the only competitive offerings winning market share. Many developers today use Pulumi (which is a language-agnostic framework) or AWS CDK (a similar product by Amazon) instead of Terraform; however, comparing Terraform and Ansible can help us understand the broader space.
My favorite analogy for comparing declarative and imperative (or procedural) programming involves buying and furnishing a house (although I have never done either).
Declarative programming is similar to an architect’s blueprint. A blueprint states what needs to be done but does not specify when or in what order. It isn’t a recipe, just a specification. The actual implementation is left to someone else.
That’s exactly what declarative programming is. Developers state an infrastructure specification but don’t get to define how that infrastructure will be generated. The specific solution (e.g. Terraform) will decide the rest, reconciling an infrastructure’s existing state with the declared state.
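To make the reconciliation idea concrete, here is a minimal Python sketch of a planning step that compares an existing state with a declared state and works out what would need to change. It is an illustration only, not how Terraform is implemented; the `plan` function and the resource names (`web`, `cache`, `old_db`) are hypothetical.

```python
def plan(current: dict, desired: dict) -> list:
    """Return the actions needed to move `current` toward `desired`."""
    actions = []
    # Anything declared but missing is created; anything that differs is updated.
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != desired[name]:
            actions.append(("update", name))
    # Anything that exists but is no longer declared is deleted.
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical existing and declared states.
current = {"web": {"size": "small"}, "old_db": {"size": "large"}}
desired = {"web": {"size": "medium"}, "cache": {"size": "small"}}

print(plan(current, desired))
# [('create', 'cache'), ('update', 'web'), ('delete', 'old_db')]
```

The developer only writes `desired`; the list of steps is derived rather than authored, which is the essence of the blueprint analogy.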
Imperative programming is akin to furniture assembly instructions. Assembly instructions define exactly what needs to be done, step by step. They specify which piece fits with which and the order in which the steps should be done, and they are repeatable.
Imperative programming is no different. The programmer must specify each step to accomplish the stated goal. In the realm of IaC, that means provisioning infrastructure, usually by calling a cloud provider's APIs.
Before reconciling Terraform and Ansible’s design decisions with their programmatic paradigms (and purposes), let’s walk through the basics of each.
Terraform is a declarative IaC product developed and maintained by HashiCorp. Terraform's configuration language is based on a more general language called HashiCorp Configuration Language (HCL) which has influences from both JSON and Python. To a typical JavaScript developer, Terraform is like a package.json file but for infrastructure.
Even though Terraform is developed by for-profit HashiCorp, it is an open-source framework. It pairs nicely with other HashiCorp products like Vagrant and Consul, but Terraform is often used stand-alone.
Notably, Terraform enables developers to take an "immutable infrastructure" approach. If you need to change an existing resource, you don't modify it. Instead, you create a new one and once the new version is available, destroy the old one. This approach helps eliminate a whole class of failure states that can occur if you try to modify a resource in place.
One more core tenet of Terraform is that it is agentless. It doesn’t require a daemon process to operate; it directly integrates with a cloud platform like AWS or GCP using something known as a provider library. Providers for larger cloud platforms are developed by HashiCorp itself; for smaller cloud platforms, they might be developed by the general Terraform community. This open-ended approach makes Terraform a very vendor-agnostic product.
Ansible is billed as an imperative IT automation tool. Ansible is managed by another massive for-profit entity (Red Hat), but is also open-source like Terraform. Unlike Terraform, Ansible uses a general-purpose, widely adopted format (YAML) for specification files. Ansible is typically executed via Ansible Playbooks, which specify how infrastructure should be maintained and operated.
Ansible is also agentless like Terraform, connecting to AWS via Access and Secret Keys, which can be managed by a vault solution like Ansible Vault. Ansible has two core files—inventory, which informs Ansible of available resources, and playbooks, which are manifest files that detail what will happen on those resources.
Because Ansible is procedural, a playbook is executed from top to bottom. If a task acts on something before it has been initialized, Ansible will return an error. To a typical JavaScript developer, Ansible is like a Node app.js file.
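The top-to-bottom behavior can be sketched with a toy task runner in Python. This is not Ansible's engine, and the task schema (`name`, `requires`, `provides`) is a hypothetical one invented for the illustration; it only shows why ordering matters in a procedural model.

```python
def run_playbook(tasks):
    """Run tasks strictly in listed order; fail on the first unmet requirement."""
    ready = set()
    for task in tasks:
        missing = [r for r in task.get("requires", []) if r not in ready]
        if missing:
            raise RuntimeError(
                f"{task['name']}: not yet available: {', '.join(missing)}"
            )
        # Pretend the task ran and record what it made available.
        ready.add(task["provides"])
    return ready


# Hypothetical tasks: the second depends on the first.
tasks_ok = [
    {"name": "install nginx", "provides": "nginx"},
    {"name": "start nginx", "requires": ["nginx"], "provides": "nginx_running"},
]

print(sorted(run_playbook(tasks_ok)))  # ['nginx', 'nginx_running']

try:
    # Same tasks in the wrong order: the runner does not reorder them.
    run_playbook(list(reversed(tasks_ok)))
except RuntimeError as err:
    print(err)  # start nginx: not yet available: nginx
```

The runner never reorders anything for you, so getting the sequence right is the author's job.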
Another competitor in the space is CloudFormation, a proprietary tool developed by AWS. CloudFormation is declarative like Terraform; its biggest differences with Terraform surround how templates and runtimes are organized. Terraform extends a single-source-of-truth philosophy; CloudFormation is built around versioning and templating.
CloudFormation supports both JSON and YAML. What an over-achiever.
Another competitor in the IaC space is Chef. Chef is often compared directly to Terraform because it boasts similar tenets (declarative, open-source) but with a mutable model. Unlike Terraform, where servers are never modified and are instead replaced, Chef alters existing servers. In short, Chef offers a hybrid of the Ansible and Terraform approaches, borrowing the declarative nature of Terraform but enabling mutable servers like Ansible.
A common point of confusion between Ansible and Terraform is their overlap. While they are fundamentally different products within the scope of IaC, they can do some similar things. For instance, both Ansible and Terraform can start resources, scale resources, and stop resources.
The difference is that Terraform excels at provisioning infrastructure and ensuring that it stays in line with the Terraform declaration. This comes down to the fact that, on each run, Terraform reconciles the infrastructure’s existing state with the declared state. There is no resource-not-defined issue with Terraform; .tf files are quite literally a single source of truth. But Terraform isn’t built to install Apache or start NGINX, for example. Ansible’s procedural, line-by-line nature enables it to install software with dependencies. So rather than relying on having a fully customized container or VM image available for Terraform, you could use Ansible to do the last-mile installation and configuration after the resource is provisioned.
Ansible has inventory and playbook (manifest) files. The inventory establishes what’s available; the playbook dictates what is done on those resources. Meanwhile, Terraform just executes its manifest file and records the created resources in a state file. For Terraform and Ansible to work together, these sub-processes need to interact.
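The nice thing is that either Terraform or Ansible could be the base platform! One doesn’t necessarily need to supersede the other. Either Terraform drives the workflow and hands off to Ansible once the resources exist, or Ansible drives the workflow and calls Terraform as one of its steps.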
CPrime has an excellent guide that covers how to implement each option.
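One way the two sides can meet is for Terraform's outputs to feed Ansible's inventory. The Python sketch below turns the JSON printed by `terraform output -json` into an INI-style inventory group. The output name `web_ips`, the group name `web`, and the sample addresses are assumptions made up for this illustration, and a real setup may well prefer a dynamic inventory plugin instead.

```python
import json


def inventory_from_outputs(outputs_json: str, output_name: str, group: str) -> str:
    """Build an INI-style Ansible inventory group from Terraform output JSON."""
    outputs = json.loads(outputs_json)
    # `terraform output -json` wraps each output in an object with a "value" key.
    hosts = outputs[output_name]["value"]
    lines = [f"[{group}]"] + [str(host) for host in hosts]
    return "\n".join(lines) + "\n"


# Hypothetical sample of what `terraform output -json` might print.
sample = json.dumps({
    "web_ips": {
        "sensitive": False,
        "type": ["list", "string"],
        "value": ["10.0.1.10", "10.0.1.11"],
    }
})

print(inventory_from_outputs(sample, "web_ips", "web"), end="")
# [web]
# 10.0.1.10
# 10.0.1.11
```

The resulting text could be saved to a file and passed to `ansible-playbook` with `-i`, so the hosts Terraform created become the hosts Ansible configures.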
One benefit to Terraform's declarative approach is that certain details of how infrastructure is spun up can be ignored by the developer, decreasing cognitive load. Bob, from Pied Piper, doesn’t care if the Notification Server or API Server went online first; just that they can work together on the same VPC once the infrastructure is ready. Terraform makes life easier with declarative abstractions but the best part might be that you don't need to spend so much time clicking around the AWS web console.
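The ordering that Bob gets to ignore still has to be worked out by something. As a rough sketch of the idea, a declarative tool can derive a valid creation order from declared dependencies. The Python example below uses the standard library's `graphlib`; the resource names and the dependency map are hypothetical, and this is not Terraform's actual graph code.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on (hypothetical names).
depends_on = {
    "vpc": set(),
    "api_server": {"vpc"},
    "notification_server": {"vpc"},
}


def creation_order(graph):
    """Return one valid order in which every dependency comes first."""
    return list(TopologicalSorter(graph).static_order())


order = creation_order(depends_on)
print(order[0])  # vpc
```

Whether the API server or the notification server comes second is left to the sorter, which mirrors the point above: only the constraint that both need the VPC was declared.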
So if managing infrastructure is well suited to a declarative approach, is Ansible's imperative approach the best way to manage system configuration? Dockerfiles and Chef Recipes are popular configuration management tools that also take an imperative approach. But before we jump to conclusions, can we find examples from the declarative world of configuration management? In 2005, Puppet was introduced and offered a view into what a declarative configuration management future could look like. Today, all you need to do is review a Packer script to see how HCL can be used to declaratively manage server configuration, similar to how Terraform uses it to manage infrastructure. In the future, it might be Nix that proves declarative to be the superior approach as it gains mass adoption. There are plenty of Nix fans who hope it will. Until then, both paradigms are popular and widely used in configuration management.
Oh boy. Another article that tries to connect GPT to literally anything. But I predict that GPT will impact how we view declarative code. If an advantage of declarative code is that it abstracts away many details, a downside is that it becomes less obvious what is happening behind the scenes. As a result, declarative code can be difficult to debug. On the other hand, highly detailed, procedural instructions might take longer to read through, but they are usually easier to debug. And this is where generative AI can help. A Terraform file can be easily interrogated using ChatGPT, providing a convenient and useful window into what happens behind the scenes.
As an example, I asked GPT to create two databases within the same VPC, and it generated the following .tf file:
Then, I asked GPT if all the databases in the above declaration are within the same VPC. The result was very positive:
Yes, all databases in the above Terraform declaration are within the same VPC. The aws_db_subnet_group resource is used to specify which subnets the databases should be created in, and it lists aws_subnet.example1.id and aws_subnet.example2.id, both of which are within the aws_vpc.example VPC.
Terraform’s declarative nature makes it an excellent tool for manifesting the perfect infrastructure into existence (pun intended). Ansible’s imperative nature, meanwhile, is excellent for installing and running software on servers at scale. Terraform is great for teams that need complex infrastructure without room for error; Ansible is ideal for teams that need to run operations on many resources scalably. Often, however, both tools are used together: Terraform for provisioning infrastructure, and Ansible for managing it.
And, in a few years, our LLM overlord GPT might be managing both products.