Infrastructure Automation for Oklahoma Data Teams: A Practical Comparison

Managing infrastructure for data workloads is genuinely harder than it looks from the outside. You are not just spinning up servers. You are orchestrating pipelines, warehouses, compute clusters, networking, and often a mix of on-premise hardware and cloud resources, sometimes all at once. Getting that infrastructure wrong means broken pipelines, data loss, and engineers spending their Fridays doing manual work that should have been automated months ago.

The good news is there are real tools for this. The less good news is that the landscape has fragmented significantly over the past few years and the right choice depends heavily on your team, your environment, and how you actually work day to day.

We have settled on OpenTofu as our default at InFocus Data, but we have worked with most of the major options across client environments. Here is an honest look at what is out there and what actually makes sense for the kinds of companies that make up most of Oklahoma’s data landscape: energy companies running hybrid environments, mid-market businesses with small engineering teams, and organizations where the data engineer is also doing infrastructure work because there is not a dedicated DevOps function.


What We Mean by Infrastructure Automation

Infrastructure automation is the practice of writing code to provision, configure, and manage your servers, networks, databases, and related resources, rather than doing those things by hand through a console or SSH session.

This is a meaningful distinction. Most teams manage infrastructure manually for a long time without it feeling like a problem. Someone knows how a server is configured because they built it. A VM gets created in vSphere by clicking through the wizard. A cloud instance gets stood up, someone installs the dependencies, and it works. That approach scales fine until it does not.

Infrastructure automation covers two related but distinct problems:

Provisioning is creating and defining the resources themselves. What cloud instances exist, how networks are segmented, what storage volumes are attached, what firewall rules are in place. Tools like OpenTofu, Pulumi, and CloudFormation work at this layer.

Configuration management is what happens on those resources after they exist. What packages are installed, what services are running, how application configuration is laid out. Tools like Ansible work at this layer.

A complete infrastructure automation strategy usually needs both. Most provisioning tools do not configure what is running inside a server. Most configuration management tools do not create the server in the first place. Knowing which problem you are solving helps you pick the right tool before you invest time in the wrong one.
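To make the split concrete, here is a hypothetical provisioning-layer definition in HCL, the language OpenTofu uses. Everything in it (the resource name, the AMI ID, the instance type) is an illustrative placeholder, not a working configuration:

```hcl
# Provisioning layer: declare that a server exists and what shape it has.
resource "aws_instance" "pipeline_worker" {
  ami           = "ami-0123456789abcdef0"  # placeholder image ID
  instance_type = "t3.medium"

  tags = {
    Name = "pipeline-worker"
  }
}

# Everything that happens *inside* this instance after it boots --
# packages, services, application config -- belongs to the
# configuration management layer (e.g. an Ansible playbook).
```

The declaration says nothing about what runs on the instance; that boundary is exactly where provisioning hands off to configuration management.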


Why Infrastructure Automation Matters for Data Teams

Data infrastructure changes constantly. Pipelines get added. New data sources show up. You add a warehouse, a compute cluster, another environment for staging. Without a way to codify how all of that is provisioned and configured, you end up with what most shops actually have: a mix of stuff that was set up manually, documented inconsistently or not at all, and that nobody fully understands anymore.

That situation creates specific problems that are especially painful for data workloads:

Environments that drift apart. Your dev environment and your production environment start identical, then diverge over months of manual changes. Pipelines work locally and break in prod. Nobody can explain why because nobody tracked what changed. Debugging becomes archaeology.

Knowledge locked in one person. If the person who built your Airflow cluster or your Spark environment leaves, can someone else rebuild it from scratch? In most shops the honest answer is no. That is not just a bus factor risk. It is also a scaling problem: as you add workloads, you need more infrastructure, and if provisioning requires a specialist every time, it becomes a bottleneck.

No way to recover cleanly. When something breaks at the infrastructure level, the question is not just can you fix it but can you reconstruct it. Manual environments get rebuilt from memory and guesswork. Automated environments get rebuilt from code, the same way every time.

Audit and compliance gaps. Regulated industries and large enterprise environments increasingly need to demonstrate control over their infrastructure. That means knowing what changed, when, and who authorized it. Manual processes do not produce that record automatically.

Slower iteration. Spinning up a new environment for a new project or a new pipeline should be fast. If it requires a support ticket, a manual process, and a few days of back-and-forth, teams work around it, which usually means sharing environments in ways that cause other problems.

Infrastructure automation solves all of these by treating your infrastructure like code. It lives in version control. Changes go through review. You can spin up a copy of your environment, tear it down, or rebuild it from scratch. For data teams especially, that reproducibility matters because your infrastructure is part of your pipeline.


OpenTofu (and Terraform)

OpenTofu is the open-source fork of Terraform, created after HashiCorp changed Terraform’s license in 2023 from the Mozilla Public License to the Business Source License. For practical purposes, OpenTofu and Terraform are functionally identical right now. They share the same HCL syntax, the same provider registry, and nearly the same behaviors. OpenTofu is just the one that will stay open source.

If you are starting fresh today, start with OpenTofu. If you are running Terraform and have not had a specific reason to migrate yet, it is not urgent, but it is worth planning. We are in the middle of that migration ourselves.

What it does well: The provider ecosystem is unmatched. There is a Terraform and OpenTofu provider for nearly everything, including OpenStack, AWS, Azure, GCP, Kubernetes, and most major SaaS platforms. The declarative model is approachable for infrastructure teams that are not primarily software developers. State management, while a source of pain at times, gives you a real record of what exists and what changed.

The rough edges: HCL is not a general-purpose language. Once you need real logic, conditional configurations, dynamic lookups, or anything more than straightforward resource definitions, you end up fighting the language. State file management across a team gets complicated quickly. Remote state on an object store like S3 or a dedicated backend tool like Lynx handles it, but it is a decision you need to make early and one more thing to operate. OpenTofu does have provisioners that can run scripts or copy files to a host after it is created, so basic post-provisioning configuration is possible, but the official documentation describes provisioners as a last resort, and in practice they are fragile enough that most teams reach for Ansible instead once the requirements get more complex.
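For reference, the early remote-state decision looks something like this in practice. This is a hedged sketch assuming an S3 bucket and a DynamoDB lock table you have already created; the bucket, key, and table names are placeholders:

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tofu-state"            # pre-created bucket (placeholder name)
    key            = "data-platform/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tofu-state-lock"               # optional table for state locking
    encrypt        = true                            # encrypt state at rest
  }
}
```

OpenTofu reads the same `terraform` block Terraform does, which is part of why migration between the two is low-friction. The locking table matters as soon as two people can run an apply at the same time.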

Best for: Teams with existing infrastructure expertise, mixed cloud and on-prem environments, and situations where broad provider support matters. This is the default starting point for most teams and it does not require much justification.


Pulumi

Pulumi does roughly what Terraform does but lets you write infrastructure in a real programming language: Python, TypeScript, Go, Java, or C#. You are not writing HCL templates. You are writing actual code that provisions resources.

This sounds appealing until you have used it for a while. The power is real, but so are the tradeoffs.

What it does well: Complex logic is dramatically easier. Loops, conditionals, abstractions, anything that requires real programming, Pulumi handles cleanly. For Python-heavy data engineering teams, the ability to write infrastructure in the same language as your pipelines has genuine value. The provider ecosystem is solid and it handles both cloud and on-premise resources.
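As a sketch of what "real programming" buys you, here is the kind of loop that is awkward in HCL and trivial in Pulumi's Python SDK. This assumes a configured Pulumi project with the `pulumi` and `pulumi_aws` packages installed; the bucket names and environments are illustrative:

```python
import pulumi
import pulumi_aws as aws

# One artifact bucket per environment -- a plain Python loop,
# no HCL count/for_each gymnastics. Names are placeholders.
for env in ["dev", "staging", "prod"]:
    bucket = aws.s3.Bucket(
        f"pipeline-artifacts-{env}",
        tags={"environment": env},
    )
    pulumi.export(f"bucket_{env}", bucket.id)
```

Anything else Python can do (read a config file, call an internal API, share helpers with your pipeline code) is available here too, which is the core of the appeal for Python-first data teams.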

The rough edges: The state management model is similar to Terraform's, but you are now tied to Pulumi's hosted service unless you self-host the state backend. The debugging experience when something goes wrong is often harder than with OpenTofu because the abstraction layer adds distance between you and what is actually happening. There is also a real programming skill requirement: if your infrastructure team is not comfortable writing Python or TypeScript, this creates more problems than it solves.

Best for: Data engineering teams that are already Python or TypeScript developers, situations involving complex conditional resource creation, or teams that want to colocate infrastructure code with application code in the same repository. If your team would rather write a function than a template, Pulumi is worth a serious look.


Ansible

Ansible is a configuration management and automation tool, not a provisioning tool in the same sense as Terraform or Pulumi. But for many teams, it fills gaps those tools leave behind.

Where OpenTofu defines what resources exist, Ansible configures what is running on them: installing packages, managing services, deploying applications, pushing config files, running post-provisioning steps. That is where Ansible lives.

What it does well: Agentless. Nothing to install on managed nodes beyond SSH access and a Python interpreter, which nearly every Linux system already ships with. YAML playbooks are readable by people who are not software engineers, which matters in shops where the infrastructure team comes from a sysadmin background. It handles bare metal, VMs, and cloud instances equally well and is excellent for day-2 operations like OS patching, config updates, and service restarts.

The rough edges: It is procedural, not declarative. Running a playbook twice can produce different results depending on current state. Idempotency is achievable but requires deliberate effort from whoever is writing the playbooks. Large playbooks tend to accumulate and become difficult to maintain over time.

Best for: Configuration management alongside OpenTofu, bare metal environments, shops with existing Linux admin expertise, and situations where you need to automate steps after resources are provisioned. OpenTofu can cover basic host configuration through its provisioners, and for simple cases that is enough. Ansible becomes the better choice when configuration logic grows beyond a few commands, when you need to manage configuration across many hosts consistently, or when you want configuration to be testable and reusable independent of provisioning. For most data teams, OpenTofu provisions the infrastructure and Ansible configures it.
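A minimal playbook sketch for the configuration side of that split. The host group, package names, and service name are placeholders, and this assumes an inventory that defines the group:

```yaml
# Configure freshly provisioned pipeline workers.
# Built-in modules like package and service are idempotent by design:
# rerunning this play only changes hosts that have drifted.
- name: Configure pipeline workers
  hosts: pipeline_workers        # placeholder inventory group
  become: true
  tasks:
    - name: Install required packages
      ansible.builtin.package:
        name:
          - python3-pip
          - postgresql-client
        state: present

    - name: Ensure the pipeline service is running
      ansible.builtin.service:
        name: airflow-worker     # placeholder service name
        state: started
        enabled: true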


AWS CloudFormation

CloudFormation is AWS’s native infrastructure automation tool. It ships with your AWS account, costs nothing extra, and has deep integration with every AWS service. If you are running purely on AWS and have no plans to leave, it is worth knowing.

What it does well: Native service integration is hard to beat. New AWS features often appear in CloudFormation before they show up in third-party providers. No external tooling to install or maintain. Stack management and drift detection are genuinely useful features for AWS-native environments.
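For a sense of the template format, here is a minimal illustrative stack; the bucket name is a placeholder and would need to be globally unique in practice:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: S3 bucket for pipeline artifacts
Resources:
  ArtifactBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: example-pipeline-artifacts   # placeholder, must be globally unique
      VersioningConfiguration:
        Status: Enabled
```

Even this small example hints at the verbosity problem: real stacks with networking, IAM, and compute run to hundreds of lines of this.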

The rough edges: It is AWS only. The moment you have anything outside of AWS (on-premise resources, a second cloud, third-party services), you need another tool anyway. The YAML and JSON template syntax is verbose and tedious to work with at scale. Error messages when something fails are notoriously unhelpful, often requiring significant time to trace back to the actual problem.

Best for: Teams that are 100% AWS, have no on-premise resources, and value deep native integration above all else. If there is any chance you will need to manage non-AWS resources, you will eventually outgrow CloudFormation and wish you had started elsewhere.


AWS CDK

The AWS Cloud Development Kit lets you write CloudFormation infrastructure using TypeScript, Python, Java, or Go. It compiles down to CloudFormation templates, so you get native AWS integration with a real programming language instead of YAML.

What it does well: Much better developer experience than raw CloudFormation. If your team is already writing TypeScript or Python, CDK feels natural. Constructs (reusable components) let you build sensible abstractions and share infrastructure patterns across projects.
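A sketch of the same idea in CDK's Python flavor, assuming `aws-cdk-lib` (CDK v2) is installed. The stack and bucket names are illustrative:

```python
from aws_cdk import App, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct

class PipelineStorageStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # One bucket per environment; this whole class synthesizes
        # down to an ordinary CloudFormation template.
        for env in ["dev", "prod"]:
            s3.Bucket(self, f"ArtifactBucket-{env}", versioned=True)

app = App()
PipelineStorageStack(app, "PipelineStorage")
app.synth()
```

The loop and the class are the selling point; the generated CloudFormation underneath is what you will be reading when a deploy fails.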

The rough edges: Still AWS-only. When something goes wrong, you often need to understand what CloudFormation template your code generated, which adds a layer of indirection to debugging. The underlying CloudFormation limitations do not go away just because you wrote Python on top of them.

Best for: Developer-heavy teams committed to AWS that want the native integration without writing raw CloudFormation. If your engineers would rather write code than templates and you are not going to leave AWS, CDK is the most productive path. Just understand you are accepting AWS lock-in by going this direction.


Crossplane

Crossplane is infrastructure automation built on Kubernetes. You describe infrastructure resources using Kubernetes custom resource definitions, and Crossplane reconciles actual state to match your declarations. GitOps-native by design.
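To show what "infrastructure as a Kubernetes resource" looks like, here is a hedged sketch of a managed resource. It assumes the Upbound AWS provider is installed and a `ProviderConfig` named `default` exists; the API group and version vary between provider distributions and releases:

```yaml
apiVersion: s3.aws.upbound.io/v1beta1   # provider-specific; check your installed provider
kind: Bucket
metadata:
  name: pipeline-artifacts
spec:
  forProvider:
    region: us-east-1
  providerConfigRef:
    name: default
```

You apply this with kubectl like any other manifest, and the provider's controller continuously reconciles the real bucket against it.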

What it does well: If you are already running Kubernetes heavily, Crossplane fits naturally into a GitOps workflow. Everything becomes a Kubernetes resource. You get the Kubernetes reconciliation loop, which is a fundamentally different model from Terraform’s point-in-time applies and can be more reliable in environments where state drift is a persistent concern.

The rough edges: The learning curve is steep. You need to understand Kubernetes reasonably well before Crossplane adds value rather than complexity. Provider maturity varies significantly. The ecosystem is not as broad as OpenTofu’s. For teams not already invested in Kubernetes, this is a substantial amount of infrastructure to adopt just for infrastructure automation.

Best for: Platform engineering teams running Kubernetes-heavy environments. Most Oklahoma companies do not have a dedicated platform engineering function, and building one around Crossplane is a significant investment to take on before you have the team size to justify it.


On-Premise and Private Cloud Considerations

Cloud-only infrastructure automation is simpler in some ways. The providers are mature, the documentation is everywhere, and managed services handle a lot of complexity for you.

On-premise is messier. You are dealing with actual servers, network configuration, storage systems, and often older hardware that was not designed with automation in mind.

For teams running OpenStack, Terraform and OpenTofu have a solid provider. You can manage VMs, networks, security groups, floating IPs, and storage volumes declaratively the same way you would manage AWS or Azure resources. This is how we approach it for our own infrastructure and for clients running private cloud environments. The OpenStack provider documentation can lag behind OpenStack releases, and some advanced features require careful handling, but the core operations work reliably.
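A minimal sketch of an OpenStack instance managed through that provider. Image, flavor, and network names are specific to each environment, so everything here is a placeholder:

```hcl
resource "openstack_compute_instance_v2" "etl_worker" {
  name        = "etl-worker-01"
  image_name  = "ubuntu-22.04"    # placeholder image in your Glance catalog
  flavor_name = "m1.large"        # placeholder flavor

  network {
    name = "data-internal"        # placeholder tenant network
  }
}
```

The workflow (plan, review, apply) is identical to what you would run against AWS or Azure, which is the point of using one tool across the hybrid environment.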

For environments without a private cloud layer, Ansible becomes more important. Direct VM or bare metal configuration, OS-level management, and service orchestration need a procedural tool, and Ansible is the most widely understood option for that work.

VMware vSphere environments have a Terraform provider that covers most common operations. It is workable, though the vSphere API adds complexity you do not encounter with cloud providers. If your shop runs vSphere and is seriously considering infrastructure automation adoption, plan for some extra time in the initial setup.

The practical combination that works across most hybrid environments: OpenTofu for resource provisioning across both cloud and on-premise, Ansible for configuration and day-2 operations. It is not the most sophisticated stack, but it is one that most teams can actually operate without constant specialist involvement.
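Day to day, that two-tool combination is a short workflow. The inventory path and playbook name here are hypothetical:

```sh
# Provision: review the planned changes, then apply exactly that plan.
tofu plan -out=plan.out
tofu apply plan.out

# Configure: run the playbook against the hosts that now exist.
ansible-playbook -i inventory/production configure.yml
```

The saved plan file is worth the extra step: what you reviewed is exactly what gets applied, and the plan output doubles as the change record for the audit trail mentioned earlier.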


What Makes Sense for Different Team Types

Oklahoma’s business landscape shapes which tools are actually practical here. The state’s economy is dominated by energy, and outside of oil and gas you are mostly looking at healthcare, agriculture, aviation, and a range of mid-market companies that do not have large dedicated engineering organizations. That means most data teams are small, infrastructure work often falls to the same people doing data engineering, and the tooling has to be maintainable without a full DevOps headcount.

A few patterns worth calling out specifically:

Small teams where data engineers also own infrastructure: This is the most common situation in Oklahoma outside of the largest operators. One or two engineers are responsible for pipelines, warehouses, and the infrastructure running both. OpenTofu with remote state is the right call here. It is approachable, well-documented, and does not require a separate specialist to maintain. Adding Ansible for configuration handles the cases OpenTofu does not.

Oil and gas operators with hybrid environments: Energy companies in Oklahoma commonly run operational technology on-premise (production data, SCADA systems, field data) and use cloud for analytics and reporting. The data is often sensitive from a competitive standpoint and subject to regulatory requirements, so there are good reasons to keep some of it off public cloud entirely. OpenTofu handles this well because the same tool and workflow works against both OpenStack or vSphere on-prem and AWS or Azure in the cloud. The plan and apply workflow also gives you an audit trail of infrastructure changes, which matters when demonstrating control to auditors or regulators.

Mid-market companies without a dedicated DevOps team: The same recommendation applies: OpenTofu plus Ansible. Pulumi is a reasonable alternative if the team is Python-first and finds HCL genuinely painful, but the added complexity of managing Pulumi’s state backend and debugging abstraction layers is harder to justify when there are only a few engineers who will ever touch the infrastructure code.

Developer-heavy product companies: These exist in Oklahoma, particularly in the OKC and Tulsa startup ecosystems, though they are not the dominant profile. Pulumi makes more sense here, especially for teams that want infrastructure code to live alongside application code in the same repository and go through the same review process.

Larger operators with dedicated platform teams: The bigger energy companies and healthcare systems in Oklahoma sometimes have the engineering headcount for a more sophisticated approach. Crossplane becomes relevant at this scale, particularly if Kubernetes is already a first-class part of the infrastructure. Below that level of investment, it adds more complexity than it solves.


The Honest Take

If you are starting from zero today, start with OpenTofu. The provider ecosystem covers everything you will need, it handles both cloud and on-premise resources, and there is a large community with real-world solutions to real-world problems. Add Ansible where you need configuration management and leave it at that until you have a specific reason to do otherwise.

For most Oklahoma companies, the constraint is not which tool has the best feature set. It is which tool a small team can actually own without it becoming a maintenance burden on top of everything else they are already responsible for. Pulumi is worth a serious look if your team is Python-first and finds HCL genuinely frustrating. CloudFormation and CDK make sense if you are genuinely AWS-only and intend to stay that way. Crossplane requires a level of Kubernetes investment that most teams here are not running and should not take on just for infrastructure management.

The best choice is the one people on your team will use consistently, understand well enough to fix when it breaks, and not abandon six months in because it was built for an engineering organization three times your size.

