Engineers are great at making hard things possible. Then repeatable. Then automatable. Then run on their own. Software development has repeated this cycle over and over. We build a new capability, make it easier to repeat with scripts, let it run on its own with fixed functions, and then abstract it into code so that we no longer have to think about it.

Today we build workflows decoupled from the technologies themselves, with code that is expressive and declarative, based on end goals and desired states.

But when it comes to the way we operate and scale the underlying infrastructure that powers our software, we look more like a Formula One pit crew: a high-performance team of specialists who must continually triage, tune, adjust, and repair.

Infrastructure at our fingertips

The cloud has transformed how we build software. With a single API call, you can spin up more compute resources than existed in all of the last decade combined. Developers went from thinking about physical machines to living in a world of network, compute, and storage as virtual resources. These resources have led to a generation of “cloudified” (serverless) services and “containers” (self-contained runtime environments) for almost every part of the developer stack, reducing operational cost, complexity, and engineering lead times.

And there is a lot to like about this new world of containers and serverless. Powerful open source and commercial projects such as Git, Jenkins, Cloudflare, Terraform, Puppet/Chef/Ansible, and Kubernetes have dramatically improved how we deploy infrastructure and applications and how we manage workflows. (Disclosure: My firm is an investor in Cloudflare.)

That’s until you need to start stitching them all together. At that point, you enter an unwieldy sprawl of static configurations, scripts, and files.

Configuration sprawl

If this all sounds overly complex, that’s because it is. Before software is even written, it can take hours to configure and deploy the underlying cluster or workflow. Workflows are built on the back of static scripts and configuration files, lacking versioning or testability. Scale and performance require manual tuning. Configuration files and documentation go out of sync faster than DevOps can keep up. Triaging errors and failures requires hours of manual tracing. Best practices and patterns are hard to enforce. With constant operator intervention needed to keep it all running, the teams building and maintaining infrastructure are often larger than the teams building the services or applications on top of it.

This complexity is becoming magnified in a COVID-19 world as companies deal with impacted operating plans and distributed workforces. With limited resources and increasing demand load, unbundling the complexity and eliminating the manual and repetitive tasks that plague DevOps and engineering teams has never been more important.

We’ve been stuck with tools and configurations that fixate on the underlying technologies, not on workflows describing end goals and desired states.

This presents us with an opportunity to build a new generation of tools and workflows that will programmatically operate infrastructure the same way we use code to build our software.

Everything as code

The idea of “everything as code” (EaC) is emerging across each layer of the stack. It involves writing infrastructure as software, where everything is expressed with code.

With code, you can describe the end result or the series of steps to take. It’s declarative and expressive. EaC introduces a world where our infrastructures, workflows, and services start to demonstrate the mature, programmatic, and resilient patterns we’re used to:

  • Version controlled and immutable
  • Maintainable, testable, and collaborative
  • Modular, composable, and separate
  • Auto-scalable, with dynamic resource pooling
  • Predictable and consistent
  • Checked by linters and static analysis to enforce consistency
  • Self-healing, with graceful failover
  • No need for constant operator intervention
  • Secure and upgradable

EaC changes the focus from manual, repetitive tasks to workflows based on end goals and desired states, bringing how we manage infrastructure closer to the maturity of how we build software.
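To make "desired state" concrete, here is a minimal sketch in Python of the pattern that declarative tools tend to follow: describe what you want, compare it with what exists, and derive the steps. It is an illustration under stated assumptions, not how any particular product works; the Service type and the plan and apply functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical description of a service: a name and a replica count."""
    name: str
    replicas: int


def plan(desired: dict[str, Service], actual: dict[str, Service]) -> list[str]:
    """Compare desired state with actual state and list the actions needed."""
    actions = []
    for name in sorted(desired):
        want = desired[name]
        have = actual.get(name)
        if have is None:
            actions.append(f"create {name} with {want.replicas} replicas")
        elif have.replicas != want.replicas:
            actions.append(f"scale {name} from {have.replicas} to {want.replicas}")
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"delete {name}")
    return actions


def apply(desired: dict[str, Service], actual: dict[str, Service]) -> dict[str, Service]:
    """Stand-in for real provisioning: the converged state equals the desired state."""
    return dict(desired)


# The operator declares the end goal; the steps are derived, not hand-written.
desired = {"web": Service("web", 3), "worker": Service("worker", 2)}
actual = {"web": Service("web", 1), "cache": Service("cache", 1)}

for action in plan(desired, actual):
    print(action)
```

Because the plan is computed from the difference between the two states, running it again after convergence yields no actions. That idempotence is what removes the need for an operator to remember which steps have already been taken.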

Companies like HashiCorp, Cloudflare, Confluent, Amazon, Puppet Labs, Astronomer, and more, along with a thriving open source community, are fundamentally changing how we deploy and operate servers, environments, containers, pipelines, and more, all as code.

EaC represents a major shift and catch-up. The cloud unlocked new capabilities to build software faster, cheaper, and better. The velocity of new capabilities grew faster than we could imagine. The faster we adopted them, the more the software and tools used to operate them buckled at the knees. Our legacy workflows can no longer keep up with the speed of the cloud. Now is the time to catch up and build a new wave of tools, capturing everything as code.

Ethan Batraski is a Partner at Venrock.

