Infrastructure as code in government and regulated industries

From IC Insider HashiCorp

Explore policy as code & security models for terraforming

Modern information technology (IT) abstracts infrastructure from physical hardware using logical constructs programmed in software. This abstraction accelerates the shift from static, on-premises, physical infrastructures to logical, distributed, software-defined resources that can be provisioned and de-provisioned rapidly using infrastructure as code (IaC).

IaC defines infrastructure in software, effectively codifying IT resources, systems, and services into machine and human-readable formats. With IaC, a user can provision, change, and version “logical” resources (software-defined versus hardware) in any environment.

Most major cloud service providers (CSPs) have integrated IaC tools within their own vertical platforms, such as Amazon Web Services (AWS) CloudFormation or Microsoft Azure Resource Manager. However, these tools do not scale horizontally across multiple CSPs, and they do not support traditional data centers or bare metal. Terraform provides IT administrators with the horizontal, multi-cloud workflow they need.

Source: HashiCorp

IT and DevOps professionals in highly regulated industries must select their IaC tools carefully, weighing future requirements (including threats) so that the tools remain flexible enough to adapt quickly as teams organize and scale.

CISOs and security administrators need proof that policies and governance are being enforced at scale; without it, they end up writing and managing those policies in non-standard, non-automated ways. IaC provides two primary benefits to security teams in regulated industries.

1. Security automation using policy as code, which removes traditional barriers to secure application delivery by codifying infrastructure and the rules that govern it.

2. A robust security model for provisioning infrastructure in a standardized and compliant way (illustrated below with five examples).

IaC effectively creates immutable infrastructure and encourages a best practice that leverages policy as code to complement a robust security model. Policy as code supports a well-defined security model, enables DevOps practices, and increases speed to the mission or market.
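To make the idea of policy as code concrete, here is a minimal sketch in Python. It is an illustration only, not how Terraform's commercial policy engines are written. It assumes a plan that has been exported as JSON with `terraform show -json` and checks one hypothetical rule: S3 bucket ACLs must not be public. The rule, the function name `find_violations`, and the hand-written sample plan are all assumptions made for this example.

```python
from __future__ import annotations

import json

# Hypothetical policy for this sketch: S3 bucket ACLs must not be public.
FORBIDDEN_ACLS = {"public-read", "public-read-write"}


def find_violations(plan: dict) -> list[str]:
    """Return the addresses of planned resources that break the policy."""
    violations = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        # "after" is null in the plan JSON when a resource is being deleted.
        after = change.get("after") or {}
        if rc.get("type") == "aws_s3_bucket_acl" and after.get("acl") in FORBIDDEN_ACLS:
            violations.append(rc["address"])
    return violations


# Trimmed, hand-written sample in the shape of `terraform show -json` output.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket_acl.logs", "type": "aws_s3_bucket_acl",
     "change": {"actions": ["create"], "after": {"acl": "private"}}},
    {"address": "aws_s3_bucket_acl.site", "type": "aws_s3_bucket_acl",
     "change": {"actions": ["create"], "after": {"acl": "public-read"}}}
  ]
}
""")

if __name__ == "__main__":
    for address in find_violations(SAMPLE_PLAN):
        print(f"DENY {address}: public ACL is not allowed")
```

Because the rule lives in code, it can be versioned, reviewed, and run automatically before every apply, which is the property the security team needs as proof of enforcement.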

In the following sections, we will explore this security model in depth with five examples. The goal is to help you understand the risk of using Terraform OSS without a proper security model, and to help you standardize and future-proof your IaC.

Example One

Bad Example – Bob is a platform engineer supporting a government organization that is early in its cloud adoption journey. Bob uses Amazon S3 and Amazon DynamoDB to store his state file (the state file is how IaC tracks the current “state” of the provisioned infrastructure), but he shares the S3 bucket with Jane, who has the same write access to it. Jane leaves the company, and her access is revoked. None of the changes Jane made while employed can be rolled back, which causes production errors.

The IAM architect must then grant Bob elevated privileges to fix the state and try to create a similar role for a new hire. The change causes more issues with state management, and the company loses time and money dealing with the complex, custom architecture, also known as scaffolding. During this time, the authorization policy model is repeatedly broken in the panic of recovering from outages. Multiple non-FTEs are granted access to the production state, which should be treated as a secret. All of them can obtain sensitive information, a fact later uncovered in an investigation and audit that brings costly fines for the company.

Suitable Example – Bob builds a business case with his manager Amanda, the Platform PM, to find a way to save time and money and reduce risk. They show that every workspace will have a logical security boundary within the organization, gated with policy as code, which allows developers on contract to speed up their work. All transactions are audited and logged seamlessly within a set authorization policy model. The state is versioned in a Git repo and protected by advanced encryption suited to regulated industries (FIPS 140-2). They go all year without an outage, investigation, fines, or loss of FTEs.
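The combination of a per-workspace boundary and an audit trail can be sketched as follows. This is a simplified model under stated assumptions, not the authorization system of any Terraform product: the `Workspace`, `AuditLog`, and `authorize` names, the team names, and the action names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Workspace:
    name: str
    # Hypothetical model: team name -> actions that team may perform here.
    grants: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class AuditLog:
    records: list[dict] = field(default_factory=list)

    def write(self, user: str, team: str, workspace: str, action: str, allowed: bool) -> None:
        self.records.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "team": team,
            "workspace": workspace,
            "action": action,
            "allowed": allowed,
        })


def authorize(user: str, team: str, ws: Workspace, action: str, log: AuditLog) -> bool:
    """Allow the action only if the team holds that grant; log every attempt."""
    allowed = action in ws.grants.get(team, set())
    log.write(user, team, ws.name, action, allowed)
    return allowed


if __name__ == "__main__":
    log = AuditLog()
    prod = Workspace("prod-network", grants={"platform": {"plan", "apply"}, "contractors": {"plan"}})
    print(authorize("bob", "platform", prod, "apply", log))
    print(authorize("temp-dev", "contractors", prod, "apply", log))
    print(len(log.records))
```

The design point is that denied attempts are recorded as well as allowed ones, so an auditor can reconstruct who tried to do what in each workspace, including after a person has left the organization.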

Example Two

Bad Example – Joe uses a third-party platform to provision infrastructure. The communication path goes from the version control system (VCS) to the CSP. PKI secrets are used for transport layer security (TLS) but are not centrally managed. The Terraform modules cannot be kept private because they are stored in an unsupported VCS, which the third-party provider argues is “good enough” for the organization. The provider cannot offer support for Terraform or a road map to help tie the pieces together, and the project is left vulnerable by the lack of a transparent security model.

One of the modules was supposed to be private but was exposed because of secrets leakage, the lack of supported features for private module registries, and the lack of secure communications between the Terraform clients and the Terraform server connected to the CSP. Joe tries but cannot get Terraform support without a license. Unfortunately, the project is set back a year by the time it takes to procure the Terraform license and get started correctly.

Suitable Example – Terraform clients connect to the private VCS instance over secure TLS, which preserves data confidentiality. Jane follows a well-defined threat model. She ensures that confidential information, such as API keys for communications and Terraform state, is protected, along with overall data security. The team stays on time and on budget because it planned and budgeted for the appropriate license up front.

Example Three

Bad Example – Sally runs all Terraform operations on her laptop. There is no isolation between Sally’s file share on the computer and the file share on the shared drive. Sally has around 130 workspaces saved on the shared drive. These must be manually transferred from Bitbucket to the shared drive weekly.

Suitable Example – Jane ensures that each Terraform operation (plan and apply) happens in an ephemeral environment that is created immediately before each run step and destroyed after completion, reducing the potential for unwanted exposure. The build environment isolates Terraform executions and Terraform commercial tenants from one another. She can quickly scale to enterprise standards, does not have to carry discs around to transfer secrets from Bitbucket to the shared drive, and does not widen the attack surface by leaving systems without clear security boundaries.
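The "create before the run, destroy after" pattern can be sketched with the Python standard library. This is a rough illustration that assumes directory-level isolation only; a real platform would isolate runs far more strongly (for example with disposable containers or virtual machines). The function name `run_in_ephemeral_dir` and the placeholder command are hypothetical, and the demo deliberately runs a harmless Python one-liner rather than Terraform itself.

```python
from __future__ import annotations

import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_ephemeral_dir(command: list[str], files: dict[str, str]) -> tuple[int, str, str]:
    """Run `command` in a fresh scratch directory that is deleted afterwards.

    Returns (exit code, captured stdout, path of the directory that was used).
    """
    with tempfile.TemporaryDirectory(prefix="run-") as workdir:
        # Materialize only the inputs this run needs; nothing is shared.
        for name, content in files.items():
            (Path(workdir) / name).write_text(content)
        result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
        # Leaving the `with` block removes the directory and everything in it.
        return result.returncode, result.stdout, workdir


if __name__ == "__main__":
    # Placeholder step: list the working directory instead of running a plan.
    list_files = [sys.executable, "-c", "import os; print(sorted(os.listdir('.')))"]
    code, out, used_path = run_in_ephemeral_dir(list_files, {"main.tf": "# placeholder\n"})
    print(code, out.strip())
    print("still exists:", Path(used_path).exists())
```

Nothing from one run (configuration, cached credentials, plan files) is left behind for the next, which is the opposite of Sally's long-lived laptop and shared drive.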

Example Four

Bad Example – Alice is an engineer supporting a military organization. She plans to provision all infrastructure for deployment with Terraform OSS. She has 16 nodes on different sites that depend on zero-downtime deployments. At the recommendation of the military unit’s contractors, Alice builds complex scaffolding, intending to ensure there is no outage. Unfortunately, the infrastructure state was created and stored in a way that left her unable to roll back or recover any data, and the incident ended up in the news after a significant outage at the base.

Suitable Example – Alice has spread Terraform Enterprise across multiple availability zones, including on-site deployments and a few landing zones in AWS and Azure. She is confident that the SLA will be met and that there will not be another isolated outage.

Example Five

Bad Example – In a high-threat environment, many people are allowed to make changes in the UI, with no tracking or attestation of who made those changes in either Terraform commercial or the VCS of choice.

Suitable Example – In a high-threat environment, provide read-only access to the Terraform commercial UI and use a VCS such as GitHub, GitLab, or Bitbucket as the “source of truth” to limit exposure while keeping your existing workflow. All the data models and resources involved, such as users, teams, and run-time tasks, are available in the TFE Provider (fully supported by HashiCorp). Use the provider to codify your Terraform Cloud or Enterprise configuration and keep your existing VCS workflow.

Some other key points to consider that are offered in the Terraform commercial security model are:

1.    Enforce Strong Authentication and MFA

2.    Minimize the Number of Users in the Owners Team

3.    Apply the Principle of Least Privilege to Workspace and Project Membership

4.    Protect API Keys and make them ephemeral with Vault

5.    Control Access to Source Code

6.    Restrict Access to Workspace State

7.    Use Separate Agent Pools for Sensitive Workspaces
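Point 4 above, making API keys ephemeral, comes down to credentials that carry their own expiry. The sketch below shows only that idea using the Python standard library; it is not Vault's API, and the `Credential`, `issue`, and `is_valid` names are hypothetical. The `now` parameter exists so the expiry logic can be exercised without waiting.

```python
from __future__ import annotations

import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    token: str
    expires_at: float  # seconds on the same clock passed to issue()/is_valid()


def issue(ttl_seconds: float, now: float | None = None) -> Credential:
    """Mint a random token that is only valid for `ttl_seconds`."""
    now = time.monotonic() if now is None else now
    return Credential(token=secrets.token_urlsafe(32), expires_at=now + ttl_seconds)


def is_valid(cred: Credential, now: float | None = None) -> bool:
    """A credential is valid strictly before its expiry time."""
    now = time.monotonic() if now is None else now
    return now < cred.expires_at


if __name__ == "__main__":
    cred = issue(ttl_seconds=60, now=100.0)
    print(is_valid(cred, now=130.0))  # inside the 60-second window
    print(is_valid(cred, now=161.0))  # after expiry
```

A leaked short-lived token is only useful until it expires, which limits the damage of the kind of secrets leakage described in Example Two.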

Policy as Code supports the security model but is not directly part of it. All of the breaches described in the examples above could have been prevented with Policy as Code. Cloud misconfigurations account for 82% of breaches. In highly regulated industries, compliance slows developers down, causing tension with security teams. Policy as Code speeds developers up dramatically by removing slow processes from the delivery path.

IaC as a Core Security Function

You can adopt this security model by moving from Terraform OSS and DIY (“do-it-yourself”) to Terraform Enterprise or Cloud. The threat landscape continues evolving, as does the road map of the Terraform commercial offerings. Identity replaces the network perimeter as the new security control point. Stopping secret sprawl starts with IaC and your version control system of choice. You can start this today with ephemeral just-in-time (JIT) credentials. Terraform Enterprise works no matter which VCS you select, including GitHub, GitLab, Bitbucket, and many more. Leveraging Terraform Enterprise future-proofs your organization in delivering IT infrastructure.

Common Sense

Remember, most CIOs may have only a limited view of what is going on in their respective regulated industry or government entity. Ask your developers or CISO about the threat model and how they secure IaC. The following Venn diagram depicts what “The New Kingmakers” describes as the new reality.

Your developers either adopt a platform approach or attempt to DIY their way to a complex mess; this cannot be a choice in a regulated industry or government.

Please reach out; we are happy to help with the adoption, training, and education of secure immutable IaC in government and regulated industries.

About HashiCorp

HashiCorp is the leader in multi-cloud infrastructure automation software. The HashiCorp software suite enables organizations to adopt consistent workflows to provision, secure, connect, and run any infrastructure for any application. HashiCorp open source tools Vagrant, Packer, Terraform, Vault, Consul, and Nomad are downloaded tens of millions of times each year and are broadly adopted by the Global 2000. Enterprise versions of these products enhance the open source tools with features that promote collaboration, operations, governance, and multi-data center functionality. The company is headquartered in San Francisco and backed by Mayfield, GGV Capital, Redpoint Ventures, True Ventures, IVP, and Bessemer Venture Partners. For more information, visit www.hashicorp.com or follow HashiCorp on Twitter @HashiCorp.

About IC Insiders

IC Insiders is a special sponsored feature that provides deep-dive analysis, interviews with IC leaders, perspective from industry experts, and more. Learn how your company can become an IC Insider.