Add a new Kubernetes cluster#

This guide will walk through the process of adding a new cluster to our terraform configuration.

You can find out more about Terraform on our Terraform page and in the official Terraform documentation.


Currently, we do not deploy clusters to AWS using terraform. Please see Create a new cluster for AWS-specific deployment guidelines.

Cluster Design#

This guide will assume you have already followed the guidance in Cluster design considerations to select the appropriate infrastructure.

Create a Terraform variables file for the cluster#

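The first step is to create a .tfvars file in the projects subdirectory of the appropriate terraform directory, for example terraform/gcp/projects for Google Cloud or terraform/azure/projects for Azure.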

Give it a descriptive name that at a glance provides context to the location and/or purpose of the cluster.

For a Google Cloud cluster, the minimum inputs this file requires are:

  • prefix: Prefix for all objects created by terraform. Primary identifier to ‘group’ together resources.

  • project_id: GCP Project ID to create resources in. Should be the id, rather than display name of the project.

  • regional_cluster: Set to true to provision a GKE Regional Highly Available cluster. This costs about $70 a month, but is worth it for the added reliability in most cases, except when cost saving is an absolute requirement. Defaults to true.

  • zone: Zone where cluster nodes and filestore for home directory are created.

  • region: Region where cluster master (if regional_cluster is true) is run, as well as any storage buckets created with user_buckets.

See the variables file for other inputs this file can take and their descriptions.

Example .tfvars file for Google Cloud:

prefix           = "my-awesome-project"
project_id       = "my-awesome-project-id"
zone             = "us-central1-c"
region           = "us-central1"
regional_cluster = true
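
As a quick sanity check on the zone and region inputs, the sketch below tests that a zone name is the region name plus a one-letter suffix, which is how GCP names its zones. The function name zone_in_region is hypothetical and not part of the infrastructure repo. It also assumes you want the zone to sit inside the chosen region.

```shell
# Hypothetical helper (not in the infrastructure repo): succeeds when a GCP
# zone belongs to the given region, i.e. the zone is "<region>-<letter>".
zone_in_region() {
  zone="$1"
  region="$2"
  case "$zone" in
    "$region"-[a-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Values taken from the example .tfvars file above.
if zone_in_region "us-central1-c" "us-central1"; then
  echo "zone and region are consistent"
fi
```

Running a check like this before terraform plan can catch a copy-paste mismatch between the two values early.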

For an Azure cluster, the minimum inputs this file requires are:

  • subscription_id: Azure subscription ID to create resources in. Should be the id, rather than the display name of the subscription.

  • resourcegroup_name: The name of the Resource Group to be created by terraform, where the cluster and other resources will be deployed into.

  • global_container_registry_name: The name of an Azure Container Registry to be created by terraform to use for our image. This must be unique across all of Azure. You can use the following Azure CLI command to check your desired name is available:

    az acr check-name --name ACR_NAME --output table

  • global_storage_account_name: The name of a storage account to be created by terraform to use for Azure File Storage. This must be unique across all of Azure. You can use the following Azure CLI command to check your desired name is available:

    az storage account check-name --name STORAGE_ACCOUNT_NAME --output table

  • ssh_pub_key: The public half of an SSH key that will be authorised to log in to nodes.

See the variables file for other inputs this file can take and their descriptions.

Naming Convention Guidelines for Container Registries and Storage Accounts

Names for Azure container registries and storage accounts must conform to the following guidelines:

  • alphanumeric strings between 5 and 50 characters for container registries, e.g., myContainerRegistry007

  • strings of lowercase letters and numbers between 3 and 24 characters for storage accounts, e.g., mystorageaccount314


A failure will occur if you try to create a storage account whose name is not entirely lowercase.

We recommend the following conventions using lowercase:

  • {CLUSTER_NAME}hubregistry for container registries

  • {CLUSTER_NAME}hubstorage for storage accounts
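
The naming rules and the recommended convention can be checked locally before running terraform. The sketch below is illustrative only: the function names and the CLUSTER_NAME value are hypothetical, and the patterns encode Azure's documented limits (5 to 50 alphanumeric characters for registries, 3 to 24 lowercase letters and digits for storage accounts). It does not check global uniqueness, which still needs the az check-name commands.

```shell
# Hypothetical helpers, not part of the infrastructure repo.

# Container registry names: 5-50 alphanumeric characters.
valid_registry_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9]{5,50}$'
}

# Storage account names: 3-24 lowercase letters and digits.
valid_storage_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

CLUSTER_NAME="myawesome"   # placeholder cluster name, an assumption
registry="${CLUSTER_NAME}hubregistry"
storage="${CLUSTER_NAME}hubstorage"

valid_registry_name "$registry" && echo "registry name ok: $registry"
valid_storage_name "$storage" && echo "storage name ok: $storage"
```

Because the storage suffix hubstorage already uses 10 of the 24 available characters, this kind of check mostly guards against long or mixed-case cluster names.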


Changes in Azure’s own requirements might break our recommended convention. If any such failure occurs, please report it.

Following the recommended convention makes it less likely that we take up a name that the Hub Community may need, for example, in cases where we are deploying to Azure subscriptions not owned/managed by 2i2c.

Example .tfvars file for Azure:

subscription_id                = "my-awesome-subscription-id"
resourcegroup_name             = "my-awesome-resource-group"
global_container_registry_name = "myawesomehubregistry"
global_storage_account_name    = "myawesomestorageaccount"
ssh_pub_key                    = "ssh-rsa my-public-ssh-key"

Once you have created this file, open a Pull Request to the infrastructure repo for review. See our review and merge guidelines for how this process should proceed.

Initialising Terraform#

Our default terraform state is stored centrally in our two-eye-two-see-org GCP project, so you must authenticate gcloud with your account before initialising terraform. This terraform state covers all cloud providers, not just GCP.

gcloud auth application-default login

Then you can change into the terraform subdirectory for the appropriate cloud provider and initialise terraform.

# Google Cloud
cd terraform/gcp
terraform init -backend-config=backends/default-backend.hcl -reconfigure

# Azure
cd terraform/azure
terraform init


Other backend config files are stored in terraform/backends. Each one configures a different storage bucket for reading and writing the remote terraform state, for projects that we cannot access from GCP with our email accounts. Because these storage buckets live within the project we are deploying to, this saves us the pain of handling multiple authentications.

For example, to work with Pangeo you would initialise terraform like so:

terraform init -backend-config=backends/pangeo-backend.hcl -reconfigure

Creating a new terraform workspace#

We use terraform workspaces so that the state of one .tfvars file does not influence another. Create a new workspace with the command below, giving it the same name as the .tfvars file.

terraform workspace new WORKSPACE_NAME


Workspaces are defined per backend. If you can’t find the workspace you’re looking for, double check you’ve enabled the correct backend.

Plan and Apply Changes#


When deploying to Google Cloud, make sure the Compute Engine, Kubernetes Engine, and Artifact Registry APIs are enabled on the project before deploying!

First, make sure you are in the new workspace that you just created.

terraform workspace show
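
To make this check harder to skip, it can be wrapped in a small guard. The sketch below is one possible approach, not an existing script in the repo: the function name require_workspace is hypothetical, and it assumes the workspace is named after the .tfvars file without its extension.

```shell
# Hypothetical guard (not in the infrastructure repo): fails unless the
# active workspace matches the name derived from the .tfvars file.
# Assumption: the workspace name is the .tfvars file name minus ".tfvars".
require_workspace() {
  tfvars="$1"
  current="$2"
  expected="$(basename "$tfvars" .tfvars)"
  if [ "$current" != "$expected" ]; then
    echo "workspace '$current' does not match '$expected'" >&2
    return 1
  fi
}

# Example with literal values. In practice the second argument would be
# "$(terraform workspace show)".
require_workspace projects/my-cluster.tfvars my-cluster && echo "workspace ok"
```

Chaining the guard before terraform plan or terraform apply with && stops the command from running against the wrong workspace.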

Plan your changes with the terraform plan command, passing the .tfvars file as a variable file.

terraform plan -var-file=projects/CLUSTER.tfvars

Check over the output of this command to ensure nothing is being created/deleted that you didn’t expect. Copy-paste the plan into your open Pull Request so a fellow 2i2c engineer can double check it too.

If you’re both satisfied with the plan, merge the Pull Request and apply the changes to deploy the cluster.

terraform apply -var-file=projects/CLUSTER.tfvars

Congratulations, you’ve just deployed a new cluster!

Exporting and Encrypting the Cluster Access Credentials#

To begin deploying and operating hubs on your new cluster, we need to export the credentials created by terraform, encrypt them using sops, and store them in the secrets directory of the infrastructure repo.

Check you are still in the correct terraform workspace:

terraform workspace show

If you need to change workspace, you can do so as follows:

terraform workspace list  # List all available workspaces
terraform workspace select WORKSPACE_NAME

Then, output the credentials created by terraform to a file under the secrets directory.

# Google Cloud
terraform output -raw ci_deployer_key > ../../config/clusters/<cluster_name>/deployer-credentials.secret.json

# Azure
terraform output -raw kubeconfig > ../../config/clusters/<cluster_name>/deployer-credentials.secret.yaml

Then encrypt the key using sops, replacing {{ json | yaml }} in the command below with the extension of the file you just created.


You must be logged into Google with your account at this point so sops can read the encryption key from the two-eye-two-see project.

cd ../..
sops --output config/clusters/<cluster_name>/enc-deployer-credentials.secret.{{ json | yaml }} --encrypt config/clusters/<cluster_name>/deployer-credentials.secret.{{ json | yaml }}
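
The {{ json | yaml }} placeholder can be avoided by deriving the encrypted file's path from the plaintext one. The sketch below shows one way to do that. The function name encrypted_path_for and the my-cluster directory are hypothetical, and the sops invocation is left as a comment because it needs sops and access to the encryption key.

```shell
# Hypothetical helper (not in the infrastructure repo): given the path of a
# plaintext credentials file, print the path of its encrypted counterpart,
# which has the same directory and an "enc-" prefix on the file name.
encrypted_path_for() {
  dir="$(dirname "$1")"
  file="$(basename "$1")"
  printf '%s/enc-%s\n' "$dir" "$file"
}

# "my-cluster" is a placeholder for <cluster_name>.
plain="config/clusters/my-cluster/deployer-credentials.secret.json"
encrypted_path_for "$plain"

# Usage, from the root of the infrastructure repo:
#   sops --output "$(encrypted_path_for "$plain")" --encrypt "$plain"
```

The same helper works unchanged for the .yaml file, since it only prefixes the file name and keeps the extension.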

This key can now be committed to the infrastructure repo and used to deploy and manage hubs hosted on that cluster.

Adding the new cluster to CI/CD#

To ensure the new cluster is appropriately handled by our CI/CD system, please add it as an entry in the following places: