GCP: Using Terraform to Deploy a Private GKE Cluster

Sumit K
Google Cloud - Community
10 min read · Mar 7, 2023


One of the most important decisions when creating a GKE cluster is whether it will be public or private. Nodes in a public cluster are assigned both public and private IP addresses and can be reached from anywhere on the internet. Nodes in a private cluster, on the other hand, are assigned only private IP addresses, which isolates them from inbound and outbound internet traffic: clients cannot reach the nodes directly, and the nodes can only connect to the internet through Cloud NAT. This provides better security.

Secure GKE Private Cluster

Today, we are going to learn about and deploy a GKE private cluster. In GKE, private clusters are clusters whose nodes are isolated from inbound and outbound traffic because they are assigned internal IP addresses only. Private clusters in GKE can expose the control plane endpoint either as a publicly accessible address or as a private address.

Please keep in mind that nodes in private clusters get private IPs only. This means they are isolated from inbound and outbound internet communication until you configure Cloud NAT. The NAT service allows nodes in the private network to reach the internet, enabling them to download the required images from Docker Hub or another public registry; go for a private registry instead if you restrict both incoming and outgoing traffic. I hope this is clear. One more point to remember is that private clusters can have private endpoints as well as public endpoints.

  1. If you choose a public endpoint, you can manage your GKE cluster from outside of GCP; however, you can whitelist IP ranges in authorized networks to limit who can manage the cluster. This allows you to manage the cluster only from certain locations on the internet.
  2. If you decide to disable the public endpoint entirely, you won’t be able to manage the cluster from outside of GCP, but you can manage it from GCP tools or from a bastion host whose internal IP range you authorize. This approach is definitely the most restrictive and secure option and is widely adopted by organizations (the gcloud sketch below shows the equivalent flags for reference).
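
This post uses Terraform, but for reference, roughly the same fully private setup can be expressed with gcloud. The command below is only a sketch using this demo’s names and CIDRs, not something you need to run:

# Sketch: a private cluster with a private-only endpoint and one authorized network
gcloud container clusters create my-gke-cluster \
  --zone asia-south2-a \
  --enable-ip-alias \
  --enable-private-nodes \
  --enable-private-endpoint \
  --master-ipv4-cidr 10.13.0.0/28 \
  --enable-master-authorized-networks \
  --master-authorized-networks 10.0.0.7/32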

Kubernetes (K8s) is one of the best container orchestration platforms today, but it is also complex. Because Kubernetes aims to be a general solution to specific problems, it has many moving parts, and each of these parts must be understood and configured to meet your unique requirements. K8s brings great value to the business, but sometimes at great cost in time, money, and developer frustration. Kubernetes really shines when your application consists of multiple services running in different containers, but being powerful does not mean it is the right choice for every team and every app. I am not going much deeper into it here; you just need a basic understanding of containerization and Kubernetes. Alright! We have covered enough theory, so let’s get straight to the demo.

In this demo, you will create the following resources:

  • A network named vpc1.
  • A Subnetwork named subnet1.
  • A private cluster named my-gke-cluster with private nodes and no client access to the public endpoint.
  • A managed node pool with 3 nodes.
  • A Linux jump host named jump-host with an internal IP only; no public IP is attached. This machine will be accessible over its internal IPv4 address using IAP.
  • A Cloud NAT gateway named nat-config.
  • IAP SSH permission.
  • A firewall rule to allow access to the jump host via IAP.

To provide outbound internet access for your private nodes, for example to pull images from an external registry, create a Cloud Router and configure Cloud NAT on it. Cloud NAT lets private clusters establish outbound connections over the internet to send and receive packets.

I added the jump host’s internal IP in the private cluster’s master authorized network. This enabled secure connectivity only from the jump host.

Pre-requisite:

  1. A GCP account with one project.
  2. A service account. Make sure the SA has the appropriate permissions; for this demo you can give it the Owner role (see the authentication sketch after this list).
  3. gcloud CLI.
  4. Terraform.
  5. Basic understanding of Terraform, Kubernetes, and GKE.
  6. Basic understanding of building and deploying containers.
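
If you want gcloud to use the same service account key that Terraform uses, a minimal setup could look like this (the key file name and project ID below are the ones from this demo’s provider.tf; replace them with your own):

# Authenticate gcloud with the service account key used by Terraform
gcloud auth activate-service-account --key-file=tcb-project-371706-b114ce01e529.json

# Point gcloud at the demo project and set default region/zone
gcloud config set project tcb-project-371706
gcloud config set compute/region asia-south2
gcloud config set compute/zone asia-south2-a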

Now you’re ready to get started:

Here is the Terraform configuration code that will be used to deploy our entire setup. Copy and paste this configuration and apply it.

Step1. First, create a folder for all of your Terraform source code files. Let’s call it “gke-demo”

mkdir gke-demo
cd gke-demo

Step2. Create a new file for the configuration block.

$ touch provider.tf
$ touch main.tf

Step3. Paste the configuration below into provider.tf and save it.


terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.8.0"
    }
  }
}

provider "google" {
  region      = "asia-south2"
  project     = "tcb-project-371706"
  credentials = file("tcb-project-371706-b114ce01e529.json")
  zone        = "asia-south2-a"
}

Step4. Paste the configuration below into main.tf and save it.

# Create VPC
resource "google_compute_network" "vpc" {
  name                    = "vpc1"
  auto_create_subnetworks = false
}

# Create Subnet
resource "google_compute_subnetwork" "subnet" {
  name          = "subnet1"
  region        = "asia-south2"
  network       = google_compute_network.vpc.name
  ip_cidr_range = "10.0.0.0/24"
}

# # Create Service Account
# resource "google_service_account" "mysa" {
#   account_id   = "mysa"
#   display_name = "Service Account for GKE nodes"
# }


# Create GKE cluster with private nodes in our custom VPC/subnet
resource "google_container_cluster" "primary" {
  name                     = "my-gke-cluster"
  location                 = "asia-south2-a"
  network                  = google_compute_network.vpc.name
  subnetwork               = google_compute_subnetwork.subnet.name
  remove_default_node_pool = true # create the smallest possible default node pool and immediately delete it
  # networking_mode        = "VPC_NATIVE"
  initial_node_count       = 1

  private_cluster_config {
    enable_private_endpoint = true
    enable_private_nodes    = true
    master_ipv4_cidr_block  = "10.13.0.0/28"
  }

  ip_allocation_policy {
    cluster_ipv4_cidr_block  = "10.11.0.0/21"
    services_ipv4_cidr_block = "10.12.0.0/21"
  }

  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "10.0.0.7/32" # internal IP of the jump host
      display_name = "net1"
    }
  }
}

# Create managed node pool
resource "google_container_node_pool" "primary_nodes" {
  name       = google_container_cluster.primary.name
  location   = "asia-south2-a"
  cluster    = google_container_cluster.primary.name
  node_count = 3

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels = {
      env = "dev"
    }

    machine_type = "n1-standard-1"
    preemptible  = true
    # service_account = google_service_account.mysa.email

    metadata = {
      disable-legacy-endpoints = "true"
    }
  }
}



## Create jump host. We will allow this jump host to access the GKE cluster;
## its internal IP is already authorized in the cluster's master authorized networks.

resource "google_compute_address" "my_internal_ip_addr" {
  project      = "tcb-project-371706"
  address_type = "INTERNAL"
  region       = "asia-south2"
  subnetwork   = "subnet1"
  name         = "my-ip"
  address      = "10.0.0.7"
  description  = "An internal IP address for my jump host"
}

resource "google_compute_instance" "default" {
  project      = "tcb-project-371706"
  zone         = "asia-south2-a"
  name         = "jump-host"
  machine_type = "e2-medium"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network    = "vpc1"
    subnetwork = "subnet1" # Replace with a reference or self link to your subnet, in quotes
    network_ip = google_compute_address.my_internal_ip_addr.address
  }
}


## Create firewall rule to allow SSH to the jump host via IAP

resource "google_compute_firewall" "rules" {
  project = "tcb-project-371706"
  name    = "allow-ssh"
  network = "vpc1" # Replace with a reference or self link to your network, in quotes

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  # IAP TCP forwarding source range
  source_ranges = ["35.235.240.0/20"]
}



## Create IAP SSH permissions for your test instance

resource "google_project_iam_member" "project" {
  project = "tcb-project-371706"
  role    = "roles/iap.tunnelResourceAccessor"
  member  = "serviceAccount:terraform-demo-aft@tcb-project-371706.iam.gserviceaccount.com"
}

# Create Cloud Router for the NAT gateway
resource "google_compute_router" "router" {
  project = "tcb-project-371706"
  name    = "nat-router"
  network = "vpc1"
  region  = "asia-south2"
}

## Create NAT gateway with the cloud-nat module

module "cloud-nat" {
  source     = "terraform-google-modules/cloud-nat/google"
  version    = "~> 1.2"
  project_id = "tcb-project-371706"
  region     = "asia-south2"
  router     = google_compute_router.router.name
  name       = "nat-config"
}


############ Outputs ############
output "kubernetes_cluster_host" {
  value       = google_container_cluster.primary.endpoint
  description = "GKE Cluster Host"
}

output "kubernetes_cluster_name" {
  value       = google_container_cluster.primary.name
  description = "GKE Cluster Name"
}

Step5. Run terraform init, fmt, validate, plan and apply.

terraform init
terraform fmt
terraform validate
terraform plan
terraform apply

Step6. Let me apply this and see the outputs. You have to be a little patient because it takes 15–20 minutes to create the GKE cluster. The delay is mostly due to deleting the default node pool and then creating your managed node pool.

If everything goes as planned, the cluster should come up with all the workloads and services in a healthy state. For me, it took about 20 minutes to finish the deployment, and as you can see, the resources have been created by Terraform. The same can be verified in the UI. Let’s check…

terraform state list
google_compute_address.my_internal_ip_addr
google_compute_firewall.rules
google_compute_instance.default
google_compute_network.vpc
google_compute_router.router
google_compute_subnetwork.subnet
google_container_cluster.primary
google_container_node_pool.primary_nodes
google_project_iam_member.project
module.cloud-nat.google_compute_router_nat.main
module.cloud-nat.random_string.name_suffix
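
You can also read the cluster name and endpoint back from the Terraform outputs defined in main.tf, for example:

terraform output kubernetes_cluster_name
terraform output kubernetes_cluster_host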

Google Cloud Console screenshots:

VPC/Subnet with primary and secondary IP range
GKE Cluster status is Green and ready to use.
GKE worker nodes (private IP) and jump host (private IP). Everything is private; we are very much secure :)
Public endpoint is disabled and the Private endpoint is internal
IP range for Control Plane, PODs, Service and Authorized networks
Cloud NAT

Now the question is: how do you access/connect to your private cluster?

Since this cluster’s public endpoint is disabled, we cannot access it from outside of GCP, i.e. over the internet. So how do we access it?

Remember that we deployed a jump server in the same network with no public IP attached, yet you can still SSH to it over the internet with IAP. As I said before, this machine is already on the allowed list of the cluster’s authorized networks. You can find this in the code as well, where I specifically added its internal IP address, which means this machine is allowed to connect to your GKE cluster privately within the GCP network. Let’s see how it works:

Step1. Download IAP Desktop and install it on your laptop/system.

Step2. Launch IAP Desktop and sign in with your GCP account. Once logged in, you will see your jump host and three worker nodes.

Step3. Right-click on jump-host and connect to it. You should be in.
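
If you prefer the command line over IAP Desktop, the same IAP tunnel can be opened with gcloud. This is a sketch that assumes your account (or the service account) has the IAP permission granted above:

gcloud compute ssh jump-host --project tcb-project-371706 --zone asia-south2-a --tunnel-through-iap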

This is a fresh machine that comes with the gcloud CLI already installed. You can use it to perform many common platform tasks from the command line or through scripts and other automation. To access your cluster from this machine, you will first need to authenticate with your Google account and then install kubectl.

kubectl is a command-line tool that allows you to run commands against Kubernetes clusters.

Step4. Let’s authenticate first with “gcloud auth login”

Copy the link it prints into a browser and enter the authorization code back in the terminal. Once done, you should be in.

Step5. Install kubectl. I already installed it; click here to follow the installation steps. Once it is installed, let’s connect to your cluster. Copy the command-line access command from the UI.
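
The command you copy from the console should look roughly like the get-credentials call below (shown here with this demo’s cluster name, zone, and project); it writes the kubeconfig entry that kubectl uses:

gcloud container clusters get-credentials my-gke-cluster --zone asia-south2-a --project tcb-project-371706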

Please note: if you get a warning similar to the screenshot below, run the following command to fix it.

sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin

warning for gcloud-auth-plugin

Step6. You are now ready to use kubectl to interact with the cluster. You can manage your entire cluster and deploy your resources.
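
A quick sanity check from the jump host might look like this:

kubectl cluster-info
kubectl get nodes -o wide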

Cluster is accessible from the jump host

Now that the cluster is accessible, it’s time to deploy some containers and test them. Let’s quickly deploy some pods and expose the service through an external load balancer.

Step1. Copy the YAML configuration below and paste it into web1.yaml.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.14.2
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: lb-nginx-service
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP

Step2. Apply the manifest file with “kubectl apply -f web1.yaml”.

As you can see, the deployment has been created and exposed through an external load balancer. Now go ahead, copy the public IP of the load balancer, and open it in a browser. You will see the Nginx welcome page.
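
To find the load balancer’s public IP from the jump host, you can list the deployment and the service and wait for the EXTERNAL-IP column to be populated (the IP will be whatever GCP assigns):

kubectl get deployment nginx-deployment
kubectl get service lb-nginx-service
# once EXTERNAL-IP is populated, open http://<EXTERNAL-IP> in a browser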

Congratulations!! You have completed this demo and created a private Kubernetes cluster. Be aware that for a private cluster to reach other Google APIs, you will need to make sure Private Google Access is enabled. I will cover this in another blog. I hope you like this article.

If you enjoyed this article, please clap n number of times and share it! Feel free to comment with any suggestions. Thanks for Reading!!
