Posted on Jul 18, 2022 • Originally published at loft.sh

Development Environments with vcluster

By Antonio Berben

At Solo.io, we listen to the community and try out the best technologies to help teams meet their goals. This includes working on open source projects as well as providing support and products that help you make better use of those technologies.

Gloo Mesh is one of those products. It is a good example of how to keep the complexity of managing application networking across your entire infrastructure to a minimum. Covering an entire infrastructure naturally implies multi-cluster architectures.

In such scenarios, how can you verify that a multi-cluster configuration is correct in a local environment before moving to a more extensive environment?

Let’s put it in context. Your team (or the development team) wants to release a new feature. They want to cause some chaos in the system. Gloo Mesh offers this functionality and many others through policies (failover, fault injection, outlier detection, retries, timeouts, mirroring, rate limiting, and more). But you, as an operator of the platform and Gloo Mesh, may not be sure which configuration is correct. You need to investigate first in a development or testing environment.

In a simulated production scenario that uses three clusters (one for management and two for workloads), the first concern is obvious: Cost. Deploying three clusters in public clouds is expensive.

The second concern: networking. Let’s say you decide to investigate first in your local environment. Deploying three entire clusters on your own workstation is not easy. You can opt for solutions like multiple kind (Kubernetes in Docker) or k3d clusters. Both deploy clusters in containers on top of the host machine. One cluster, one container. If you try one of these approaches, you will probably have to tweak the network between the containers and the host machine.

The third concern: CPU. To deploy things in your own local environment, you need to make sure you have enough “muscle”.

Now… What if we start considering “A cluster within a cluster”?

vcluster

I hope you saw the iconic movie Inception. I enjoyed it a lot and I watch it again from time to time. The idea was pretty catchy: “A dream within a dream”.

Virtualization technology follows the same idea. If you are familiar with Docker, years ago there was the need for docker-in-docker. Nowadays it is a very common approach in CI/CD pipelines. Say for example that tasks are running in a container but you need to test an application already embedded in another container. This would be a use case of docker-in-docker.

Given that idea, what stops us from trying cluster-in-cluster? This is where vcluster comes in to offer some benefits. vcluster allows you to create and manage virtual Kubernetes clusters. A virtual cluster is basically a control plane that runs in a namespace on a shared host cluster. Here is a visualization:

In the picture we can see that Gloo Mesh, which before required three clusters to simulate a production-ready environment, now just needs one cluster with three virtual clusters.

Quick benefits:

Cost effective: Now, your cost is only one cluster. It is true that it needs to be bigger than before, but you’re saving money by deploying one cluster instead of three.
Time-saving: when you work in your local environment, you do not want to spend time creating new clusters. If you use kind, it can take several minutes to get three new clusters. With vcluster, you can get your three new clusters in about 20 seconds.

Let’s prove all this in a workshop.

Hands on!

In this workshop, in a matter of seconds, you will deploy Istio in the two workload clusters, a demo application to use in your labs, and Gloo Mesh to test the application networking capabilities (multi-cluster traffic, traffic splitting, fault injection, etc.). All this is based on just one host Kubernetes cluster containing three virtual clusters.

Your architecture will look like this:

Prerequisites

A Kubernetes cluster which will be the host cluster (kind, k3s, k0s, etc.)
vcluster CLI. This has been tested with version 0.10.2
Helm v3
kubectl
meshctl

Getting Started

Let’s check on how long it takes you to deploy everything. The test was made using a virtual machine with only three CPUs. Therefore, you will also deploy components with minimum resources.

You start with setting up some environment variables:

# Context name for the host cluster
export MAIN_CONTEXT=$(kubectl config current-context)

# Context names for the gloo mesh clusters (vclusters)
export MGMT_CLUSTER=devmgmt
export CLUSTER_1=devcluster1
export CLUSTER_2=devcluster2

Install environments

First, let’s create the management cluster:

cat << EOF > vcluster-values.yaml
isolation:
  enabled: false
  limitRange:
    enabled: false
  podSecurityStandard: privileged
  resourceQuota:
    enabled: false
rbac:
  clusterRole:
    create: true
syncer:
  resources:
    limits:
      cpu: 100m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
  extraArgs:
    - --fake-nodes=false
    - --sync-all-nodes
vcluster:
  resources:
    limits:
      cpu: 200m
      memory: 2Gi
    requests:
      cpu: 100m
      memory: 256Mi
  extraArgs:
    - --kubelet-arg=allowed-unsafe-sysctls=net.ipv4.*
    - --kube-apiserver-arg=feature-gates=EphemeralContainers=true
    - --kube-scheduler-arg=feature-gates=EphemeralContainers=true
    - --kubelet-arg=feature-gates=EphemeralContainers=true
  image: rancher/k3s:v1.22.5-k3s1
EOF

vcluster create $MGMT_CLUSTER -n $MGMT_CLUSTER --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $MGMT_CLUSTER -n $MGMT_CLUSTER --kube-config-context-name $MGMT_CLUSTER --update-current --context $MAIN_CONTEXT
kubectl --context $MGMT_CLUSTER get namespaces

Next, the workload cluster 1:

vcluster create $CLUSTER_1 -n $CLUSTER_1 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $CLUSTER_1 -n $CLUSTER_1 --kube-config-context-name $CLUSTER_1 --update-current --context $MAIN_CONTEXT
kubectl --context $CLUSTER_1 get namespaces

And finally, the workload cluster 2:

vcluster create $CLUSTER_2 -n $CLUSTER_2 --upgrade --connect=false --expose -f vcluster-values.yaml --context $MAIN_CONTEXT
vcluster connect $CLUSTER_2 -n $CLUSTER_2 --kube-config-context-name $CLUSTER_2 --update-current --context $MAIN_CONTEXT
kubectl --context $CLUSTER_2 get namespaces

That's it! Three clusters in around 20 seconds. If you want to know more, the end of this post has a more in-depth explanation of what you have deployed with vcluster, along with some tips to remember.
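The three create/connect sequences above differ only in the cluster name, so they could also be wrapped in a small function and run in a loop. The following is a minimal sketch, not part of the original workshop: `create_vcluster`, `run` and `DRY_RUN` are names made up for this example, and the vcluster flags are exactly the ones used above.

```shell
#!/usr/bin/env bash
# Sketch only: create_vcluster, run and DRY_RUN are illustrative names,
# not part of the vcluster CLI.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create one vcluster in a namespace of the same name and add a
# kubeconfig context (also of the same name) that points to it.
create_vcluster() {
  local name="$1"
  run vcluster create "$name" -n "$name" --upgrade --connect=false --expose \
    -f vcluster-values.yaml --context "$MAIN_CONTEXT"
  run vcluster connect "$name" -n "$name" --kube-config-context-name "$name" \
    --update-current --context "$MAIN_CONTEXT"
}

# Usage, after exporting the variables from "Getting Started":
#   for c in "$MGMT_CLUSTER" "$CLUSTER_1" "$CLUSTER_2"; do create_vcluster "$c"; done
```

Setting `DRY_RUN=1` prints the commands instead of running them, which is a cheap way to review what would be executed against the host cluster.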

Next, install Gloo Mesh in the management cluster. Istio will be deployed in the workload clusters afterwards.

Install Gloo Mesh

You will need a license key:

export GLOO_MESH_LICENSE_KEY=<license_key>

And you need to define the Gloo Mesh version:

export GLOO_MESH_VERSION=2.0.9

Gloo Mesh can be installed through Helm charts. However, to not overflow this post with code, you will use the meshctl CLI:

meshctl install --kubecontext $MGMT_CLUSTER --license $GLOO_MESH_LICENSE_KEY --version $GLOO_MESH_VERSION

Verify all pods are running:

kubectl get pods -n gloo-mesh --context $MGMT_CLUSTER

And you will see something like:

NAME                                     READY   STATUS    RESTARTS   AGE
gloo-mesh-mgmt-server-778d45c7b5-5d9nh   1/1     Running   0          41s
gloo-mesh-redis-844dc4f9-jnb4j           1/1     Running   0          41s
gloo-mesh-ui-749dc7875c-4z77k            3/3     Running   0          41s
prometheus-server-86854b778-r6r52        2/2     Running   0          41s

Register workload clusters

Gloo Mesh relies on an agent-based approach. Therefore, when registering a workload cluster, you will need to tell the agent how to communicate with the management server.

Note that in EKS the service does not return an IP but a hostname. Please make that adjustment in the following commands if you're using EKS.

MGMT_SERVER_NETWORKING_DOMAIN=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
MGMT_SERVER_NETWORKING_PORT=$(kubectl -n gloo-mesh get service gloo-mesh-mgmt-server --context $MGMT_CLUSTER -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')
MGMT_SERVER_NETWORKING_ADDRESS=${MGMT_SERVER_NETWORKING_DOMAIN}:${MGMT_SERVER_NETWORKING_PORT}
echo $MGMT_SERVER_NETWORKING_ADDRESS
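For the EKS case mentioned in the note, one option is to read both the `ip` and the `hostname` field of the load balancer status and use whichever is set. This is a rough sketch under that assumption. `first_non_empty` and `lb_address` are hypothetical helper names, not part of kubectl or meshctl.

```shell
#!/usr/bin/env bash
# Sketch only: first_non_empty and lb_address are illustrative helper names.

# Print the first non-empty argument; fail if all arguments are empty.
first_non_empty() {
  local v
  for v in "$@"; do
    if [ -n "$v" ]; then
      printf '%s\n' "$v"
      return 0
    fi
  done
  return 1
}

# Print the external address of a LoadBalancer service:
# the IP if the provider reports one, otherwise the hostname.
lb_address() {
  local ctx="$1" ns="$2" svc="$3" ip host
  ip=$(kubectl --context "$ctx" -n "$ns" get svc "$svc" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  host=$(kubectl --context "$ctx" -n "$ns" get svc "$svc" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  first_non_empty "$ip" "$host"
}

# Usage:
#   MGMT_SERVER_NETWORKING_DOMAIN=$(lb_address "$MGMT_CLUSTER" gloo-mesh gloo-mesh-mgmt-server)
```

If neither field is populated yet (for example, the load balancer is still being provisioned), `lb_address` returns a non-zero status, so the caller can retry instead of building an address with an empty host.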

meshctl cluster register \
  --remote-context=$CLUSTER_1 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_1

And you will see:

Registering cluster
📃 Copying root CA relay-root-tls-secret.gloo-mesh to remote cluster from management cluster
📃 Copying bootstrap token relay-identity-token-secret.gloo-mesh to remote cluster from management cluster
💻 Installing relay agent in the remote cluster
Finished installing chart 'gloo-mesh-agent' as release gloo-mesh:gloo-mesh-agent
📃 Creating remote.cluster KubernetesCluster CRD in management cluster
⌚ Waiting for relay agent to have a client certificate
         Checking...
         Checking...
🗑 Removing bootstrap token
✅ Done registering cluster!

meshctl cluster register \
  --remote-context=$CLUSTER_2 \
  --relay-server-address $MGMT_SERVER_NETWORKING_ADDRESS \
  --kubecontext $MGMT_CLUSTER \
  $CLUSTER_2

Check that the resource is created in management:

kubectl get kubernetescluster -n gloo-mesh --context $MGMT_CLUSTER

And you will see:

NAME          AGE
devcluster1   27s
devcluster2   23s

Install Istio

By default, Istio requests a fair amount of resources. In your local environment, you might not have enough to run three fully functional clusters and two Istio service meshes. Therefore, we need to reduce the resources Istio requires. That’s fine, as this is just a development environment.

NOTE: This post is using Istio v1.12.6:

export ISTIO_VERSION=1.12.6

Install Istio’s CRDs:

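# Assumes the Istio Helm repository is already available under the name "istio", for example:
#   helm repo add istio https://istio-release.storage.googleapis.com/charts
#   helm repo update

# Install Istio CRDs in cluster1
helm upgrade --install istio-base istio/base \
  -n istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1 \
  --create-namespace

# Install Istio CRDs in cluster2
helm upgrade --install istio-base istio/base \
  -n istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2 \
  --create-namespace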

Install Istiod:

cat << EOF > istiod-common-values.yaml
meshConfig:
  accessLogFile: /dev/stdout
  defaultConfig:
    holdApplicationUntilProxyStarts: true
    envoyMetricsService:
      address: gloo-mesh-agent.gloo-mesh:9977
    envoyAccessLogService:
      address: gloo-mesh-agent.gloo-mesh:9977
    proxyMetadata:
      ISTIO_META_DNS_CAPTURE: "true"
      ISTIO_META_DNS_AUTO_ALLOCATE: "true"
pilot:
  autoscaleEnabled: false
  replicaCount: 1
  env:
    PILOT_SKIP_VALIDATE_TRUST_DOMAIN: "true"
  resources:
    requests:
      cpu: 10m
      memory: 2048Mi
    limits:
      cpu: 10m
      memory: 2048Mi
EOF

# Install istiod cluster1
helm upgrade --install istiod istio/istiod \
  -f istiod-common-values.yaml \
  --set global.meshID=mesh1 \
  --set global.multiCluster.clusterName=$CLUSTER_1 \
  --set meshConfig.trustDomain=$CLUSTER_1 \
  --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_1 \
  --namespace istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1

# Install istiod cluster2
helm upgrade --install istiod istio/istiod \
  -f istiod-common-values.yaml \
  --set global.meshID=mesh1 \
  --set global.multiCluster.clusterName=$CLUSTER_2 \
  --set meshConfig.trustDomain=$CLUSTER_2 \
  --set meshConfig.defaultConfig.proxyMetadata.GLOO_MESH_CLUSTER_NAME=$CLUSTER_2 \
  --namespace istio-system \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2

Install ingress gateways:

cat << EOF > istio-ingress-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-ingressgateway
securityContext:
  # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: ingressgateway
service:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
      name: http2
    - port: 443
      targetPort: 8443
      name: https
resources:
  limits:
    cpu: 10m
    memory: 128Mi
  requests:
    cpu: 10m
    memory: 128Mi
EOF

# Install Istio Ingress Gateway Cluster 1
helm upgrade --install istio-ingressgateway istio/gateway \
  -f istio-ingress-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1 \
  --create-namespace

# Install Istio Ingress Gateway Cluster 2
helm upgrade --install istio-ingressgateway istio/gateway \
  -f istio-ingress-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2 \
  --create-namespace

Install east-west gateways:

cat << EOF > istio-eastwest-common-values.yaml
replicaCount: 1
autoscaling:
  enabled: false
name: istio-eastwestgateway
securityContext:
  # runAsRoot
  runAsUser: 1337
  runAsGroup: 1337
  runAsNonRoot: true
  fsGroup: 1337
labels:
  istio: eastwestgateway
service:
  type: LoadBalancer
  ports:
    - name: tcp-status-port
      port: 15021
      targetPort: 15021
    - name: tls
      port: 15443
      targetPort: 15443
resources:
  requests:
    cpu: 10m
    memory: 128Mi
  limits:
    cpu: 10m
    memory: 128Mi
EOF

# Install Istio Eastwest Gateway Cluster 1
helm upgrade --install istio-eastwestgateway istio/gateway \
  -f istio-eastwest-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_1

# Install Istio Eastwest Gateway Cluster 2
helm upgrade --install istio-eastwestgateway istio/gateway \
  -f istio-eastwest-common-values.yaml \
  --namespace istio-gateways \
  --version $ISTIO_VERSION \
  --kube-context $CLUSTER_2

Deploy Applications

In workload cluster 1:

kubectl --context ${CLUSTER_1} create ns bookinfo
export bookinfo_yaml=https://raw.githubusercontent.com/istio/istio/1.11.4/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl --context ${CLUSTER_1} label namespace bookinfo istio-injection=enabled
kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'app,version notin (v3)' -n bookinfo
kubectl --context ${CLUSTER_1} apply -f ${bookinfo_yaml} -l 'account' -n bookinfo

And in workload cluster 2:

kubectl --context ${CLUSTER_2} create ns bookinfo
kubectl --context ${CLUSTER_2} label namespace bookinfo istio-injection=enabled
kubectl --context ${CLUSTER_2} apply -f ${bookinfo_yaml} -n bookinfo

Define your workspace (an abstraction that Gloo Mesh provides to organize workloads regardless of their physical location):

kubectl apply --context $MGMT_CLUSTER -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: '*'
EOF

kubectl apply --context $CLUSTER_1 -n gloo-mesh -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: developers
  namespace: gloo-mesh
spec:
  options:
    serviceIsolation:
      enabled: false
    federation:
      enabled: false
EOF

Expose the application:

kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw
  namespace: istio-gateways
spec:
  workloads:
    - selector:
        labels:
          istio: ingressgateway
        cluster: ${CLUSTER_1}
  listeners:
    - http: {}
      port:
        number: 80
      allowedRouteTables:
        - host: '*'
EOF

kubectl --context ${CLUSTER_1} apply -f - <<EOF
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: productpage
  namespace: bookinfo
  labels:
    expose: "true"
spec:
  hosts:
    - '*'
  virtualGateways:
    - name: north-south-gw
      namespace: istio-gateways
      cluster: ${CLUSTER_1}
  workloadSelectors: []
  http:
    - name: productpage
      matchers:
      - uri:
          prefix: /
      forwardTo:
        destinations:
          - ref:
              name: productpage
              namespace: bookinfo
            port:
              number: 9080
EOF

Verify the Environment

Next, let’s create a bit of traffic and see what the UI displays. First, send some requests to the application through the ingress gateway:

export ENDPOINT_HTTP_GW_CLUSTER1=$(kubectl --context ${CLUSTER_1} -n istio-gateways get svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].*}'):80
for i in {0..100}; do curl -s -o /dev/null -w "%{http_code} " $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done

You should see:

❯ for i in {0..100}; do curl -s -o /dev/null -w "%{http_code} " $ENDPOINT_HTTP_GW_CLUSTER1/productpage; done
200 200 200 200 200 200
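Once you start experimenting with policies such as fault injection, not every response will be a 200, and reading a long row of status codes gets tedious. A small tally helps. This is only a sketch, and `tally_codes` is a made-up helper name. It reads the space-separated codes printed by the loop above from standard input and counts each distinct code.

```shell
#!/usr/bin/env bash
# Sketch only: tally_codes is an illustrative helper name.

# Read space- or newline-separated HTTP status codes from stdin and
# print one "<code>=<count>" line per distinct code, sorted by code.
tally_codes() {
  tr -s ' ' '\n' | grep -v '^$' | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Usage:
#   for i in {0..100}; do
#     curl -s -o /dev/null -w "%{http_code} " $ENDPOINT_HTTP_GW_CLUSTER1/productpage
#   done | tally_codes
```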

Now, port-forward the UI:

kubectl --context $MGMT_CLUSTER port-forward svc/gloo-mesh-ui -n gloo-mesh 8090

Go to: http://localhost:8090/ and you will see all the information about your clusters and your workspaces.

You can also see the amazing graph to help with understanding your own system: Observability

That is all! You have achieved full control of the network in a matter of minutes in your local environment.

Now, you can test any capability that Gloo Mesh offers, including:

Any of the policies that Gloo Mesh offers (such as failover, fault injection, outlier detection, retries, timeouts, traffic control, mirroring, rate limiting, and header and payload transformation)
Access control
Isolation of the services
WAF
Authentication with OIDC
Authorization with OPA

Tips for vcluster

Interested in learning more about vcluster? Here is a simple diagram of how vcluster works:

In the workshop you have deployed three vclusters. If you run:

kubectl --context $MAIN_CONTEXT get sts -A

You will see:

NAMESPACE     NAME          READY   AGE
devmgmt       devmgmt       1/1     3h7m
devcluster2   devcluster2   1/1     3h2m
devcluster1   devcluster1   1/1     3h3m

Each of these StatefulSets belongs to one vcluster. Its attached volume stores all the data of the deployed vcluster.

Looking closer, you will find that one of the containers of those StatefulSets is an entire k3s, a lightweight Kubernetes flavor. You could also use any of the other supported Kubernetes flavors: EKS, k0s and vanilla k8s.

The other container is a syncer, an application which copies the pods that are created within the vcluster to the underlying host cluster. This is the reason you can see all the resources if you are the admin of the “host” cluster, and only your resources if you are the admin of the vcluster.

You can think of the StatefulSet as the control plane of a vcluster. This is why you need to be careful about how you deploy its pods.

Let’s see it in the environment you just created. In your vcluster, you will see:

kubectl --context $MGMT_CLUSTER get pod -l app=gloo-mesh-mgmt-server -A

NAMESPACE   NAME                                    READY   STATUS
gloo-mesh   gloo-mesh-mgmt-server-9fb55d686-w4n4l   1/1     Running

But in the host cluster you will see:

kubectl --context $MAIN_CONTEXT get pod -A -l vcluster.loft.sh/namespace=gloo-mesh

NAMESPACE     NAME
devcluster1   gloo-mesh-agent-df8c8c49d-jlhkh-x-gloo-mesh-x-devcluster1
devcluster2   gloo-mesh-agent-76b5b44b4f-56r5l-x-gloo-mesh-x-devcluster2
devmgmt       gloo-mesh-mgmt-server-9fb55d686-w4n4l-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-redis-794d79b7df-rlr99-x-gloo-mesh-x-devmgmt
devmgmt       gloo-mesh-ui-cc98c5fc-tzq4s-x-gloo-mesh-x-devmgmt
devmgmt       prometheus-server-647b488bb-r6hfc-x-gloo-mesh-x-devmgmt

Check the names. That is the translation layer that vcluster makes for you.
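The pattern visible in the host cluster output is the pod name, then the namespace inside the vcluster, then the vcluster name, joined by `-x-`. The sketch below only reproduces that observed pattern. `host_pod_name` is a hypothetical helper, not a vcluster command, and vcluster itself may treat some names (for example, very long ones) differently.

```shell
#!/usr/bin/env bash
# Sketch only: host_pod_name is an illustrative helper that mirrors the
# naming pattern seen in the output above; it is not part of vcluster.

# Arguments: <pod name in the vcluster> <namespace in the vcluster> <vcluster name>
host_pod_name() {
  printf '%s-x-%s-x-%s\n' "$1" "$2" "$3"
}

# Usage:
#   host_pod_name gloo-mesh-mgmt-server-9fb55d686-w4n4l gloo-mesh "$MGMT_CLUSTER"
```

With the values from the workshop, this yields the same name that the host cluster lists for the management server pod.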

There are a couple of things to keep in mind when working with vclusters:

Reserve enough resources for those StatefulSet pods: It is good practice to have nodes with resources dedicated solely to these pods and to make sure that the pods are scheduled on those nodes. The intention is that the StatefulSet pods (vcluster control planes) never run out of resources, which would dramatically impact the performance of the vcluster. To do this, you can use taints and node selectors on the nodes.

Logs and Kubernetes metadata: Log aggregation tools like Fluent Bit and Grafana Promtail rely on the Kubernetes structure and naming convention. Log folders and files follow the Kubernetes structure given by the host cluster.

As the command above shows, the same pod has different names in the vcluster and in the host. Therefore, if you deploy one of the observability tools mentioned before in the vcluster, the structure it expects will not match the one in the host cluster. As a consequence, the tool cannot leverage the Kubernetes metadata or the log traces from the applications in that cluster. At the time of writing, the Loft Labs team is working on this issue.
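For the first tip, dedicating a host-cluster node to the vcluster control planes could look roughly like this. It is a sketch under assumptions: `dedicate_node`, `run` and `DRY_RUN` are names invented for the example, and the `dedicated=vcluster` label and taint are arbitrary values, not something vcluster requires.

```shell
#!/usr/bin/env bash
# Sketch only: dedicate_node, run and DRY_RUN are illustrative names, and
# "dedicated=vcluster" is an arbitrary label/taint chosen for this example.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Label a host-cluster node and taint it, so that only pods that
# tolerate the taint are scheduled there.
dedicate_node() {
  local node="$1"
  run kubectl --context "$MAIN_CONTEXT" label nodes "$node" dedicated=vcluster --overwrite
  run kubectl --context "$MAIN_CONTEXT" taint nodes "$node" dedicated=vcluster:NoSchedule --overwrite
}

# Usage:
#   dedicate_node <node-name>
```

The vcluster StatefulSet pods then need a matching node selector and toleration to land on that node. The exact Helm values for that are not shown here, so check the values of the vcluster chart version you use.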

The last interesting point to mention is the ability to pause and resume individual vclusters (StatefulSets). If you do not want to destroy the entire environment created in the workshop, you can pause it:

vcluster pause $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster pause $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster pause $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT

And whenever you want to keep working on the tests you can do:

vcluster resume $MGMT_CLUSTER -n $MGMT_CLUSTER --context $MAIN_CONTEXT
vcluster resume $CLUSTER_1 -n $CLUSTER_1 --context $MAIN_CONTEXT
vcluster resume $CLUSTER_2 -n $CLUSTER_2 --context $MAIN_CONTEXT

Conclusions

Technology changes fast. Not many years ago, we were working with monoliths. Nowadays, you can have clusters deployed within other clusters.

Through this workshop, you were able to:

Deploy all the components of Gloo Mesh in your local environment or in a cheap remote environment.
Build a basic setup to test all Gloo Mesh capabilities for handling east-west and north-south traffic between your services.
Reduce cost of deploying multiple clusters with vcluster. You just need one actual cluster.
Reduce time of testing things out in a local environment.

This makes your projects considerably more efficient, which ultimately translates into higher productivity.

As a final comment, you can see that being able to test things in your local environment, reproducing heavy remote environments, is one of the goals of the DevOps practices.

If you want to talk more about all these tools, you can find me easily in these Slack workspaces: solo.io, istio and loft.sh
