This repository was archived by the owner on Nov 13, 2023. It is now read-only.

A step-by-step guide how to configure AKS auto-scaling for GitHub self-hosted runners on Azure


nicklegan/aks-auto-scaling-github-self-hosted-runners


AKS auto-scaling for GitHub self-hosted runners on Azure

This documentation provides a step-by-step guide on how to configure AKS auto-scaling for GitHub self-hosted runners. Auto-scaling is achieved by deploying the actions-runner-controller.

The auto-scaling guide below covers the following self-hosted runner specification:

  • Optimized for Azure Kubernetes Service
  • Compatible with GitHub Enterprise Server and Cloud
  • Organization-level runners
  • Ephemeral runners
  • Auto-scaling with workflow_job webhooks
  • Webhook secret
  • Ingress TLS termination
  • Auto-provisioning Let's Encrypt SSL certificate
  • GitHub App API authentication

Table of Contents

Prerequisites

Reference architecture

(reference architecture diagram)

Set up AKS cluster

```shell
# Install Azure CLI - https://docs.microsoft.com/en-us/cli/azure/install-azure-cli
az login

# Install kubectl - https://docs.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az_aks_install_cli
az aks install-cli

# Create resource group
az group create -n <your-resource-group> --location <your-location>

# Create AKS cluster
az aks create -n <your-cluster-name> -g <your-resource-group> --node-resource-group <your-node-resource-group-name> --enable-managed-identity

# Get AKS access credentials
az aks get-credentials -n <your-cluster-name> -g <your-resource-group>
```

Set up Helm client

```shell
# Install Helm - https://helm.sh/docs/intro/install/
brew install helm                # macOS
choco install kubernetes-helm    # Windows
sudo snap install helm --classic # Debian/Ubuntu
```

Add cert-manager and NGINX ingress repositories

```shell
# Add repositories
helm repo add jetstack https://charts.jetstack.io
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx

# Update repositories
helm repo update
```

Install cert-manager

```shell
# Install cert-manager - https://cert-manager.io/docs/installation/helm/
helm install --wait --create-namespace --namespace cert-manager cert-manager jetstack/cert-manager --version v1.6.1 --set installCRDs=true
```

Apply Let's Encrypt ClusterIssuer config for cert-manager

```shell
kubectl apply -f clusterissuer.yaml
```

clusterissuer.yaml

  • email:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: your-email@address.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: nginx
```

Install NGINX ingress controller

```shell
# Install NGINX ingress controller
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace actions-runner-system --create-namespace

# Retrieve public load balancer IP from ingress controller
kubectl -n actions-runner-system get svc
```

Set up domain A record

Navigate to your domain registrar and create a new A record pointing a subdomain of your domain to the ingress load balancer IP retrieved above, e.g. webhook.tld.com.

Create a GitHub App and configure GitHub App authentication

Configure workflow_job webhooks

  • Activate the GitHub App webhook feature and add the domain A record you created earlier as the Webhook URL
  • Navigate to Permissions & events and enable the workflow job webhook events

Generate and set a GitHub App webhook secret

Prepare a webhook secret for the github_webhook_secret_token field in the values.yaml file and configure the same webhook secret in the GitHub App you created.

```shell
# Generate random webhook secret
ruby -rsecurerandom -e 'puts SecureRandom.hex(20)'
```
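If Ruby is not available, an equivalent random secret can be generated with the openssl CLI (this alternative assumes openssl is installed; it is not part of the original guide):

```shell
# Generate a random 40-character hex webhook secret with openssl instead of Ruby
openssl rand -hex 20
```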

Prepare Actions Runner Controller configuration

Modify the default values.yaml with your custom values as specified below.

```shell
# Configure values.yaml
vim values.yaml
```

values.yaml

  • githubEnterpriseServerURL: only needed when using GHES
  • authSecret:
  • githubWebhookServer:
    • ingress:
    • github_webhook_secret_token
```yaml
# The URL of your GitHub Enterprise server, if you're using one
githubEnterpriseServerURL: https://github.example.com

# Only 1 authentication method can be deployed at a time
# Uncomment the configuration you are applying and fill in the details
authSecret:
  create: true
  name: "controller-manager"
  annotations: {}
  ### GitHub Apps Configuration
  ## NOTE: IDs MUST be strings, use quotes
  github_app_id: "3"
  github_app_installation_id: "1"
  github_app_private_key: |-
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA2zl6z+uMcS4D+D9f1ENLJY2w/9lLPajs/wA2gnt74/7bcB1f
    0000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000000000000000000000000000000000000
    2x/9kVAWKQ2UJGxqupGqV14vLaNpmA2uILBxc5jKXHu1nNkgUwU=
    -----END RSA PRIVATE KEY-----
  ### GitHub PAT Configuration
  #github_token: ""

githubWebhookServer:
  enabled: true
  replicaCount: 1
  syncPeriod: 10m
  secret:
    create: false
    name: "github-webhook-server"
    ### GitHub Webhook Configuration
    github_webhook_secret_token: ""
  imagePullSecrets: []
  nameOverride: ""
  fullnameOverride: ""
  serviceAccount:
    # Specifies whether a service account should be created
    create: true
    # Annotations to add to the service account
    annotations: {}
    # The name of the service account to use.
    # If not set and create is true, a name is generated using the fullname template
    name: ""
  podAnnotations: {}
  podLabels: {}
  podSecurityContext: {}
  # fsGroup: 2000
  securityContext: {}
  resources: {}
  nodeSelector: {}
  tolerations: []
  affinity: {}
  priorityClassName: ""
  service:
    type: ClusterIP
    annotations: {}
    ports:
      - port: 80
        targetPort: http
        protocol: TCP
        name: http
        #nodePort: someFixedPortForUseWithTerraformCdkCfnEtc
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
    hosts:
      - host: webhook.tld.com
        paths:
          - path: /
    tls:
      - secretName: letsencrypt-prod
        hosts:
          - webhook.tld.com
```
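Since the values.yaml above sets secret.create to false with the name github-webhook-server, the chart expects that secret to already exist in the controller namespace. A minimal sketch of generating such a Secret manifest, assuming the chart reads the github_webhook_secret_token key (the file name and placeholder value are assumptions, not part of the original guide):

```shell
# Sketch: write a Secret manifest for the webhook token; the value is a placeholder
WEBHOOK_SECRET="replace-with-your-generated-secret"
cat > webhook-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: github-webhook-server
  namespace: actions-runner-system
stringData:
  github_webhook_secret_token: ${WEBHOOK_SECRET}
EOF
# Apply with: kubectl apply -f webhook-secret.yaml
```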

Install Actions Runner Controller

```shell
# Add actions-runner-controller repository
helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
helm repo update

# Install actions-runner-controller
helm upgrade --install -f values.yaml --wait --namespace actions-runner-system actions-runner-controller actions-runner-controller/actions-runner-controller
```

Verify installation and SSL certificate

```shell
# View all namespace resources
kubectl --namespace actions-runner-system get all

# Verify certificaterequest status
kubectl get certificaterequest --namespace actions-runner-system

# Verify certificate status
kubectl describe certificate letsencrypt-prod --namespace actions-runner-system

# Verify that the SSL certificate is working properly
curl -v https://webhook.tld.com
```

Deploy runner manifest

```shell
# Create a new namespace
kubectl create namespace self-hosted-runners

# Edit runnerdeployment.yaml
vim runnerdeployment.yaml

# Apply runnerdeployment manifest
kubectl apply -f runnerdeployment.yaml
```

runnerdeployment.yaml

The manifest below deploys organization-level auto-scaling ephemeral runners with a keep-alive minimum of 1 runner. Runners are scaled up to a maximum of 5 active replicas based on incoming workflow_job webhook events, and scaled back down to 1 runner after an idle timeout of 5 minutes.

  • organization:
```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: org-runner
  namespace: self-hosted-runners
spec:
  template:
    metadata:
      labels:
        app: org-runner
    spec:
      organization: your-github-organization
      labels:
        - self-hosted
      ephemeral: true
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: org-runner
  namespace: self-hosted-runners
spec:
  scaleTargetRef:
    name: org-runner
  scaleUpTriggers:
    - githubEvent: {}
      amount: 1
      duration: "5m"
  minReplicas: 1
  maxReplicas: 5
```
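Once the runners are registered, a workflow can target them via the self-hosted label defined in the RunnerDeployment above. A hypothetical minimal workflow for illustration (the file name and steps are assumptions, not part of the original guide):

```yaml
# .github/workflows/ci.yml (hypothetical example)
name: CI
on: push
jobs:
  build:
    # Matches the self-hosted label configured in the RunnerDeployment
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v2
      - run: echo "Hello from an AKS ephemeral runner"
```

Queueing this job emits a workflow_job webhook, which triggers the scale-up rule in the HorizontalRunnerAutoscaler.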

Verify status of runners and pods

```shell
# List running pods
kubectl get pods -n self-hosted-runners

# List active runners
kubectl get runners -n self-hosted-runners
```

Verify deployment of all cluster services

```shell
kubectl get all -A
```

Resources
