Avesh

Kubernetes Use Case: Deploying and Managing a Scalable Web Application

Introduction to Kubernetes

Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Originally developed at Google and now maintained by the Cloud Native Computing Foundation (CNCF), it is widely used for cloud-native development and excels at running applications that require high availability, scalability, and fault tolerance.

In this article, we’ll explore a real-world use case for Kubernetes by setting up a scalable web application. We'll go through step-by-step instructions for deploying and managing this application on Kubernetes.

Use Case Scenario: Scalable Web Application

Consider a web application that experiences heavy, fluctuating traffic. We need it to be available at all times, to scale dynamically with demand, and to recover automatically from unexpected failures.

Requirements:

  1. Scalability: The application must scale out (add more instances) or scale in (reduce instances) based on demand.
  2. Load Balancing: Incoming traffic should be evenly distributed across all instances of the application.
  3. Resilience: The application should be able to self-heal, automatically replacing any failed instances.

In this example, we’ll deploy a simple Node.js web application on Kubernetes and use Kubernetes features like Deployments, Services, and Horizontal Pod Autoscalers to fulfill these requirements.

Kubernetes Components Used

  1. Pods: The smallest deployable unit in Kubernetes, wrapping one or more containers.
  2. Deployment: Declares the desired state of the application and manages the number of replicas.
  3. Service: Exposes the application and load-balances traffic across its pods.
  4. Horizontal Pod Autoscaler (HPA): Automatically scales the number of pod replicas based on CPU or memory usage. (A quick way to inspect these resource types is sketched just below.)
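
If you want to poke at these APIs before writing any manifests, kubectl can describe them for you. A minimal sketch, assuming you already have a running cluster and a configured kubectl:

# List the resource kinds used in this article, with their API groups
kubectl api-resources | grep -E 'pods|deployments|services|horizontalpodautoscalers'

# Show the documented schema for a Deployment's replica count
kubectl explain deployment.spec.replicas

# Show what an HPA spec may contain
kubectl explain horizontalpodautoscaler.spec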

Example Architecture

  • Node.js Web Application: A simple HTTP server that returns a "Hello, World!" message.
  • Nginx Ingress: A load balancer to distribute requests.
  • Kubernetes Cluster: Running locally (using Minikube) or in the cloud (e.g., Google Kubernetes Engine, AWS EKS).

Step-by-Step Implementation

1. Setting Up Kubernetes Environment

If you don’t have a Kubernetes cluster set up, you can use Minikube for local development or a managed Kubernetes service (like GKE or EKS) for production-grade deployments. Here’s how to set up Minikube:

# Install Minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start Minikube
minikube start

Verify that the cluster is running:

kubectl get nodes 

2. Creating the Node.js Application

For demonstration, we’ll use a simple Node.js application that returns "Hello, World!" when accessed.

// app.js
const http = require('http');

const PORT = process.env.PORT || 3000;

const requestHandler = (req, res) => {
  res.end('Hello, World!');
};

const server = http.createServer(requestHandler);

server.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});
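
Before containerizing the app, it's worth a quick local smoke test. Assuming Node.js is installed, the following should print the greeting:

# In one terminal
node app.js

# In another terminal
curl http://localhost:3000    # should print: Hello, World!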

Create a Dockerfile to containerize the application:

# Dockerfile
FROM node:14-alpine
WORKDIR /app
COPY app.js .
# Document the port the app listens on
EXPOSE 3000
CMD ["node", "app.js"]

Build and push the Docker image:

docker build -t <your_dockerhub_username>/node-app:v1 .
docker push <your_dockerhub_username>/node-app:v1
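
Optionally, smoke-test the image before pushing it. A quick sketch; the container name node-app-test is arbitrary:

docker run --rm -d -p 3000:3000 --name node-app-test <your_dockerhub_username>/node-app:v1
curl http://localhost:3000    # should print: Hello, World!
docker stop node-app-test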

3. Creating Kubernetes Deployment and Service

Define a Deployment YAML file (deployment.yaml) for the Node.js app:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
        - name: node-container
          image: <your_dockerhub_username>/node-app:v1
          ports:
            - containerPort: 3000
          # CPU requests are required for the HPA in step 5 to compute
          # utilization; without them the HPA reports <unknown>.
          resources:
            requests:
              cpu: 100m

Create a Service YAML file (service.yaml) to expose the application and load-balance traffic across its pods:

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-app-service
spec:
  selector:
    app: node-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer

Apply these configurations:

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
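
To confirm the rollout completed, standard kubectl status commands work well:

# Block until both replicas are available
kubectl rollout status deployment/node-app

# List only this app's pods via their label
kubectl get pods -l app=node-app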

4. Exposing the Service

To expose the service, you can use minikube service (for Minikube):

minikube service node-app-service 
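
By default minikube service opens the app in your browser; with the --url flag it just prints the reachable address, which is convenient for scripting:

URL=$(minikube service node-app-service --url)
curl "$URL"    # should print: Hello, World!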

Or, in a managed Kubernetes cluster, you’d configure Ingress or a load balancer to expose the application.
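
As a sketch of that Ingress route, here is a minimal manifest for the NGINX Ingress controller mentioned in the architecture above. It assumes the controller is installed (on Minikube: minikube addons enable ingress) and uses a hypothetical hostname, node-app.local, that you would map to the cluster's IP yourself:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: node-app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: node-app.local    # hypothetical hostname; map it in /etc/hosts
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: node-app-service
                port:
                  number: 80
EOF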

5. Setting Up Auto-Scaling

Define a Horizontal Pod Autoscaler (HPA) that automatically adjusts the number of pods based on CPU usage. Create a file (hpa.yaml):

# hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: node-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-app
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

Apply the HPA configuration:

kubectl apply -f hpa.yaml 

The HPA monitors average CPU utilization across the pods. If it rises above 50%, replicas are added (up to ten); if it falls, replicas are removed, down to the minimum of two.
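
Note that the HPA reads CPU figures from the cluster's metrics pipeline, which is not always installed by default. On Minikube, enable the metrics-server addon and verify that numbers are flowing:

# Enable the metrics pipeline on Minikube
minikube addons enable metrics-server

# After a minute or so, per-pod CPU and memory usage should appear
kubectl top pods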

6. Testing the Deployment

  1. Verify Deployment: Check the status of your pods and services.
kubectl get pods
kubectl get services
  2. Generate Load for Scaling: Simulate load to trigger the HPA.
kubectl run -i --tty load-generator --image=busybox --restart=Never -- /bin/sh

# Inside the load generator shell, run (the Service's DNS name,
# node-app-service, also works in place of <service-ip>):
while true; do wget -q -O- http://<service-ip>; done

The HPA should start additional pods once CPU usage crosses the 50% threshold.

  3. Monitor Scaling: Observe the scaling activity.
kubectl get hpa
kubectl get pods -w
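
When you are done testing, stop the wget loop (Ctrl+C), exit the shell, and delete the load generator; the HPA will then scale back toward the two-replica minimum (scale-down is deliberately gradual, taking a few minutes by default):

kubectl delete pod load-generator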

Conclusion

This example illustrates how Kubernetes enables us to deploy a scalable, highly available web application with minimal configuration and management overhead. With just a few resource definitions, we can:

  • Automatically scale our application to handle increased load.
  • Load balance requests among multiple instances.
  • Ensure high availability and fault tolerance through Kubernetes’ self-healing capabilities.

By applying this approach to larger, more complex applications, teams can improve operational efficiency and ensure that applications are resilient and responsive to changing demands.
