Best Practices For Microservices on Kubernetes
There are several best practices for building a microservices architecture properly. You may find many articles about it online. One of them is my previous article Spring Boot Best Practices For Microservices. There I focused on the most important aspects that should be considered when running microservice applications built on top of Spring Boot in production. I didn't assume there was any platform used for orchestration or management, just a group of independent applications. In this article, I'm going to extend the list of already introduced best practices with some new rules dedicated especially to microservices deployed on the Kubernetes platform.
The first question is whether it makes any difference when you deploy your microservices on Kubernetes instead of running them independently without any platform. Well, actually yes and no… Yes, because now you have a platform that is responsible for running and monitoring your applications, and it introduces some rules of its own. No, because you still have a microservices architecture, a group of loosely coupled, independent applications, and you should not forget about it! In fact, many of the previously introduced best practices are still valid, while some of them need to be redefined a little. There are also some new, platform-specific rules which should be mentioned.
One thing needs to be explained before proceeding. This list of Kubernetes microservices best practices is based on my experience in running a microservices-based architecture on cloud platforms like Kubernetes. I didn't copy it from other articles or books. In my organization, we have already migrated our microservices from Spring Cloud (Eureka, Zuul, Spring Cloud Config) to OpenShift. We are continuously improving this architecture based on our experience in maintaining it.
Example
The sample Spring Boot application that implements the Kubernetes microservices best practices described here is written in Kotlin. It is available on GitHub in the repository sample-spring-kotlin-microservice under the branch kubernetes: https://github.com/piomin/sample-spring-kotlin-microservice/tree/kubernetes.
1. Allow platform to collect metrics
I also included a similar section in my article about best practices for Spring Boot. However, metrics are also one of the important Kubernetes microservices best practices. There we were using InfluxDB as the target metrics store. Since our approach to gathering metrics data has changed after the migration to Kubernetes, I redefined the title of this point into Allow platform to collect metrics. The main difference between the current and previous approaches is in the way of collecting data. We now use Prometheus, because that process may be managed by the platform. InfluxDB is a push-based system, where your application actively pushes data into the monitoring system. Prometheus is a pull-based system, where the server periodically fetches metric values from the running application. So, our main responsibility at this point is to provide an endpoint on the application side for Prometheus.
Fortunately, it is very easy to provide metrics for Prometheus with Spring Boot. You need to include Spring Boot Actuator and a dedicated Micrometer library for integration with Prometheus.
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

We should also enable exposing Actuator HTTP endpoints outside the application. You can enable a single endpoint dedicated for Prometheus or just expose all Actuator endpoints as shown below.
management.endpoints.web.exposure.include: '*'

After running your application, the endpoint is by default available under the path /actuator/prometheus.
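For illustration, Prometheus scrapes that endpoint in the plain text exposition format. The exact metric set depends on your Micrometer configuration, but the response looks roughly like this (the uri label value is just an example):

# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 2.4117248E7
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{method="GET",outcome="SUCCESS",status="200",uri="/persons",} 3.0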

Assuming you run your application on Kubernetes, you need to deploy and configure Prometheus to scrape metrics from your pods. The configuration may be delivered as a Kubernetes ConfigMap. The prometheus.yml file should contain a scrape_configs section with the path of the endpoint serving metrics and the Kubernetes discovery settings. Prometheus tries to locate all application pods via Kubernetes Endpoints. The application should be labeled with app=sample-spring-kotlin-microservice and expose a port named http outside the container, for example via a Service like the one shown below.
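Since the endpoints role discovers scrape targets through Service endpoints, the application pods need a matching Service. Here is a minimal sketch of such a Service for the sample application; the port number 8080 is assumed from the application configuration.

apiVersion: v1
kind: Service
metadata:
  name: sample-spring-kotlin-microservice
  labels:
    app: sample-spring-kotlin-microservice
spec:
  selector:
    app: sample-spring-kotlin-microservice
  ports:
    - name: http
      port: 8080
      targetPort: 8080

With such a Service in place, the ConfigMap below provides the scrape and relabeling configuration for Prometheus.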
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  labels:
    name: prometheus
data:
  prometheus.yml: |-
    scrape_configs:
      - job_name: 'springboot'
        metrics_path: /actuator/prometheus
        scrape_interval: 5s
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - default
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_label_app]
            separator: ;
            regex: sample-spring-kotlin-microservice
            replacement: $1
            action: keep
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            separator: ;
            regex: http
            replacement: $1
            action: keep
          - source_labels: [__meta_kubernetes_namespace]
            separator: ;
            regex: (.*)
            target_label: namespace
            replacement: $1
            action: replace
          - source_labels: [__meta_kubernetes_pod_name]
            separator: ;
            regex: (.*)
            target_label: pod
            replacement: $1
            action: replace
          - source_labels: [__meta_kubernetes_service_name]
            separator: ;
            regex: (.*)
            target_label: service
            replacement: $1
            action: replace
          - source_labels: [__meta_kubernetes_service_name]
            separator: ;
            regex: (.*)
            target_label: job
            replacement: ${1}
            action: replace
          - separator: ;
            regex: (.*)
            target_label: endpoint
            replacement: http
            action: replace

The last step is to deploy Prometheus on Kubernetes. You should attach the ConfigMap with the Prometheus configuration to the Deployment via a mounted volume. After that you may set the location of the configuration file using the config.file parameter: --config.file=/prometheus2/prometheus.yml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - "--config.file=/prometheus2/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
              name: http
          volumeMounts:
            - name: prometheus-storage-volume
              mountPath: /prometheus/
            - name: prometheus-config-map
              mountPath: /prometheus2/
      volumes:
        - name: prometheus-storage-volume
          emptyDir: {}
        - name: prometheus-config-map
          configMap:
            name: prometheus

Now you can verify if Prometheus has discovered your application running on Kubernetes by accessing endpoint /targets.
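If you do not expose the Prometheus UI outside the cluster, a quick way to reach it for verification (assuming the Deployment above) is port forwarding:

$ kubectl port-forward deployment/prometheus 9090:9090

The UI, including the /targets page, is then available at http://localhost:9090.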

2. Prepare logs in the right format
The approach to collecting logs is pretty similar to collecting metrics. Our application should not handle the process of sending logs by itself. It should just take care of formatting the logs sent to the output stream properly. Since Docker has a built-in logging driver for Fluentd, it is very convenient to use it as a log collector for applications running on Kubernetes. This means no additional agent is required inside the container to push logs to Fluentd. Logs are shipped directly to the Fluentd service from STDOUT, and no additional log file or persistent storage is required. Fluentd tries to structure data as JSON to unify logging across different sources and destinations.
In order to format our logs as JSON readable by Fluentd, we may include the Logstash Logback Encoder library in our dependencies.
<dependency>
  <groupId>net.logstash.logback</groupId>
  <artifactId>logstash-logback-encoder</artifactId>
  <version>6.3</version>
</dependency>

Then we just need to set a default console log appender for our Spring Boot application in the file logback-spring.xml.
<configuration>
  <appender name="consoleAppender" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <logger name="jsonLogger" additivity="false" level="DEBUG">
    <appender-ref ref="consoleAppender"/>
  </logger>
  <root level="INFO">
    <appender-ref ref="consoleAppender"/>
  </root>
</configuration>

The logs are printed into STDOUT in the format visible below.
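For illustration, a single log line produced by LogstashEncoder looks more or less like the following; all field values here are made up:

{"@timestamp":"2020-04-10T12:34:56.789+02:00","@version":"1","message":"Received: Hello!","logger_name":"PersonListener","thread_name":"http-nio-8080-exec-1","level":"INFO","level_value":20000}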

It is very simple to install Fluentd, Elasticsearch and Kibana on Minikube. The disadvantage of this approach is that we are installing older versions of these tools.
$ minikube addons enable efk
* efk was successfully enabled
$ minikube addons enable logviewer
* logviewer was successfully enabled

After enabling efk and logviewer addons Kubernetes pulls and starts all the required pods as shown below.

Thanks to the logstash-logback-encoder library we may automatically create logs compatible with Fluentd including MDC fields. Here’s a screen from Kibana that shows logs from our test application.

Optionally, you can add my library for logging requests/responses for Spring Boot application.
<dependency>
  <groupId>com.github.piomin</groupId>
  <artifactId>logstash-logging-spring-boot-starter</artifactId>
  <version>1.2.2.RELEASE</version>
</dependency>

3. Implement both liveness and readiness health checks
It is important to understand the difference between liveness and readiness probes in Kubernetes. If these probes are not implemented carefully, they can degrade the overall operation of a service, for example by causing unnecessary restarts. The liveness probe is used to decide whether to restart the container. If an application is unavailable for any reason, restarting the container can sometimes make sense. On the other hand, the readiness probe is used to decide whether a container can handle incoming traffic. If a pod has been recognized as not ready, it is removed from load balancing. A failing readiness probe does not result in a pod restart. The most typical liveness or readiness probe for a web application is realized via an HTTP endpoint.
In a typical web application running outside a platform like Kubernetes, you don't distinguish between liveness and readiness health checks. That's why most web frameworks provide only a single built-in health check implementation. For a Spring Boot application, you may easily enable a health check by including Spring Boot Actuator in your dependencies. The important thing about the Actuator health check is that it may behave differently depending on the integrations between your application and third-party systems. For example, if you define a Spring data source for connecting to a database or declare a connection to a message broker, the health check may automatically include such validation through auto-configuration. Therefore, if you use the default Spring Actuator health check implementation as a liveness probe endpoint, it may result in unnecessary restarts when the application is unable to connect to the database or message broker. Since such behavior is not desired, I suggest implementing a very simple liveness endpoint that just verifies the availability of the application without checking connections to other external systems.
Adding a custom implementation of a health check is not very hard with Spring Boot. There are a few different ways to do that; one of them is shown below. We are using the mechanism provided by Spring Boot Actuator. It is worth noting that we don't override the default health check, but add another, custom implementation. The following implementation just checks whether the application is able to handle incoming requests.
@Component
@Endpoint(id = "liveness")
class LivenessHealthEndpoint {

    @ReadOperation
    fun health(): Health = Health.up().build()

    @ReadOperation
    fun name(@Selector name: String): String = "liveness"

    @WriteOperation
    fun write(@Selector name: String) {
    }

    @DeleteOperation
    fun delete(@Selector name: String) {
    }
}

In turn, the default Spring Boot Actuator health check may be the right solution for a readiness probe. Assuming your application connects to a Postgres database and a RabbitMQ message broker, you should add the following dependencies to your Maven pom.xml.
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-amqp</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
  <groupId>org.postgresql</groupId>
  <artifactId>postgresql</artifactId>
  <scope>runtime</scope>
</dependency>

Now, just for information, add the following property to your application.yml. It enables displaying detailed information in the auto-configured Actuator /health endpoint.
management:
  endpoint:
    health:
      show-details: always

Finally, let's call /actuator/health to see the detailed result. The health check returns information about both the Postgres and RabbitMQ connections.
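To wire both probes into Kubernetes, declare them in the Deployment. Here is a minimal sketch that assumes the application listens on port 8080, uses the custom /actuator/liveness endpoint shown above for liveness and the default /actuator/health endpoint for readiness; the timing values are only examples.

spec:
  containers:
    - name: sample-spring-kotlin-microservice
      image: piomin/sample-spring-kotlin-microservice
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /actuator/liveness
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /actuator/health
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 10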

There is another aspect of using liveness and readiness probes in your web application, related to thread pooling. In a standard web container like Tomcat, each request is handled by a thread from the HTTP thread pool. If you process every request on that pool and you have some long-running tasks in your application, you may block all available HTTP threads. If your liveness probe then fails several times in a row, the application pod will be restarted. Therefore, you should consider running long-running tasks on another thread pool. Here's an example of an HTTP endpoint implementation with DeferredResult and Kotlin coroutines.
@PostMapping("/long-running")
fun addLongRunning(@RequestBody person: Person): DeferredResult<Person> {
    var result: DeferredResult<Person> = DeferredResult()
    GlobalScope.launch {
        logger.info("Person long-running: {}", person)
        delay(10000L)
        result.setResult(repository.save(person))
    }
    return result
}

4. Consider your integrations
Our applications can hardly ever exist without external solutions like databases, message brokers or just other applications. There are two aspects of integration with third-party applications that should be carefully considered: connection settings and auto-creation of resources.
Let's start with connection settings. As you probably remember, in the previous section we were using the default implementation of the Spring Boot Actuator /health endpoint as a readiness probe. However, if you leave the default connection settings for Postgres and RabbitMQ, each call of the readiness probe takes a long time when they are unavailable. That's why I suggest decreasing these timeouts to lower values, as shown below.
spring:
  application:
    name: sample-spring-kotlin-microservice
  datasource:
    url: jdbc:postgresql://postgres:5432/postgres
    username: postgres
    password: postgres123
    hikari:
      connection-timeout: 2000
      initialization-fail-timeout: 0
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
  rabbitmq:
    host: rabbitmq
    port: 5672
    connection-timeout: 2000

Apart from properly configured connection timeouts, you should also guarantee the auto-creation of resources required by the application. For example, if you use a RabbitMQ queue for asynchronous messaging between two applications, you should guarantee that the queue is created on startup if it does not exist. To do that, first declare the queue – usually on the listener side.
@Configuration
class RabbitMQConfig {

    @Bean
    fun myQueue(): Queue {
        return Queue("myQueue", false)
    }
}

Here's a listener bean with the receiving method implementation.
@Component
class PersonListener {

    val logger: Logger = LoggerFactory.getLogger(PersonListener::class.java)

    @RabbitListener(queues = ["myQueue"])
    fun listen(msg: String) {
        logger.info("Received: {}", msg)
    }
}
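For completeness, here is a hedged sketch of the producer side using Spring's RabbitTemplate. The queue name matches the declaration above, while the PersonSender class itself is only illustrative and not part of the sample repository.

import org.springframework.amqp.rabbit.core.RabbitTemplate
import org.springframework.stereotype.Component

@Component
class PersonSender(private val rabbitTemplate: RabbitTemplate) {

    // Publishes a plain text message to the queue declared in RabbitMQConfig
    fun send(msg: String) {
        rabbitTemplate.convertAndSend("myQueue", msg)
    }
}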
A similar case is database integration. First, you should ensure that your application starts even if the connection to the database fails. That's why I declared the PostgreSQLDialect explicitly. It is required when the application is not able to connect to the database at startup. Moreover, each change in the entity model should be applied to the tables before application startup. Fortunately, Spring Boot has auto-configured support for popular tools for managing database schema changes: Liquibase and Flyway. To enable Liquibase, we just need to include the following dependency in the Maven pom.xml.
<dependency>
  <groupId>org.liquibase</groupId>
  <artifactId>liquibase-core</artifactId>
</dependency>

Then you just need to create a changelog and put it in the default location db/changelog/db.changelog-master.yaml. Here's a sample Liquibase changelog YAML file for creating the table person.
databaseChangeLog:
  - changeSet:
      id: 1
      author: piomin
      changes:
        - createTable:
            tableName: person
            columns:
              - column:
                  name: id
                  type: int
                  autoIncrement: true
                  constraints:
                    primaryKey: true
                    nullable: false
              - column:
                  name: name
                  type: varchar(50)
                  constraints:
                    nullable: false
              - column:
                  name: age
                  type: int
                  constraints:
                    nullable: false
              - column:
                  name: gender
                  type: smallint
                  constraints:
                    nullable: false

5. Use Service Mesh
If you are building a microservices architecture outside Kubernetes, mechanisms like load balancing, circuit breaking, fallbacks or retries are realized on the application side. Popular cloud-native frameworks like Spring Cloud simplify the implementation of these patterns in your application and reduce it to just adding a dedicated library to your project. However, if you migrate your microservices to Kubernetes, you should no longer use these libraries for traffic management. It is becoming some kind of anti-pattern. Traffic management in communication between microservices should be delegated to the platform. This approach on Kubernetes is known as a service mesh. One of the most important Kubernetes microservices best practices is to use dedicated software for building a service mesh.
Since Kubernetes was not originally dedicated to microservices, it does not provide any built-in mechanism for advanced management of traffic between many applications. However, there are some additional solutions dedicated to traffic management that may be easily installed on Kubernetes. One of the most popular of them is Istio. Besides traffic management, it also solves problems related to security, monitoring, tracing and metrics collection.
Istio can be easily installed on your cluster or on standalone development instances like Minikube. After downloading it just run the following command.
$ istioctl manifest apply

Istio components (the sidecar proxy) need to be injected into your application pods.
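The simplest way to do that is to enable automatic sidecar injection for the namespace your microservices run in; the default namespace is assumed here. Alternatively, you can inject the sidecar manually with istioctl kube-inject.

$ kubectl label namespace default istio-injection=enabled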
After that, we can define traffic rules using YAML manifests. Istio gives many interesting options for configuration. The following example shows how to inject faults into an existing route. It can be either delays or aborts. We can define the percentage of affected requests using the percent field for both types of fault. In the Istio resource below I have defined a 2 second delay for every single request sent to the Service account-service.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: account-service
spec:
  hosts:
    - account-service
  http:
    - fault:
        delay:
          fixedDelay: 2s
          percent: 100
      route:
        - destination:
            host: account-service
            subset: v1

Besides the VirtualService we also need to define a DestinationRule for account-service. It is really simple – we have just defined the version label of the target service.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: account-service
spec:
  host: account-service
  subsets:
    - name: v1
      labels:
        version: v1

6. Be open to framework-specific solutions
There are many interesting tools and solutions around Kubernetes that may help you in running and managing applications. However, you should also not forget about some interesting tools and solutions offered by the framework you use. Let me give you some examples. One of them is Spring Boot Admin. It is a useful tool designed for monitoring Spring Boot applications registered in a single discovery service. Assuming you are running microservices on Kubernetes, you may also install Spring Boot Admin there.
There is another interesting project within Spring Cloud – Spring Cloud Kubernetes. It provides some useful features that simplify integration between a Spring Boot application and Kubernetes. One of them is discovery across all namespaces. If you use that feature together with Spring Boot Admin, you may easily create a powerful tool that is able to monitor all Spring Boot microservices running on your Kubernetes cluster. For more implementation details you may refer to my article Spring Boot Admin on Kubernetes.
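As a rough sketch, enabling Kubernetes-based discovery across all namespaces boils down to adding the Spring Cloud Kubernetes starter and a single property. Note that the artifact name below matches the pre-2020 Spring Cloud Kubernetes releases and differs in newer versions, so treat it as an assumption to verify against your Spring Cloud version.

<dependency>
  <groupId>org.springframework.cloud</groupId>
  <artifactId>spring-cloud-starter-kubernetes</artifactId>
</dependency>

spring:
  cloud:
    kubernetes:
      discovery:
        all-namespaces: true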
Sometimes you may use Spring Boot integrations with third-party tools to easily deploy such a solution on Kubernetes without building a separate Deployment. You can even build a cluster of multiple instances. This approach may be used for products that can be embedded in a Spring Boot application, for example RabbitMQ or Hazelcast (a popular in-memory data grid). If you are interested in more details about running a Hazelcast cluster on Kubernetes using this approach, please refer to my article Hazelcast with Spring Boot on Kubernetes.
7. Be prepared for a rollback
Kubernetes provides a convenient way to roll back an application to an older version based on ReplicaSet and Deployment objects. By default, Kubernetes keeps 10 previous ReplicaSets (this limit can be tuned, as shown below) and lets you roll back to any of them. However, one thing needs to be pointed out: a rollback does not include the configuration stored inside ConfigMaps and Secrets. Sometimes it is desired to roll back not only the application binaries, but also the configuration.
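The number of old ReplicaSets kept for rollback is controlled by the revisionHistoryLimit field of the Deployment. A minimal, illustrative fragment is shown below; rolling back configuration, however, requires a different approach.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-spring-kotlin-microservice
spec:
  revisionHistoryLimit: 10
  replicas: 1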
Fortunately, Spring Boot gives us really great possibilities for managing externalized configuration. We may keep configuration files inside the application and also load them from an external location. On Kubernetes we may use ConfigMaps and Secrets for defining Spring configuration files. The following definition of a ConfigMap creates the application-rollbacktest.yml Spring configuration file containing only a single property. This configuration is loaded by the application only if the Spring profile rollbacktest is active.
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-spring-kotlin-microservice
data:
  application-rollbacktest.yml: |-
    property1: 123456

The ConfigMap is included in the application through a mounted volume.
spec:
  containers:
    - name: sample-spring-kotlin-microservice
      image: piomin/sample-spring-kotlin-microservice
      ports:
        - containerPort: 8080
          name: http
      volumeMounts:
        - name: config-map-volume
          mountPath: /config/
  volumes:
    - name: config-map-volume
      configMap:
        name: sample-spring-kotlin-microservice

We also have application.yml on the classpath. The first version contains only a single property.
property1: 123

In the second version we are going to activate the rollbacktest profile. Since a profile-specific configuration file has higher priority than application.yml, the value of the property1 property is overridden with the value taken from application-rollbacktest.yml.
property1: 123
spring.profiles.active: rollbacktest

Let's test the mechanism using a simple HTTP endpoint that prints the value of the property.
@RestController
@RequestMapping("/properties")
class TestPropertyController(@Value("\${property1}") val property1: String) {

    @GetMapping
    fun printProperty1(): String = property1
}

Let's take a look at how we can roll back a version of the deployment. First, let's see how many revisions we have.
$ kubectl rollout history deployment/sample-spring-kotlin-microservice
deployment.apps/sample-spring-kotlin-microservice
REVISION  CHANGE-CAUSE
1
2
3

Now, we are calling the /properties endpoint of the current deployment, which returns the value of the property property1. Since the profile rollbacktest is active, it returns the value from the file application-rollbacktest.yml.
$ curl http://localhost:8080/properties
123456

Let's roll back to the previous revision.
$ kubectl rollout undo deployment/sample-spring-kotlin-microservice --to-revision=2
deployment.apps/sample-spring-kotlin-microservice rolled back

As you see below, revision=2 is no longer listed; it has been redeployed as the newest revision=4.
$ kubectl rollout history deployment/sample-spring-kotlin-microservice
deployment.apps/sample-spring-kotlin-microservice
REVISION  CHANGE-CAUSE
1
3
4

In this version of the application the profile rollbacktest wasn't active, so the value of the property property1 is taken from application.yml.
$ curl http://localhost:8080/properties
123