Infrastructure as Code (IaC) has transformed how you deploy and manage cloud infrastructure. Tools like Azure Resource Manager, AWS CloudFormation, Docker, Kubernetes, Ansible, and Terraform have made deploying infrastructure faster and more scalable. However, they’ve also introduced a new set of security challenges.
In recent years, there have been numerous security incidents caused by IaC misconfigurations. These include:
- Open Ports and Weak Security Groups: Exposing cloud resources to the public internet.
- Hardcoded Secrets: Credentials, API keys, or sensitive data embedded in code.
- Overly Permissive Policies: Granting unnecessary privileges to users or services.
- Missing Resource Limits: Allowing unrestricted use of cloud resources, leading to potential outages or abuse.
The consequences of these issues can be severe, from data breaches to financial loss and reputational damage. The risk grows when Code GenAI is used to produce those artifacts.
Code GenAI is a great help for bootstrapping code artifacts and producing boilerplate, but its output still needs to be reviewed to avoid introducing unexpected issues and vulnerabilities.
Fortunately, there are tools that can help identify critical vulnerabilities early in development.
From the SonarQube Cloud telemetry, I've gathered the most frequently hit IaC issues, with more than 6 million hits in total across all analyzed projects.
In this article, I focus on Azure, CloudFormation, Docker, Kubernetes, Ansible and Terraform as examples of IaC issues. I highlight each critical issue, its risks, and how to fix it.
As a bonus chapter, you can see the result of an experiment with Code GenAI using different providers to generate Kubernetes artifacts and check if they are as clean and secure as we expect.
Let's start with a list of examples of all the IaC artifacts covered in this article (and supported by SonarQube).
Azure Resource Manager
Restrict Public Access to Resources
Problem: Allowing unrestricted public access to Azure resources (e.g., Blob Storage) exposes them to unauthorized users.
Solution: Use publicNetworkAccess to control access to resources.
    {
      "type": "Microsoft.Web/sites",
      "apiVersion": "2020-12-01",
      "name": "example-site",
      "properties": {
        "siteConfig": {
          "publicNetworkAccess": "Disabled"
        }
      }
    }

AWS CloudFormation
1. Ensure S3 Buckets Are Private
Problem: Publicly accessible S3 buckets can lead to data leaks.
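For contrast, a template that triggers this issue might look like the sketch below. It is a hypothetical example, not taken from a real project: it reuses the MyBucket name from the compliant example and only swaps in the PublicRead canned ACL.

```yaml
# Hypothetical non-compliant template (illustration only)
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: PublicRead   # anyone on the internet can read the bucket's objects
```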
Solution: Set the bucket's AccessControl to Private.
    Resources:
      MyBucket:
        Type: AWS::S3::Bucket
        Properties:
          AccessControl: Private

2. Apply Least Privilege to IAM Roles
Problem: Granting broad permissions creates unnecessary security risks.
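As an illustration of what "broad" can mean here, the following sketch shows a hypothetical policy with wildcards on both the action and the resource. The policy name is borrowed from the compliant example; the wildcard values are my own assumption of a typical over-permissive statement.

```yaml
# Hypothetical non-compliant policy (illustration only)
Resources:
  lambdaUpdatePolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: lambdaUpdatePolicy
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - lambda:*      # every Lambda action, not just code updates
            Resource: "*"     # on every resource in the account
```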
Solution: Limit actions and resources to only what’s required.
    Resources:
      # Update Lambda code
      lambdaUpdatePolicy:
        Type: AWS::IAM::ManagedPolicy
        Properties:
          ManagedPolicyName: lambdaUpdatePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - lambda:UpdateFunctionCode
                Resource: "arn:aws:lambda:us-east-2:123456789012:function:my-function:1"

Docker
Avoid Running Containers as Root
Problem: Running containers as root increases the risk of privilege escalation.
Solution: Create and use a non-root user in your Dockerfile.
    FROM alpine
    RUN addgroup -S nonroot \
        && adduser -S nonroot -G nonroot
    USER nonroot
    ENTRYPOINT ["id"]

Kubernetes
1. Don’t Run Privileged Pods
Problem: Running containers in privileged mode can reduce the resilience of a cluster in the event of a security incident because it weakens the isolation between hosts and containers.
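A pod specification that raises this issue could look like the following sketch. It is a hypothetical example that mirrors the compliant pod and differs only in the privileged flag.

```yaml
# Hypothetical non-compliant pod (illustration only)
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: web
      image: nginx
      securityContext:
        privileged: true   # the container gets almost the same access as processes on the host
```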
Solution: Disable privileged mode in your pod specification.
    apiVersion: v1
    kind: Pod
    metadata:
      name: example
    spec:
      containers:
        - name: web
          image: nginx
          ports:
            - name: web
              containerPort: 80
              protocol: TCP
          securityContext:
            privileged: false

2. Define Resource Requests and Limits
Problem: Allowing pods to use unlimited resources can destabilize the cluster.
Solution: Specify resource requests and limits for containers.
    apiVersion: v1
    kind: Pod
    metadata:
      name: resource-limited-pod
    spec:
      containers:
        - name: app
          image: myapp:latest
          resources:
            requests:
              memory: "256Mi"
              cpu: "0.5"
            limits:
              memory: "512Mi"
              cpu: "1"

3. Specific version tag for image should be used
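Problem: When a container image is not tagged with a specific version, it is referred to as latest. This means that every time the image is built, deployed, or run, it will always use the latest version of the image. Deployments are then no longer reproducible, and an untested or vulnerable image version can reach the cluster without any change to the manifest.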
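Both forms in the sketch below resolve to the latest tag. The pod and container names are hypothetical and only serve the illustration.

```yaml
# Hypothetical non-compliant pod (illustration only)
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: nginx-untagged
      image: nginx           # no tag, so the latest tag is used implicitly
    - name: nginx-latest
      image: nginx:latest    # explicit latest tag, same effect
```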
Solution: To avoid these issues, it is recommended to use specific version tags for container images.
    apiVersion: v1
    kind: Pod
    metadata:
      name: example
    spec:
      containers:
        # Pinned by version tag
        - name: nginx
          image: nginx:1.14.2
        # Pinned by digest (container names must be unique within a pod)
        - name: nginx-digest
          image: nginx@sha256:b0ad43f7ee5edbc0effbc14645ae7055e21bc1973aee5150745632a24a752661

Terraform
Allowing public network access to cloud resources is security-sensitive
Problem: Enabling public network access to cloud resources can affect an organization’s ability to protect its data or internal operations from data theft or disruption.
Solution: Use private networks and VPC peering or other secure communication tunnels to communicate with other cloud components.
    resource "google_compute_instance" "example" {
      network_interface {
        network = google_compute_network.vpc_network_example.name
      }
    }

Ansible
1. Server certificates should be verified
Problem: Disabling server certificate validation makes it possible for encrypted communication to be intercepted, for example through a man-in-the-middle attack.
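For illustration, a task that bypasses certificate validation might look like the sketch below. It is a hypothetical playbook that mirrors the compliant one and differs only in the validate_certs value.

```yaml
# Hypothetical non-compliant playbook (illustration only)
- name: Example playbook
  hosts: server
  tasks:
    - name: Retrieve a web page
      ansible.builtin.uri:
        url: https://www.example.com
        validate_certs: false   # the server's certificate is accepted without any check
        return_content: true
```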
Solution: Ensure playbooks do not bypass certificate validation.
    - name: Example playbook
      hosts: server
      tasks:
        - name: Retrieve a web page
          ansible.builtin.uri:
            url: https://www.example.com
            validate_certs: true
            return_content: true

2. Loose POSIX permissions
Problem: Files with overly permissive POSIX permissions (e.g., 777) grant unnecessary read, write, or execute access to unauthorized users.
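A task like the one sketched below would fall into this category. The path is borrowed from the compliant example, and the 0777 mode is the value mentioned above; the playbook itself is hypothetical.

```yaml
# Hypothetical non-compliant playbook (illustration only)
- name: My deployment
  hosts: all
  tasks:
    - name: Create /etc/demo with permissions
      ansible.builtin.file:
        path: /etc/demo
        state: directory
        mode: '0777'   # read, write, and execute for owner, group, and everyone else
```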
Solution: Ensure playbooks explicitly set restrictive permissions on sensitive files.
    - name: My deployment
      hosts: all
      tasks:
        - name: Create /etc/demo with permissions
          ansible.builtin.file:
            path: /etc/demo
            state: directory
            mode: '0770'
        - name: Copy demo.conf and set symbolic permissions
          ansible.builtin.copy:
            src: /files/demo.conf
            dest: /etc/demo/demo.conf
            mode: 'g=r,u+w,o='

Secrets management
Keys and passwords being disclosed
Problem: Secret leaks often occur when sensitive authentication data is stored with an application's source code. This is one example (OVH keys, the most hit issue) but the issue extends to several platforms.
    props.set("ovh-key", "EXEgWIz07P0HYwtQDs7cNIqCiQaWSuHF")

Solution:

    props.set("ovh-key", System.getenv("OVH_KEY"))

Experimenting with Code GenAI and IaC
To see how accurate Code GenAI is for IaC artifacts, I ran an experiment with an AI application written in Java using Quarkus. You can find the source code here.
From my IDE, I asked two different code assistants to produce the Kubernetes Deployment YAML file needed to deploy the application.
I used GitHub Copilot and Amazon Q as code assistants.
Let's see the results.
GitHub Copilot
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: pingpong-ai-deployment
      labels:
        app: pingpong-ai
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: pingpong-ai
      template:
        metadata:
          labels:
            app: pingpong-ai
        spec:
          containers:
            - name: pingpong-ai-container
              image: your-docker-repo/pingpong-ai:latest
              ports:
                - containerPort: 8080
              env:
                - name: QUARKUS_PROFILE
                  value: prod
                - name: JAVA_OPTS
                  value: "-Dquarkus.http.host=0.0.0.0"
              volumeMounts:
                - name: config-volume
                  mountPath: /deployments/config
          volumes:
            - name: config-volume
              configMap:
                name: pingpong-ai-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: pingpong-ai-service
    spec:
      selector:
        app: pingpong-ai
      ports:
        - protocol: TCP
          port: 80
          targetPort: 8080
      type: LoadBalancer

It was smart enough to generate a Service and a volume. However, not everything is perfect: SonarQube for IDE detected 6 issues.
Amazon Q
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-application
      labels:
        app: my-application
    spec:
      replicas: 3  # Number of pod replicas you want to run
      selector:
        matchLabels:
          app: my-application
      template:
        metadata:
          labels:
            app: my-application
        spec:
          containers:
            - name: my-application
              image: your-registry/your-image:tag  # Replace with your image
              ports:
                - containerPort: 8080  # Replace with your application port
              resources:
                requests:
                  cpu: "200m"
                  memory: "256Mi"
                limits:
                  cpu: "500m"
                  memory: "512Mi"
              env:
                - name: ENVIRONMENT
                  value: "production"
                # Add any environment variables your application needs
              livenessProbe:
                httpGet:
                  path: /health  # Replace with your health check endpoint
                  port: 8080
                initialDelaySeconds: 30
                periodSeconds: 10
              readinessProbe:
                httpGet:
                  path: /health  # Replace with your health check endpoint
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 10

This output is way cleaner and of higher quality than the previous one, and SonarQube for IDE has detected only 2 issues. However, it doesn't provide a Service or a volume for configuration files.
In both cases, generation was fast and free of typing errors (wrong attribute names, wrong number of spaces, tabs, etc.). But as with any AI-generated code, the output needed a review phase to detect issues that are not always obvious, so that good-quality code reaches our repository and Pull Request reviews go smoothly.
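To give an idea of what such a review might change, here is a sketch that applies the Kubernetes recommendations from earlier in this article to the container section of the Copilot output. The exact findings reported by SonarQube for IDE are not listed here, so this is not a one-to-one fix list. The 1.0.0 tag, the resource values (borrowed from the Amazon Q output), and the security settings are assumptions to adapt to your own application.

```yaml
# Sketch only: fragment of spec.template.spec from the Copilot Deployment
containers:
  - name: pingpong-ai-container
    image: your-docker-repo/pingpong-ai:1.0.0   # hypothetical pinned version instead of latest
    ports:
      - containerPort: 8080
    resources:                # assumed values, tune them for the real workload
      requests:
        cpu: "200m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
```

The env, volumeMounts, and volumes entries from the generated file are omitted here for brevity and would stay as they are.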
Conclusion
IaC enables teams to automate and scale infrastructure efficiently, but with great power comes great responsibility. Misconfigurations, hardcoded secrets, and overly permissive access controls are common mistakes that can lead to serious security vulnerabilities.
By following best practices and leveraging tools like SonarQube, developers can identify and resolve critical security issues early in the development process.
More than just security, maintaining code quality in IaC is essential. Well-structured, maintainable IaC ensures teams can quickly adapt to new requirements and maintain a robust, secure infrastructure.
Combining high-quality code with automated tooling is the key to avoiding costly security mishaps. SonarQube has rules to check all these issues in Azure Resource Manager, Docker, Kubernetes, CloudFormation, Terraform, Ansible and Secrets in general.


