Use Kubernetes Resource Format in CRD + Resource Specification Cleanup #1391
Conversation
Force-pushed 9305f09 to 148c728

Prior to this commit, one could specify the resources that the instances in a PostgreSQL cluster could use by putting a Kubernetes-format string into the pgcluster CRD (as well as the pgreplica CRD) in a custom format that requested both the Kubernetes Request and Limit for Memory and CPU.

The original impetus for this change came from research showing that setting the Limit for Memory would increase the risk that a PostgreSQL cluster could enter a less-than-desirable situation where the OOM killer takes out a PostgreSQL backend and causes every other connection backend to fail as the PostgreSQL server enters crash recovery mode. While the Guaranteed QoS class in Kubernetes, which requires both Memory & CPU Limits and Requests to be set, sets a very low overall OOM score adjustment[1] (albeit not the PostgreSQL-recommended score of -1000[2]), this does not affect the behavior of the Limit[3] and still puts the PostgreSQL connection backends at risk of the OOM killer and the aforementioned behavior.

As such, this change began by removing the option to set the "Limit" resources in the pgcluster CRD, which afforded the opportunity to switch the "Resource" requests to match the container resource data type for Kubernetes[4]. However, this was extended to do a few more things to bring the PostgreSQL Operator in line with best practices for running PostgreSQL on Kubernetes, including:

- Removing the "ContainerResources" attribute from the pgreplica CRD. This ensures all failoverable instances have the same resource definitions. Future work will allow for more customization around this (though with strong caveats on when to set a replica with mismatched resources from the primary). You can achieve similar behavior now with the clone and standby functionality.
- Removing the "Default...ContainerResources" attributes from the PostgreSQL Operator configuration. This provides more flexibility for how these Pods are deployed and managed by Kubernetes. These include:
  - PostgreSQL containers
  - Load Jobs
  - Backup Jobs
  - "rmdata" Jobs
  - pgBadger
  - pgBouncer

  Future work will allow some of these Pods to have their resources set by the PostgreSQL Operator.
- A small modification to how CPU and Memory are displayed by `pgo show cluster`, though it will likely go unnoticed.

Issue: [ch7766]

[1] https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#node-oom-behavior
[2] https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
[3] https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run
[4] https://pkg.go.dev/k8s.io/api/core/v1?tab=doc#ResourceList
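To sketch the direction of the change, the resource section of a pgcluster spec now follows the standard Kubernetes `ResourceList` shape with only Requests set and no Limits (the field names below are illustrative, not the exact CRD schema):

```yaml
# Hypothetical pgcluster spec fragment; exact field names may differ.
# Before this change, a custom structure carried both Requests and Limits
# packed into a string. After it, the resources follow corev1.ResourceList,
# and only Requests are expressed:
spec:
  resources:
    cpu: "1"
    memory: 512Mi
```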
This was not being used anywhere.
cbandy left a comment
I like this very much. The stronger upstream types are a huge improvement and were more work than I expected. The upstream YAML parser is good, too, but it means our field tags need to change.
So much cleaned up! 👍
operator/cluster/pgbouncer.go Outdated
I see Spec.ClusterName is still used quite a lot. Is this a cleanup that I should also do when I see it?
@cbandy Yup...so cluster.Name invokes cluster.Spec.ClusterName which is the name of the Cluster (vs. the individual instances).
@andrewlecuyer knows the bits that set that, I'm having trouble finding it.
ClusterName in the Spec has never really made sense to me since the ObjectMeta already has the name.
So from that perspective I'd def rather just start using cluster.Name (i.e. the ObjectMeta name) across the board, and hopefully eventually remove ClusterName from the spec. There are a lot of them to be cleaned up, but seems to make sense to start tackling them as we come across them.
Is any resource.Quantity instance valid? If so, changing the type of request.BackrestCPURequest to *resource.Quantity could eliminate this call.
If it's difficult to change msgs.CreateClusterRequest, don't worry about it. 👍
I could try it -- curious what happens if an invalid value is passed in when it's doing the Marshaling/Unmarshaling 🤔 Not sure if I do want to test it now
There are cases where it may be helpful to control the container resource utilization for the pgBouncer Deployments. As such, this modifies the pgcluster CRD to store the resource values pertaining to the pgBouncer Deployments.

This also adds new flags to the PostgreSQL Operator client to manage the CPU / memory resources, including:

- `pgo create cluster --pgbouncer-cpu`
- `pgo create cluster --pgbouncer-memory`
- `pgo create pgbouncer --cpu`
- `pgo create pgbouncer --memory`
- `pgo update pgbouncer --cpu`
- `pgo update pgbouncer --memory`

This also fixes a bug with rotating the pgBouncer system account password, where all pgBouncer Pods would be checked for propagation vs. just the pgBouncer Pods for the specific cluster. There are also some general code cleanups.
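Stored in the pgcluster CRD, the pgBouncer resource values might look roughly like the fragment below (the `pgBouncer`/`resources` field names are an assumption for illustration, not the exact schema):

```yaml
# Hypothetical pgcluster spec fragment for pgBouncer resource requests.
spec:
  pgBouncer:
    resources:
      cpu: 500m
      memory: 128Mi
```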
There are cases where it may be helpful to control the container resource utilization for the pgBackRest repository Deployment that comes with every PostgreSQL cluster that is managed by the PostgreSQL Operator. This modifies the pgcluster CRD to store the resource values pertaining to the pgBackRest repository.

This also adds new flags to the PostgreSQL Operator client to manage the CPU / memory resources, including:

- `pgo create cluster --pgbackrest-cpu`
- `pgo create cluster --pgbackrest-memory`
- `pgo update cluster --pgbackrest-cpu`
- `pgo update cluster --pgbackrest-memory`

This also fixes a bug with rotating the pgBouncer system account password, where all pgBouncer Pods would be checked for propagation vs. just the pgBouncer Pods for the specific cluster. There are also some general code cleanups.
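Analogously, the pgBackRest repository resource values stored in the pgcluster CRD could be sketched like this (the `backrestResources` field name is an assumption, not confirmed by this PR text):

```yaml
# Hypothetical pgcluster spec fragment for the pgBackRest repository
# Deployment's resource requests.
spec:
  backrestResources:
    cpu: 500m
    memory: 256Mi
```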
This is now replaced with the one-off settings that are available in the PostgreSQL Operator. It is still possible to achieve similar behavior by modifying the various Job / Pod templates that are deployed in the "pgo-config" ConfigMap. However, this helps the Operator to do things in a more Kubernetes standard way.
This was no longer used.
Checklist:
Type of Changes:
What is the current behavior? (link to any open issues here)
Prior to this commit, one could specify the resources that the instances
in a PostgreSQL cluster could use by putting in a Kubernetes-format
string into the pgcluster CRD (as well as the pgreplica CRD) in a custom
format that requested both the Kubernetes Request and Limit for Memory
and CPU.
The original impetus for this change came from research showing
that setting the Limit for Memory would increase the risk that a
PostgreSQL cluster could enter a less-than-desirable situation where the
OOM killer takes out a PostgreSQL backend and causes every other
connection backend to fail as the PostgreSQL server enters crash
recovery mode. While the Guaranteed QoS class in Kubernetes, which
requires both Memory & CPU Limits and Requests to be set, sets a very
low overall OOM score adjustment (albeit not the PostgreSQL-recommended
score of -1000), this does not affect the behavior of the Limit and
still puts the PostgreSQL connection backends at risk of the OOM killer
and the aforementioned behavior.
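For context, the PostgreSQL kernel-resources documentation linked above recommends addressing this at the host level by disabling memory overcommit, which is outside what a Pod's Limits can control (a sketch of the host setting, assuming a Linux node where you can tune sysctls):

```
# /etc/sysctl.conf on the Kubernetes node (host-level tuning per the
# PostgreSQL kernel-resources documentation; not settable from a Pod spec)
vm.overcommit_memory = 2
```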
What is the new behavior (if this is a feature change)?
As such, this change began by removing the option to set the "Limit"
resources in the pgcluster CRD, which afforded the opportunity to switch
the "Resource" requests to match the container resource data type
for Kubernetes. However, this was extended to do a few more things to
bring the PostgreSQL Operator in line with best practices for running
PostgreSQL on Kubernetes, including:
Removing the "ContainerResources" attribute from the pgreplica CRD.
This ensures all failoverable instances have the same resource
definitions. Future work will allow for more customization around this
(though with strong caveats on when to set a replica with mismatched
resources from the primary). You can achieve similar behavior now with
the clone and standby functionality
Removing the "Default...ContainerResources" attributes from the
PostgreSQL Operator configuration. This provides more flexibility for
how these Pods are deployed and managed by Kubernetes. These include:
- PostgreSQL containers
- Load Jobs
- Backup Jobs
- "rmdata" Jobs
- pgBadger
- pgBouncer
Future work will allow some of these Pods to have their resources set by
the PostgreSQL Operator.
A small modification to how CPU and Memory are displayed by
`pgo show cluster`, though it will likely go unnoticed.

Other information:
Issue: [ch7766]