Problem Introduction:
Kubernetes does a good job of self-healing and recovering applications from failure: new pods come up in place of pods that crash. One common reason for pods failing in a Kubernetes cluster is memory consumption exceeding the configured limit. In this case the pods are OOM (out of memory) killed, and there is a temporary outage before the new pods come up.
Alertmanager can be used to monitor the memory usage of pods; when usage gets close to the limit, an alert can be triggered. Usually these alerts are emails or Slack messages to engineers, who are then expected to fix the pod limits or take other appropriate action.
If the action is simply to increase the pod's memory limit, it can be automated using Alertmanager's webhook receivers and Spinnaker's webhook-triggered pipelines.
Solution Introduction:
In this blog post, https://mallozup.github.io/posts/self-healing-systems-with-prometheus/, Dario Maiocchi shows how Alertmanager's webhooks can be used to trigger an external application.
This documentation, https://prometheus.io/docs/alerting/latest/configuration/#webhook_config, has more information about the Prometheus Alertmanager webhook configuration.
Here is the documentation for triggering a Spinnaker pipeline from external sources: https://spinnaker.io/docs/guides/user/pipeline/triggers/webhooks/.
So one can trigger a Spinnaker pipeline from Alertmanager and patch the pod's manifest using the built-in Patch (Manifest) stage described here: https://spinnaker.io/docs/guides/user/kubernetes-v2/patch-manifest/.
Prerequisites:
Spinnaker installed in the cluster, and Prometheus and Alertmanager installed to monitor the pods running in the cluster.
Details:
A deployment whose pod consumes a constant amount of memory (150Mi), with a memory limit of 200Mi, is used as the candidate to monitor.
The YAML for the underlying pod can be found at https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
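Since the Kubernetes docs show a standalone pod, here is a minimal sketch of the same workload wrapped in a Deployment. The names, labels, and the jobs namespace are assumptions, chosen to match the alert annotations used later in this post.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memory-demo
  namespace: jobs              # assumed namespace, matching the alert annotations below
spec:
  replicas: 1
  selector:
    matchLabels:
      app: memory-demo
  template:
    metadata:
      labels:
        app: memory-demo
    spec:
      containers:
      - name: memory-demo-ctr
        image: polinux/stress
        resources:
          requests:
            memory: "150Mi"    # constant memory consumed by the stress container
          limits:
            memory: "200Mi"    # limit that the alert below guards against
        command: ["stress"]
        args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]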
Configure the Prometheus ConfigMap with the following alerting rule to monitor the deployment created above:
- name: feedback-container-memory-too-high
  rules:
  - alert: feedback-container-memory-too-high
    annotations:
      description: container memory-demo-ctr in namespace jobs in isdprod is taking too much memory and may be evicted soon
      summary: memory-demo-ctr in namespace jobs in isdprod is taking too much memory
    expr: (sum(container_memory_max_usage_bytes{container="memory-demo-ctr"}) by (instance, area) / sum(container_spec_memory_limit_bytes{container="memory-demo-ctr"}) by (instance, area)) > .75
    for: 8m
    labels:
      severity: critical
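For the alert to actually reach Alertmanager, Prometheus also needs an alerting section pointing at the Alertmanager service. A minimal sketch is below; the service name, namespace, and port are assumptions and depend on how Prometheus and Alertmanager were installed.
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager.monitoring.svc:9093   # assumed Alertmanager service address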
Then configure the receiver in the Alertmanager ConfigMap using the code below:
- name: feedback-receiver
  webhook_configs:
  - url: "https:///webhooks/webhook/alerthandler"
    http_config:
      basic_auth:
        username: ""
        password: ""
Finally, route the alert to the receiver in the Alertmanager ConfigMap:
- match:
    alertname: feedback-container-memory-too-high
  repeat_interval: 4m
  group_interval: 4m
  receiver: feedback-receiver
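For context, here is a sketch of how the receiver and route fragments above fit together in alertmanager.yml. The top-level default receiver shown here is an assumption; an existing installation will usually already define one.
route:
  receiver: default            # assumed existing default receiver
  routes:
  - match:
      alertname: feedback-container-memory-too-high
    repeat_interval: 4m
    group_interval: 4m
    receiver: feedback-receiver
receivers:
- name: default
- name: feedback-receiver
  webhook_configs:
  - url: "https:///webhooks/webhook/alerthandler"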
Now Prometheus and Alertmanager are ready. When the ratio of maximum memory usage to the memory limit exceeds 0.75 for about 8 minutes, an alert is sent to the Spinnaker webhook endpoint.
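For reference, Alertmanager POSTs a JSON document to the webhook; a trimmed sketch of that payload is shown below (rendered as YAML for readability). Spinnaker makes it available to the triggered pipeline as the trigger payload.
version: "4"
status: firing
receiver: feedback-receiver
commonLabels:
  alertname: feedback-container-memory-too-high
  severity: critical
alerts:
- status: firing
  labels:
    alertname: feedback-container-memory-too-high
    severity: critical
  annotations:
    summary: memory-demo-ctr in namespace jobs in isdprod is taking too much memory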
To get Spinnaker ready to receive the webhook, create a pipeline, choose a webhook trigger in the pipeline configuration, and use the same source given to Alertmanager (alerthandler in this case).
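In the pipeline definition this corresponds to a trigger block like the sketch below (Spinnaker stores pipelines as JSON; it is rendered here as YAML for readability).
triggers:
- type: webhook
  source: alerthandler    # forms the endpoint /webhooks/webhook/alerthandler
  enabled: true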
Then add a Patch (Manifest) stage to increase the memory limit to 225Mi. The patch body can be as below:
spec:
  template:
    spec:
      containers:
      - name: memory-demo-ctr
        resources:
          limits:
            memory: 225Mi
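With the stage's strategic merge strategy, only the fields present in the patch change; the container's resulting resources section would look roughly like this (the 150Mi request carried over from the original manifest is an assumption based on the deployment sketch above).
resources:
  requests:
    memory: 150Mi   # unchanged; carried over from the original manifest
  limits:
    memory: 225Mi   # raised by the Patch (Manifest) stage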
Now watch as Alertmanager triggers the pipeline to increase the memory limit and stabilize the pod.
Conclusion:
A simple proof of concept is presented here that shows how webhooks can be used to connect Alertmanager and Spinnaker to stabilize pod memory usage and "heal" the pod before any out-of-memory errors happen and pods get evicted.
Part Two of this series will be published soon, covering both memory and CPU tuning, multiple pod replicas, and using Vertical Pod Autoscaler recommendations to tune the requests and limits of pod resources.
Future improvements:
The time window over which to monitor pod memory may vary from application to application and has to be tuned accordingly. The amount of extra memory to allot as the new pod memory limit is not simple to calculate and has to be tuned according to application requirements.
The available memory on the Kubernetes nodes also has to be included in the equation, and the number of pod replicas will affect the tuning of this methodology before it is applied in practice.
Acknowledgements:
I thank Sharief Shaik and Srinivas Kambhampati for their inputs.
About OpsMx
Founded with the vision of “delivering software without human intervention,” OpsMx enables customers to transform and automate their software delivery processes. OpsMx builds on open-source Spinnaker and Argo with services and software that help DevOps teams SHIP BETTER SOFTWARE FASTER.