HTTPS support included for Spinnaker monitoring daemon

Background: The Spinnaker monitoring daemon collects the metrics that each Spinnaker microservice instance exposes through a built-in endpoint. The Spinnaker microservices are shown below.

Spinnaker Architecture

Integrating the daemon with a third-party event monitoring and alerting system such as Prometheus lets that system record these real-time metrics in a time series database by polling (the HTTP pull model). You can then use it to view dashboards, receive alerts, and browse the metrics informally, depending on your needs. Learn more about how to set up monitoring for Spinnaker microservices.

Enterprises now require the more secure ‘HTTPS’ protocol instead of ‘HTTP’ for communication between services, even within the same Kubernetes cluster. HTTPS protects the confidentiality and integrity of the data exchanged between applications, so that an intruder cannot read or tamper with it in transit. (Learn more about why HTTPS matters.)

However, enabling metrics in Spinnaker adds a sidecar container to each service pod, and that sidecar calls the metrics endpoint of the Spinnaker service. Out of the box, this default sidecar container is unable to communicate over HTTPS.

In this blog, we explain how to enable ‘HTTPS’ so that the Spinnaker monitoring daemon can collect metrics from Spinnaker services over the secured protocol.

Configuring ‘https’ may produce errors like the one below:

(Version reference: SPINNAKER: 1.20.5, Halyard 1.38, Kubernetes v1.17.9-eks-4c6976)

Issue: Error in the monitoring daemon logs

kubectl -n spinnaker logs -f spin-gate-84958fbb6c-9njj5 -c monitoring-daemon

13:44:21 ERROR gate failed https://localhost:8084/spectator/metrics with <urlopen error [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:727)>
13:44:21 DEBUG Collection times 2 (ms): {'gate': 2}
13:44:21 DEBUG Wrote 0 metrics to PrometheusMetricsService in 3 ms + 6 ms

Solution: Our engineers at OpsMx worked out a solution for this error. The following changes were introduced:

  • The Python code was changed to supply client certificates when calling the Spinnaker containers.
  • In the GitHub repo, the changes were made to this file:
    • https://github.com/spinnaker/spinnaker-monitoring/blob/master/spinnaker-monitoring-daemon/spinnaker-monitoring/spectator_client.py
  • In this file, line 515 was replaced with:

    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH, cafile="/pkcs12/ca.crt")
    context.load_cert_chain(certfile="/pkcs12/tls.crt", keyfile="/pkcs12/tls.key")
    response = urllibUrlopen(self.create_request(url, authorization), context=context)

  • An import statement for the ssl module was also added.
  • The image for the monitoring daemon should be opsmxdev/spinmon:wtls, which contains the above changes. A volume mount must then be added to the monitoring-daemon container in all monitored pods:

- mountPath: /pkcs12
  name: mtlscerts-pkcs12
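
The certificate handling described above can also be sketched as a small standalone Python 3 helper. This is only an illustration, not the code shipped in the opsmxdev/spinmon:wtls image: the function names build_client_context and fetch_metrics are hypothetical, and the /pkcs12 paths in the comments are taken from the snippet above. Unlike that snippet, which passes ssl.Purpose.CLIENT_AUTH, the sketch uses ssl.Purpose.SERVER_AUTH, the purpose Python's ssl documentation assigns to client-side sockets, so the server certificate is verified against the supplied CA.

```python
import ssl
from urllib.request import urlopen


def build_client_context(ca_file=None, cert_file=None, key_file=None,
                         check_hostname=True):
    """Return a client-side TLS context (hypothetical helper).

    ca_file:   CA bundle used to verify the server, e.g. /pkcs12/ca.crt.
               When None, the system default CA certificates are used.
    cert_file: client certificate presented to the server, e.g. /pkcs12/tls.crt.
    key_file:  private key for cert_file, e.g. /pkcs12/tls.key.
    """
    # SERVER_AUTH means "this context authenticates servers", i.e. it is
    # meant for the client side of the connection and requires a valid
    # server certificate (verify_mode is CERT_REQUIRED).
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Assumption: hostname checking may need to be relaxed if the service
    # certificate does not list "localhost" as a subject alternative name.
    context.check_hostname = check_hostname
    if cert_file:
        # Present the client certificate so a server configured with
        # client-auth 'need' accepts the connection.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def fetch_metrics(url, context, timeout=10):
    """Fetch the raw metrics payload from url using the given TLS context."""
    with urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

Inside a monitored pod, a call might look like fetch_metrics("https://localhost:8084/spectator/metrics", build_client_context("/pkcs12/ca.crt", "/pkcs12/tls.crt", "/pkcs12/tls.key")), assuming the certificates are mounted at /pkcs12 as shown above.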

STEPS: To configure the monitoring daemon to connect with TLS, exec into the Halyard pod (e.g. kubectl -n spin exec -it halyardpod-0 bash) and make the following edits to the files in the paths mentioned below:

  1. Add ‘scheme: https’ to the files in /home/spinnaker/.hal/default/service-settings, except those for deck and monitoring-daemon (otherwise the monitoring daemon will call http://localhost instead of https://localhost).
  2. Change client-auth to ‘need’ (in place of ‘want’) in all files in /home/spinnaker/.hal/default/profile, except those for deck and gate (this tightens the security of the containers).
  3. Create /home/spinnaker/.hal/default/service-settings/monitoring-daemon.yml with the following content (to use the custom Docker image and mount the certificates into the monitoring-daemon pod):

kubernetes:
  volumes:
  - id: mtlscerts-pkcs12
    mountPath: /pkcs12
    type: secret
    readOnly: true
artifactId: opsmxdev/spinmon:wtls
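
Step 1 touches several files, so it may be convenient to script it. The following is a minimal sketch rather than part of the original procedure: add_https_scheme is a hypothetical helper, and it assumes each service-settings file is a YAML mapping in which ‘scheme’ is a top-level key. It only edits files that already exist, skips deck and monitoring-daemon, and leaves any file that already sets a scheme untouched.

```python
import os


def add_https_scheme(settings_dir, skip=("deck", "monitoring-daemon")):
    """Append 'scheme: https' to each <service>.yml in settings_dir.

    Files for services listed in `skip`, non-.yml files, and files that
    already contain a top-level 'scheme:' key are left unchanged.
    Returns the sorted names of the files that were modified.
    """
    changed = []
    for name in sorted(os.listdir(settings_dir)):
        service, ext = os.path.splitext(name)
        if ext != ".yml" or service in skip:
            continue
        path = os.path.join(settings_dir, name)
        with open(path) as f:
            text = f.read()
        # A line starting at column 0 with 'scheme:' is a top-level key.
        if any(line.startswith("scheme:") for line in text.splitlines()):
            continue
        # Make sure the new key starts on its own line.
        if text and not text.endswith("\n"):
            text += "\n"
        with open(path, "w") as f:
            f.write(text + "scheme: https\n")
        changed.append(name)
    return changed
```

In the Halyard pod this could be invoked as add_https_scheme("/home/spinnaker/.hal/default/service-settings"). Reviewing the returned list before applying the Halyard configuration is a reasonable precaution, since the helper does not parse the YAML.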

Conclusion: With the above configuration changes, the Spinnaker monitoring daemon integrates with third-party monitoring tools over a secure protocol. You can spot anomalies in metrics data, track genuine errors and exceptions in your application logs within minutes, and roll out new releases with confidence. With all services communicating over TLS, the Spinnaker instance is more secure; with monitoring enabled, there is a measurable way to evaluate the performance of the various services.

About OpsMx

OpsMx is a leading provider of Continuous Delivery solutions that help Fortune 500 companies safely deliver software at scale and without human intervention. We help engineering teams take the risk and manual effort out of releasing innovations at the speed of modern business. For additional information, contact us.
