This month we made OpsMx Enterprise for Spinnaker (OES) 3.3 generally available to all our customers. OES 3.3 offers advanced features to help Release Managers, DevOps Engineers, and Site Reliability Engineers operate Spinnaker efficiently and securely. The following features are part of the OES 3.3 release:
- 360-degree visibility for informed decision making
- Continuous Delivery dashboard to measure and improve CD initiatives
Enhancement to Autopilot:
- Cluster Tagging for improved feedback process during the risk assessment
Enhancements to Spinnaker Lifecycle Management:
In addition to the new enterprise features above, we also released significant enhancements to the OpsMx Lifecycle Manager capability to ensure the security, stability, and availability of Spinnaker throughout its lifecycle.
- Secure service communication
- Secure API for webhooks
- 24x7 monitoring of Spinnaker using Prometheus
- Automated Pipeline promotion process
360-degree Visibility for Informed Decision Making
Development teams continuously integrate and deliver their code using CD tools like Spinnaker, Argo CD, Jenkins, GoCD, etc. As part of the release process, they configure approval gates in the pipeline before production deployment. At the approval stage, a release manager is expected to check whether all deployment criteria are met. Gathering and coordinating all the information from various teams (developers, testers, infrastructure) takes hours or even days. And we understand that the waiting period is not fun for release managers either, as it impacts the overall delivery time.
OES 3.3 introduces a Visibility feature that provides 360-degree information about deployments. OES integrates with Git, Jira, Autopilot, SonarQube, Jenkins, and many more. The feature surfaces information such as the author who committed the code, the Jira ticket ID for the release, the account ID used for deployment, build status, code testing status, risk of release, etc. Instead of depending on various teams, release managers can log in to OES and decide whether to approve or reject a pipeline execution for production deployment.
The best part is that OES 3.3 acts as a unified system in your CD process by integrating with the disparate systems in your CI/CD toolchain. The images below list the tools OES can integrate with to provide complete information for approval gates.
The images below highlight the data sources and the information gathered (as required by release managers) for effective decision making.
- Get accurate information about deployments and releases from different data sources at your fingertips.
- Eliminate the time spent manually gathering data from various systems and teams, which can reduce approval time from days to minutes.
- Consistently apply your decision-making policies across all your pipelines, regardless of the underlying CI/CD tool.
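To make the idea of a consistent approval policy concrete, here is a minimal sketch in Python. It is not OES code: the `DeploymentInfo` fields, the `evaluate_gate` function, and the 0 to 100 risk score scale are all assumptions chosen for illustration, loosely modeled on the kinds of information listed above.

```python
from dataclasses import dataclass


@dataclass
class DeploymentInfo:
    """Facts a release manager might review at an approval gate (hypothetical fields)."""
    commit_author: str
    jira_ticket_id: str
    build_passed: bool
    tests_passed: bool
    risk_score: int  # assumed scale: 0 (risky) to 100 (safe)


def evaluate_gate(info, min_risk_score=80):
    """Apply one policy to any pipeline and return (approved, reasons_for_rejection)."""
    reasons = []
    if not info.jira_ticket_id:
        reasons.append("no Jira ticket linked to the release")
    if not info.build_passed:
        reasons.append("build failed")
    if not info.tests_passed:
        reasons.append("tests failed")
    if info.risk_score < min_risk_score:
        reasons.append(f"risk score {info.risk_score} is below {min_risk_score}")
    return (not reasons, reasons)


if __name__ == "__main__":
    candidate = DeploymentInfo("alice", "REL-101", True, True, 92)
    print(evaluate_gate(candidate))
```

Because the policy lives in one function, the same checks apply whether the data came from Jenkins, Spinnaker, or any other tool.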
Continuous Delivery Dashboard to Monitor and Measure CI/CD Initiatives
It is imperative to have metrics to understand the performance of CD transformation initiatives, understand status and trends of software delivery, and find areas of improvement for better ROI. In OES 3.3, we bring you a CD dashboard feature, highlighting the deployments in the last few days, weeks, or months.
With the CD Dashboard, you can now quickly fetch reports on the time and status of deployments into Kubernetes or any cloud over the past week or month to get a better idea of overall project effectiveness. As of today, OES 3.3 provides pipeline information, including:
- a list of active pipelines used frequently,
- the number of times a pipeline has succeeded or failed,
- fast and slow-performing pipelines, and
- the overall time taken for manual judgments across pipeline executions.
- Identify the pipeline execution problems, collaborate with your team, and take necessary actions to improve software delivery with metrics and insights into your deployments.
- Track the progress of your CD transformation through real-time visibility and dashboards.
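The metrics listed above can be derived from raw execution records. The sketch below shows one way to do that in Python; the record keys (`pipeline`, `status`, `duration_s`, `judgment_wait_s`) are hypothetical and do not reflect the actual OES data model.

```python
from collections import defaultdict
from statistics import mean


def summarize(executions):
    """Aggregate per-pipeline run counts, outcomes, and timings.

    Each execution is a dict with hypothetical keys:
    pipeline, status ("SUCCEEDED" or "FAILED"), duration_s, judgment_wait_s.
    """
    by_pipeline = defaultdict(list)
    for execution in executions:
        by_pipeline[execution["pipeline"]].append(execution)

    summary = {}
    for name, runs in by_pipeline.items():
        summary[name] = {
            "runs": len(runs),
            "succeeded": sum(r["status"] == "SUCCEEDED" for r in runs),
            "failed": sum(r["status"] == "FAILED" for r in runs),
            "avg_duration_s": mean(r["duration_s"] for r in runs),
            "judgment_wait_s": sum(r["judgment_wait_s"] for r in runs),
        }
    return summary


def slowest(summary):
    """Return the pipeline name with the highest average duration."""
    return max(summary, key=lambda name: summary[name]["avg_duration_s"])


if __name__ == "__main__":
    sample = [
        {"pipeline": "deploy-api", "status": "SUCCEEDED", "duration_s": 120, "judgment_wait_s": 30},
        {"pipeline": "deploy-api", "status": "FAILED", "duration_s": 180, "judgment_wait_s": 0},
        {"pipeline": "deploy-web", "status": "SUCCEEDED", "duration_s": 60, "judgment_wait_s": 10},
    ]
    report = summarize(sample)
    print(report)
    print("slowest:", slowest(report))
```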
Enhancement to Autopilot
Cluster Tagging for Improved Feedback Process During Risk Assessment
In the continuous verification process, testers, developers, or SREs perform a risk assessment of releases using our AI/ML tool called Autopilot. They provide feedback to the assessment process by categorizing logs, which improves the accuracy of the algorithms. To make future investigations easier, we have introduced cluster tagging: when an SRE picks an error or an issue for analysis, they can attach a tag (or comment) to the error logs. In future analyses, this tag is propagated like user feedback. If Autopilot finds similar errors in subsequent risk assessments, it attaches the same tag or comment. The cluster tagging feature provides a contextual understanding of logs in future risk assessments and saves investigation time.
- Save and propagate inferences from current risk assessments of logs through tagging.
- Provide a contextual understanding of issues to your future risk analysis and save time in the investigation.
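The propagation idea can be sketched in a few lines of Python. This is only an illustration: it uses plain string similarity from the standard library's `difflib` as a stand-in, which is not Autopilot's clustering algorithm, and the `TagStore` class and its 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher


class TagStore:
    """Remembers tagged log lines and reuses the tags for similar lines."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed similarity cut-off between 0 and 1
        self.tagged = []            # list of (log_line, tag) pairs

    def tag(self, log_line, tag):
        """Record feedback given by an SRE for one log line."""
        self.tagged.append((log_line, tag))

    def lookup(self, log_line):
        """Return the tag of the most similar known line, or None if none is close enough."""
        best_tag, best_ratio = None, 0.0
        for known, tag in self.tagged:
            ratio = SequenceMatcher(None, known, log_line).ratio()
            if ratio >= self.threshold and ratio > best_ratio:
                best_tag, best_ratio = tag, ratio
        return best_tag


if __name__ == "__main__":
    store = TagStore()
    store.tag("ERROR: connection to db-01 timed out after 30s", "known DB timeout")
    print(store.lookup("ERROR: connection to db-02 timed out after 30s"))
    print(store.lookup("INFO: request served in 12ms"))
```

A later assessment that meets a near-identical error gets the earlier tag back, while unrelated lines stay untagged.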
Enhancement to Spinnaker Lifecycle Management
If you are using open source Spinnaker, you would agree that plenty of features make Spinnaker a great CD tool, but maintenance can be a bit knotty. OpsMx has added features to our Lifecycle Manager to enhance Spinnaker's security, availability, and performance. For information on the initial release of OpsMx Lifecycle Manager (LCM), you can refer to the description of LCM features in OES 3.2. In our latest release, we have four exciting features in LCM.
Secure Spinnaker Service Communication using mTLS
Spinnaker is made up of many services (microservices). By default, the Spinnaker services communicate with each other over plain HTTP, with no security layer for communication. Since communication between Halyard and other services involves secrets and passwords, there is a high risk of secrets being leaked.
With this LCM feature, we switch service communication to mTLS (mutual Transport Layer Security) to enhance security. mTLS encrypts all data in transit and ensures that only authenticated clients can interact with a service. For more, refer to How to set up mutual TLS authentication for Spinnaker in 7 easy steps.
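For readers new to mTLS, the sketch below shows what "mutual" means, using Python's standard `ssl` module. It is a generic illustration and not how Spinnaker services or LCM are configured; the function names and the optional file path parameters are assumptions.

```python
import ssl


def make_mtls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Server side: present our certificate AND require one from every client."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients that send no valid certificate
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # CA that signed the client certificates
    return ctx


def make_mtls_client_context(cert_file=None, key_file=None, ca_file=None):
    """Client side: verify the server and present our own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


if __name__ == "__main__":
    server_ctx = make_mtls_server_context()
    client_ctx = make_mtls_client_context()
    print(server_ctx.verify_mode, client_ctx.verify_mode)
```

With ordinary TLS only the client verifies the server. Setting `CERT_REQUIRED` on the server side is what adds the second direction of authentication.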
Enablement of X.509 Authentication for Webhooks
mTLS can handle service-to-service communication, but how do you ensure that all the communication (webhook calls) from humans or other technologies in your organization to Spinnaker services is authenticated? Automating authentication against the Spinnaker API is difficult. In LCM 1.2, we have provided a way to support authentication based on X.509 (public-key) certificates. You can learn more in Authentication of Spinnaker Services using an x509 client certificate.
- Get production-grade security with enhanced authentication and secure communication of Spinnaker services.
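As a rough sketch of what an automated, certificate-authenticated webhook call could look like, the Python below builds (but does not send) a POST request and a TLS context carrying a client certificate. The host name, port 8085, the `/webhooks/webhook/<source>` path, the `release` source name, and the certificate file names are assumptions for illustration; check your own Spinnaker Gate configuration for the real values.

```python
import json
import ssl
import urllib.request


def build_webhook_call(gate_url, source, payload, cert_file=None, key_file=None):
    """Prepare a webhook POST and a TLS context that presents an X.509 client certificate."""
    ctx = ssl.create_default_context()
    if cert_file:
        # The client certificate identifies the caller instead of a password or session.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    request = urllib.request.Request(
        url=f"{gate_url}/webhooks/webhook/{source}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return request, ctx


if __name__ == "__main__":
    req, tls_ctx = build_webhook_call(
        "https://gate.example.com:8085",  # hypothetical Gate endpoint
        "release",                        # hypothetical webhook source name
        {"version": "1.4.2"},
    )
    print(req.get_method(), req.full_url)
    # To actually send it (requires a reachable endpoint and real certificate files):
    # urllib.request.urlopen(req, context=tls_ctx)
```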
Enablement of Prometheus for Monitoring of Spinnaker Services
Nobody likes service interruptions. Developers never enjoy having their deployments disrupted, and SREs cannot stay calm when Spinnaker services take time to restore. For smooth Spinnaker operation, we provide Prometheus-based monitoring and Alertmanager-based notifications by default. Prometheus is configured to raise timely alerts when there is an anomaly in the tech stack supporting Spinnaker. For example, if cache usage is too high, latency is above its threshold, or memory utilization is abnormal, Prometheus alerts all stakeholders so they can take corrective measures.
The screenshot below is an example of Prometheus notifying our SREs through Slack that CPU utilization is higher than 80% on a node. In such cases, our SRE teams can act quickly, before developers start complaining about service deterioration. Cool, isn't it?
- Track metrics and be more proactive in resolving issues with Spinnaker instances and the underlying tech stack.
- Avoid more than 90% of unplanned service outages and increase the availability of Spinnaker.
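An alert like the CPU example above is typically expressed as a Prometheus alerting rule. The Python sketch below assembles such a rule as a plain dictionary that mirrors the layout of a Prometheus rule file (which is normally written in YAML). It is not the rule shipped with LCM: the group name, alert name, severity label, and the use of the node_exporter metric `node_cpu_seconds_total` are assumptions.

```python
import json


def cpu_alert_rule(threshold_pct=80, duration="5m"):
    """Build an alerting rule that fires when node CPU utilization stays above a threshold."""
    # Utilization = 100 minus the idle percentage (assumes node_exporter metrics are scraped).
    expr = (
        '100 - (avg by (instance) '
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) '
        f'> {threshold_pct}'
    )
    return {
        "groups": [{
            "name": "spinnaker-node-alerts",        # hypothetical group name
            "rules": [{
                "alert": "HighNodeCpuUtilization",  # hypothetical alert name
                "expr": expr,
                "for": duration,                    # condition must hold this long before firing
                "labels": {"severity": "warning"},
                "annotations": {
                    "summary": "CPU utilization above %d%% on {{ $labels.instance }}" % threshold_pct,
                },
            }],
        }]
    }


if __name__ == "__main__":
    print(json.dumps(cpu_alert_rule(), indent=2))
```

Routing the fired alert to Slack is then a matter of Alertmanager receiver configuration, which is outside this sketch.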
Smooth Promotion of Pipelines to Production
Applications in Spinnaker often require a GitOps model for SOC compliance, particularly applications delivered to production. This may also be implemented as a separate Spinnaker instance, for example a Production Spinnaker. All changes to Production Spinnaker, including deployment configuration and pipeline configuration, are promoted based on Git version control.
Spinnaker pipelines are JSON structures that can be stored in Git and applied to Spinnaker. However, pipeline definitions are large and fairly complex, which makes editing them by hand difficult and error-prone.
With this LCM feature, you can edit a pipeline in the Spinnaker GUI and save it in Git, then realize the pipeline with the appropriate parameters in different environments. Starting with OES 3.3, a pipeline can also be stored in a central repository (Vault, S3, or Git) and retrieved whenever needed to create a new pipeline for production environments with that environment's secrets.
- Avoid manual and erroneous promotion of pipelines from staging to production.
- Use a GitOps-style mechanism and sound secret management to store, create, and modify pipelines quickly in your chosen environment.
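The core of promotion is taking one stored pipeline definition and filling in environment-specific values. The Python sketch below illustrates that step only; the `@@NAME@@` placeholder convention, the `promote_pipeline` function, and the sample stage fields are assumptions and not the format LCM uses.

```python
import json


def promote_pipeline(pipeline, env_params):
    """Return a copy of the pipeline with @@NAME@@ placeholders replaced per environment."""
    def substitute(value):
        if isinstance(value, dict):
            return {key: substitute(item) for key, item in value.items()}
        if isinstance(value, list):
            return [substitute(item) for item in value]
        if isinstance(value, str):
            for name, replacement in env_params.items():
                value = value.replace("@@" + name + "@@", replacement)
        return value

    # substitute() builds new containers, so the stored template is left untouched.
    return substitute(pipeline)


if __name__ == "__main__":
    template = {
        "name": "deploy-api",
        "stages": [
            {"type": "deployManifest", "account": "@@ACCOUNT@@", "namespaceOverride": "@@NAMESPACE@@"},
        ],
    }
    prod = promote_pipeline(template, {"ACCOUNT": "prod-k8s", "NAMESPACE": "payments"})
    print(json.dumps(prod, indent=2))
```

Keeping the template free of real account names and secrets means the same file can be versioned in Git and rendered separately for staging and production.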
If you want to know more about this release, please reach out to us at [email protected].
OpsMx is a leading provider of Continuous Delivery solutions that help enterprises safely deliver software at scale and without any human intervention. We help engineering teams take the risk and manual effort out of releasing innovations at the speed of modern business. For additional information, contact us.