Deploying complex applications at scale and at high velocity increases failure rates. Long pipelines and scattered microservices are harder to monitor, and small mistakes can cost the organization considerable time.
This blog aims to simplify the automated verification of deployments using Argo Rollouts and OpsMx’s Automated Verification System, which integrates with many popular monitoring services such as New Relic and ElasticSearch. Together, these components minimize risk and help you feel more confident in shipping complex applications to Kubernetes.
What is Argo?
Argo is a collection of open-source tools for Kubernetes to run workflows, manage clusters, and do GitOps right. It includes tools such as Argo Workflows, Argo CD, and Argo Rollouts, which can be used standalone or together, depending on the needs of the deployment process.
Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes. Argo Rollouts is a Kubernetes controller and set of CRDs which provide advanced deployment capabilities such as blue-green, canary, canary analysis, experimentation, and progressive delivery features to Kubernetes.
What is OpsMx Metric Provider?
OpsMx Metric Provider is a proposed extension built on top of Argo Rollouts. In terms of configuration, manifest creation, and execution, it behaves just like any other metric provider. It communicates with ISD (OpsMx’s Intelligent Software Delivery Platform) to execute the analysis flow. It differs from other providers in the following ways:
- It is capable of running analysis based on a large number of metrics and logs collected from various monitoring data providers.
- It returns the results in the form of a score which is an overall evaluation of the monitoring data collected.
- The metric provider communicates with OpsMx’s SaaS or on-prem services to generate the results.
- The metric provider is backed by a large set of data providers, such as New Relic, Dynatrace, AppDynamics, ElasticSearch, and Graylog, which can be combined to produce one overall result, making it easier to judge the quality of a deployment.
Together, these capabilities reduce the effort of configuring individual metrics and complicated success conditions in an Argo Analysis Template when many evaluation parameters are involved.
Application and ISD Template Setup
For this blog, we’ll use a basic test application named ‘issuegen’, which simulates a range of transactional conditions, from normal to erroneous.
Register Data Providers in ISD
ISD supports a long list of data providers. The first step is to register the endpoints from which the data to be evaluated will be collected during analysis. For example, ElasticSearch is registered for log data collection as shown below:
Similarly, multiple data providers can be configured to participate in a single evaluation.
We will use ElasticSearch for Log Data Collection and Prometheus for Metric Data Collection in this example.
Configure Application and Verification Gate in ISD
ISD models deployments hierarchically so that multiple services associated with an application can be verified. Each application service is associated with a verification gate, backed by generalized metric and log collection templates that intelligently collect data from the data sources based on filters supplied by the user.
In addition to the out-of-the-box generalized templates for data collection, custom templates with custom thresholds can be created to extend the product.
The above figure shows an application with a single service and the service attached to a verification gate. Upon selecting the gate, templates can be created and selected for analysis. The following example shows a log template structure which lets users select the data source and filter keys to get the required data.
Similar to the above scenario, metric templates for data providers like Prometheus can be configured to either collect a pre-defined list of metrics or user-defined metrics in case of custom templates.
The filter keys in the templates can hold arbitrary placeholder values to begin with; they can be overridden when defining the Analysis Template in Argo Rollouts.
Set Up a Rollout
Structure of Rollout
This example creates a canary deployment that follows progressive delivery, with an analysis run after each progression step. The service is exposed using an ingress.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: oes-argo-rollout
spec:
  replicas: 4
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: testapp-rollout
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus_io_path: '/mgmt/prometheus'
        prometheus_io_port: '8088'
      labels:
        app: testapp-rollout
    spec:
      containers:
        - name: rollouts-baseline
          image: docker.io/opsmxdev/issuegen:v3.0.1
          imagePullPolicy: Always
          ports:
            - containerPort: 8088
  strategy:
    canary:
      steps:
        - setWeight: 25
        - pause: { duration: 60s }
        - analysis:
            templates:
              - templateName: oes-analysis-for-canary
            args:
              - name: canary-hash
                valueFrom:
                  podTemplateHashValue: Latest
              - name: baseline-hash
                valueFrom:
                  podTemplateHashValue: Stable
        - setWeight: 50
        - pause: { duration: 60s }
        - analysis:
            templates:
              - templateName: oes-analysis-for-canary
            args:
              - name: canary-hash
                valueFrom:
                  podTemplateHashValue: Latest
              - name: baseline-hash
                valueFrom:
                  podTemplateHashValue: Stable
        - setWeight: 100
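To get a feel for what the three setWeight steps mean for a Rollout with 4 replicas, the following Python sketch estimates how many pods run the canary and the stable version at each step. It assumes that no traffic router is configured, so that weights are approximated by pod counts, and that the canary count is rounded up; both are simplifying assumptions, and the real controller may also take other settings into account. The names `replica_split`, `REPLICAS`, and `STEP_WEIGHTS` are hypothetical.

```python
import math

# Values copied from the Rollout manifest above.
REPLICAS = 4
STEP_WEIGHTS = [25, 50, 100]


def replica_split(replicas: int, canary_weight: int) -> tuple[int, int]:
    """Return (canary_pods, stable_pods) for a given canary weight in percent.

    Assumption: the canary share is rounded up to a whole pod.
    """
    canary = math.ceil(replicas * canary_weight / 100)
    return canary, replicas - canary


# One (weight, canary_pods, stable_pods) entry per setWeight step.
plan = [(weight, *replica_split(REPLICAS, weight)) for weight in STEP_WEIGHTS]

for weight, canary, stable in plan:
    print(f"setWeight {weight}: {canary} canary pod(s), {stable} stable pod(s)")
```

With the values from this example the split is exact at every step, so the rounding assumption does not affect the result.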
The analysis template is configured as given below:
kind: AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
metadata:
  name: oes-analysis-for-canary
spec:
  args:
    - name: canary-hash
    - name: baseline-hash
  metrics:
    - name: oes-analysis-for-canary
      count: 1
      initialDelay: 30s
      provider:
        opsmx:
          gateUrl: https://ds312.isd-dev.opsmx.net/
          application: demoappforcanary
          user: admin
          lifetimeHours: "0.1"
          threshold:
            pass: 80
            marginal: 60
          services:
            - serviceName: demoservice
              gateName: demo-gate
              logScopeVariables: "kubernetes.pod_name"
              baselineLogScope: "oes-argo-rollout-{{args.baseline-hash}}.*"
              canaryLogScope: "oes-argo-rollout-{{args.canary-hash}}.*"
              metricScopeVariables: "namespace_key,pod_key"
              baselineMetricScope: "argocd,oes-argo-rollout-{{args.baseline-hash}}.*"
              canaryMetricScope: "argocd,oes-argo-rollout-{{args.canary-hash}}.*"
Structure of Analysis Template with OpsMx Provider
The provider ‘OpsMx’ has a few mandatory arguments like application, gateUrl, user and threshold, which control analysis execution. The rest of the arguments provide the ability to customize at runtime.
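To make the role of threshold concrete, here is a minimal Python sketch of how a score could be judged against the pass and marginal values used in this example. The function name `classify_score` and the three outcome labels are hypothetical, and the treatment of scores between marginal and pass is an assumption; this post only states that a score below the passing score leads to a rollback.

```python
def classify_score(score: float, pass_threshold: float = 80,
                   marginal_threshold: float = 60) -> str:
    """Classify an analysis score against the two thresholds.

    Defaults mirror `threshold.pass: 80` and `threshold.marginal: 60`
    from the example AnalysisTemplate. The labels are illustrative only.
    """
    if marginal_threshold > pass_threshold:
        raise ValueError("marginal threshold must not exceed pass threshold")
    if score >= pass_threshold:
        return "pass"
    if score >= marginal_threshold:
        return "marginal"
    return "fail"


for score in (92, 70, 45):
    print(score, classify_score(score))
```

A single overall score keeps the success condition in the Analysis Template down to two numbers, regardless of how many metrics and logs feed into it.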
The lifetimeHours field sets the duration of the analysis. For example, if lifetimeHours is “0.1”, the analysis starts at the time of execution and lasts for 6 minutes (0.1 × 60 minutes).
The services section can carry a list of services to be analyzed for logs, metrics, or both. This section is optional: if it is omitted, all the services configured in ISD under the application are analyzed. When provided, it overrides the service and filter configurations defined in ISD.
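The arithmetic behind lifetimeHours can be checked with a few lines of Python. This is only an illustration of the unit conversion: the helper `analysis_window` is hypothetical, and the start time is an arbitrary value chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def analysis_window(start: datetime, lifetime_hours: str) -> tuple[datetime, datetime]:
    """Return the (start, end) of an analysis lasting `lifetime_hours` hours.

    The field is given as a string in the manifest, so it is parsed first.
    """
    duration = timedelta(hours=float(lifetime_hours))
    return start, start + duration


# Arbitrary start time, used only for illustration.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
begin, end = analysis_window(start, "0.1")
minutes = (end - begin).total_seconds() / 60
print(f"analysis runs from {begin:%H:%M} to {end:%H:%M} ({minutes:g} minutes)")
```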
Log filters are handled by logScopeVariables, baselineLogScope and canaryLogScope.
Similarly, metric filters are handled by metricScopeVariables, baselineMetricScope and canaryMetricScope.
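The scope values contain {{args.…}} placeholders that are filled in with the pod template hashes passed from the Rollout. The Python sketch below shows the idea: it substitutes the placeholders and then uses the result to select pod names. The hashes and pod names are made up, the helpers `render_scope` and `matching_pods` are hypothetical, and treating the rendered scope as a regular expression over pod names is an assumption based on the trailing `.*`.

```python
import re


def render_scope(template: str, args: dict[str, str]) -> str:
    """Replace each {{args.<name>}} placeholder with its value from `args`."""
    return re.sub(
        r"\{\{\s*args\.([\w-]+)\s*\}\}",
        lambda match: args[match.group(1)],
        template,
    )


def matching_pods(scope: str, pod_names: list[str]) -> list[str]:
    """Return the pod names that fully match the rendered scope pattern."""
    pattern = re.compile(scope)
    return [name for name in pod_names if pattern.fullmatch(name)]


# Hypothetical hashes and pod names, for illustration only.
args = {"baseline-hash": "7d9c5b6f4", "canary-hash": "5f8b7c9d6"}
pods = [
    "oes-argo-rollout-7d9c5b6f4-abcde",
    "oes-argo-rollout-5f8b7c9d6-fghij",
    "oes-argo-rollout-5f8b7c9d6-klmno",
]

baseline_scope = render_scope("oes-argo-rollout-{{args.baseline-hash}}.*", args)
canary_scope = render_scope("oes-argo-rollout-{{args.canary-hash}}.*", args)

print("baseline:", matching_pods(baseline_scope, pods))
print("canary:  ", matching_pods(canary_scope, pods))
```

Because the hash is part of each pod name, the same template separates baseline and canary data without any manual filter changes between rollouts.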
Deployment and Analysis
Create an application in Argo CD
If Argo Rollouts is integrated with Argo CD, the following procedure can be followed. The metric provider works the same way whether the rollout is invoked from Argo CD or from the Argo Rollouts CLI.
Before the canary phase begins, the application looks as shown below.
To initiate the rollout, the image of the application is changed to a later version.
Version 1
spec:
  containers:
    - name: rollouts-baseline
      image: docker.io/opsmxdev/issuegen:v3.0.1
      imagePullPolicy: Always
Version 2
spec:
  containers:
    - name: rollouts-baseline
      image: docker.io/opsmxdev/issuegen:v3.0.2
      imagePullPolicy: Always
Upon updating the image in the Rollout manifest, the application status becomes ‘OutOfSync’.
Triggering Rollout
Triggering the sync operation will initiate the rollout strategy steps in a progressive manner.
As the analysisRun begins, the Report Url is published in analysisRun Details.
For each analysis run, a report in the following format is published in ISD and can be accessed using the Report Url link.
Rollback
In the following case, the analysis fails and the health of the application is marked as Degraded.
Upon clicking the reportUrl, the ISD analysis report can be seen. As shown in the report, some unexpected critical error logs were recorded during the analysis. In addition, one of the health metrics failed, bringing the score below the passing score. Thus, the rollback is initiated.
The report contains further details on the logs, such as a time analysis of a specific error and its frequency of occurrence. Logs can also be reclassified to improve the machine learning techniques running behind the scenes.
Conclusion
Combining the Argo Project with OpsMx’s Intelligent Software Delivery Platform (ISD) results in a strong and capable evaluation of deployments. OpsMx ISD for Argo allows developers and Ops to deploy into Kubernetes clusters at scale while efficiently operating and managing the lifecycle of Argo. The platform delivers the essential services (visibility and control, delivery intelligence, deployment dashboard, and audit) to ensure day-2 operations in software delivery have the consistency and reliability you expect at any scale.
OpsMx also provides an Argo Center of Excellence (COE) for enterprises needing expert services or consultation on Argo CD and Argo Rollouts implementation to expedite their GitOps journey.
About OpsMx
Founded with the vision of “delivering software without human intervention,” OpsMx enables customers to transform and automate their software delivery processes. OpsMx builds on open-source Spinnaker and Argo with services and software that help DevOps teams SHIP BETTER SOFTWARE FASTER.