
Argo Rollouts addresses a common risk-management problem in production deployments: it limits the blast radius of a new change and recovers automatically by rolling back on failure. Argo Rollouts supports multiple strategies; this post discusses their configuration and the factors to consider when choosing one strategy over another.

Argo Rollouts can be used together with ArgoCD or independently of it. Most Kubernetes users deploy with Deployment manifests, packaged with Helm templating or Kustomize. Argo Rollouts can be layered on top of that existing packaging and those Deployment manifests.

The scope of this discussion is limited to deployment strategies for a single cluster. We will discuss deployment across multiple clusters and rollouts in multi-service environments in a separate blog.

Rollout Configuration

ArgoCD is designed to synchronize the configuration in Git with the deployment configuration in Kubernetes. Progressive delivery additionally requires a workflow for moving from one revision to another, with analysis at each step to decide whether to move forward with the new revision or roll back. Argo Rollouts provides this workflow through declarative configuration while disabling the ArgoCD synchronization during the rollout workflow.

A Rollout can be applied to a single Deployment object or ReplicaSet in an application. Most existing packaging for Kubernetes applications takes the form of Deployment manifests, which would have to be changed to use the ReplicaSet-style approach. We typically see that the packaging team is different from the SRE team that decides on the deployment strategy for its environment, so two teams would end up maintaining the same manifests. It is simpler to keep the Deployment manifests as they are and reference them from a Rollout that carries the strategy. This also works well with templating tools.

Rollout is a custom resource, defined by a Custom Resource Definition installed in the Kubernetes cluster. A Rollout spec for an existing application can simply reference an existing Deployment and specify the strategy to apply. In the following configuration, the Rollout refers to the existing Deployment "rollout-ref-deployment" and applies a rollout strategy that replaces the strategy in the Deployment manifest.

# Rollout that reuses the pod template of an existing Deployment
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rollout-ref-deployment

Figure 1: Rollout Deployment configuration reference
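For context, a fuller version of such a reference might look like the sketch below. It is illustrative rather than a definitive configuration: the Rollout name, the replica count, the `app: restapp` label, and the canary steps are assumptions, not values from a real application. Note that the referenced Deployment is not scaled down automatically by the reference alone; a common migration approach is to set the Deployment's `replicas` to 0 once the Rollout's pods are serving.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-ref-deployment      # hypothetical name for the Rollout itself
spec:
  replicas: 3                       # assumed replica count managed by the Rollout
  selector:
    matchLabels:
      app: restapp                  # assumed label; must match the Deployment's pod labels
  workloadRef:                      # reuse the pod template of the existing Deployment
    apiVersion: apps/v1
    kind: Deployment
    name: rollout-ref-deployment
  strategy:                         # replaces the strategy from the Deployment manifest
    canary:
      steps:
      - setWeight: 20               # illustrative step values
      - pause: {}
```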

A single Rollout spanning multiple manifests in an application is not supported at this time. Even when deploying multiple services, each Rollout applies to one manifest spec, and rollback across services has to be coordinated by the user. We will discuss these strategies in another blog coming soon.

How does Argo Rollout work?

The Kubernetes Deployment object supports two strategies for applying a new revision in the application namespace: Recreate, which replaces all pods at once, and RollingUpdate. The success of the rollout is determined by the pods' readiness and liveness probes.
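As a point of comparison, the built-in behavior can be sketched with a plain Deployment manifest. The image, labels, port, and probe path below are hypothetical placeholders; only the `strategy` and probe fields matter for the discussion.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rollout-ref-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: restapp                       # hypothetical label
  strategy:
    type: RollingUpdate                  # the other built-in option is Recreate
    rollingUpdate:
      maxSurge: 1                        # at most one extra pod during the update
      maxUnavailable: 0                  # never drop below the desired replica count
  template:
    metadata:
      labels:
        app: restapp
    spec:
      containers:
      - name: restapp
        image: example.com/restapp:1.0.0 # hypothetical image
        readinessProbe:                  # gates traffic and the progress of the update
          httpGet:
            path: /healthz               # hypothetical endpoint
            port: 8080
        livenessProbe:                   # restarts the container if it stops responding
          httpGet:
            path: /healthz
            port: 8080
```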

Argo Rollouts works in conjunction with a Kubernetes Deployment or ReplicaSet and adds blue-green and canary strategies for rolling out new revisions. In addition, the success of each rollout step can be verified with external signals on top of the readiness and liveness probes.

Beyond the blue-green and canary strategies themselves, the Rollout spec supports incremental traffic shaping through ingress controllers, or changing the number of new-revision instances, to manage the risk of deploying a new revision. Analysis of the new revision can be manual or automated, and a failed verification can trigger an automated rollback.

Argo Rollouts works either by adding a rollout strategy to a ReplicaSet-style spec or by replacing the strategy of a Deployment spec with a Rollout strategy. In the first case, the Rollout manifest itself specifies the rollout strategy. In the second case, the Rollout object references the Deployment manifest and supplies a replacement strategy: the Rollout uses the rest of the Deployment spec but ignores its strategy section in favor of the one in the Rollout spec. The following configuration shows the key fields for the blue-green and canary strategies.

kind: Rollout
spec:
  strategy:
    # Blue-green update strategy
    blueGreen:
      # Service that the rollout modifies as the active service
      activeService: active-service
      # Service that the rollout modifies as the preview service
      previewService: preview-service
    # Canary update strategy (a Rollout sets either blueGreen or canary, not both)
    canary:
      canaryService: canary-service
      stableService: stable-service
      steps:
      # Sets the canary weight to 30%
      - setWeight: 30
      # Pauses indefinitely until manually resumed
      - pause: {}

Figure 2: Rollout Strategy specification

When to use Blue/Green

The blue/green strategy in Argo Rollouts provides options for controlling the rollout steps. At a basic level, it implements the following steps:

  1. Bring up a ReplicaSet with the new revision (optionally with fewer pods than the active revision)
  2. After the instances are healthy, optionally update the preview service selector with the rollouts-pod-template-hash of the new revision so that the preview service routes traffic to it
  3. Wait for user input or autoPromotionSeconds, then update the active service selector to the rollouts-pod-template-hash of the new revision, replacing the selector for the old revision. If the number of pods from step 1 is smaller than that of the active revision, the new revision is first scaled up to match the active pod count
  4. Scale down the old revision pods after scaleDownDelaySeconds

Figure 3: Blue Green traffic routing

Blue-green takes more resources, as two revisions run at the same time. In exchange, it shortens rollback: when pod startup time is high, a rollback that has to restart the old revision's pods is slow, whereas with blue/green the old revision keeps running until scaleDownDelaySeconds expires, so rolling back only requires pointing the active service back at it. Additionally, if the rollout needs testing in production without production traffic, the blue/green strategy exposes the new revision for testing while production traffic continues to use the existing active revision.
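The blue/green steps listed earlier map onto fields of the `blueGreen` strategy roughly as in the sketch below. The numeric values are illustrative assumptions and should be tuned to the application's startup time and rollback window.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    blueGreen:
      activeService: active-service
      previewService: preview-service   # step 2: preview traffic goes to the new revision
      previewReplicaCount: 1            # step 1: start the new revision with fewer pods
      autoPromotionEnabled: false       # step 3: wait for a manual promotion
      # autoPromotionSeconds: 600       # step 3 alternative: promote automatically after a delay
      scaleDownDelaySeconds: 300        # step 4: keep the old revision around for fast rollback
```

A longer `scaleDownDelaySeconds` keeps the fast rollback path open for longer, at the cost of running both revisions for that time.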

Additionally, if an application has multiple dependent pods, the deployment may also include changes to dependent components. Typically, revisions are expected to keep their interfaces forward and backward compatible. In this case, blue/green lets one verify the functionality in production without exposing production traffic to the new revision.

When to use Canary

The canary strategy works by creating pods for the new revision and routing part of the production traffic to them alongside the current revision. This exposes the new revision to a subset of requests for analysis before it is promoted to full production. Argo Rollouts supports traffic shaping with Istio, which allows the number of replicas to change independently of the traffic routed to each revision. Figure 4 shows the concept of updating the number of replicas for canary instances. Increasing the number of canary instances to 3 and reducing the number of instances of the active revision to 0 completes the canary process.

Figure 4: Canary deployment strategy with Argo Rollout

When there is significant traffic in production, a canary reduces risk by limiting the blast radius. The canary strategy is also a good fit when there are many pods in production and running a full second copy for blue/green would add significant cost. By sending part of the traffic to the new revision, its configuration can be verified against real production traffic. Once verified, the new revision can take over all production traffic as it is scaled up and the old revision is scaled down.
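A canary with Istio traffic shaping might be configured along the lines of the sketch below. The VirtualService name, route name, weights, and pause duration are assumptions for illustration; `setCanaryScale` is used here to show the canary replica count being set independently of the traffic weight.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      canaryService: canary-service
      stableService: stable-service
      trafficRouting:
        istio:
          virtualService:
            name: restapp-vsvc      # hypothetical VirtualService managed for this app
            routes:
            - primary               # hypothetical HTTP route name inside that VirtualService
      steps:
      - setCanaryScale:
          replicas: 3               # canary pod count, decoupled from the traffic weight
      - setWeight: 10               # send 10% of the traffic to the canary
      - pause: {duration: 10m}      # observe for a fixed time
      - setWeight: 50
      - pause: {}                   # wait for manual promotion before full rollout
```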

Older revisions are cleaned up based on the revisionHistoryLimit configured in the deployment specification.

Using temporary labels to simplify monitoring and analysis

Argo Rollouts adds the rollouts-pod-template-hash label to the pods for use in service selectors. This label can also be used in monitoring queries to analyze the health of the deployed revisions. However, the hash values are only known after deployment, so additional introspection is needed to identify them. Argo Rollouts supports adding temporary labels to the revisions, which simplifies monitoring dashboards and can be used in automated analysis during the rollout process.

spec:
  strategy:
    blueGreen:
      # Labels applied to pods of the active revision
      activeMetadata:
        labels:
          role: active
      # Labels applied to pods of the preview revision
      previewMetadata:
        labels:
          role: preview

Figure 5: Specifying template for automated analysis

Using these labels is highly recommended, both to simplify monitoring dashboards and to support automated analysis during deployments.
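The same idea applies to the canary strategy, where the corresponding fields are `stableMetadata` and `canaryMetadata`. The sketch below mirrors the blue/green example; the `role` label key and its values are illustrative choices, not required names.

```yaml
spec:
  strategy:
    canary:
      # Labels applied to pods of the stable revision
      stableMetadata:
        labels:
          role: stable
      # Labels applied to pods of the canary revision while the rollout is in progress
      canaryMetadata:
        labels:
          role: canary
```

With labels like these, a dashboard or analysis query can filter on a fixed value such as `role: canary` instead of looking up the pod template hash for each rollout.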

Automated Verification

Argo Rollouts steps can use automated verification of new revisions to decide whether to progress the deployment. A detailed description of how to use automated verification can be found at Automated verification.

kind: Rollout
spec:
  strategy:
    canary:
      canaryService: canary-service
      stableService: stable-service
      analysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: restapp-service
        # Pod template hash of the stable ReplicaSet
        - name: stable-hash
          valueFrom:
            podTemplateHashValue: Stable
        # Pod template hash of the new (canary) ReplicaSet; the accepted values are Stable and Latest
        - name: canary-hash
          valueFrom:
            podTemplateHashValue: Latest
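The `success-rate` template referenced by the Rollout is defined separately as an AnalysisTemplate. The sketch below shows one possible shape, assuming a Prometheus provider. The Prometheus address, the `http_requests_total` metric, its label names, and the 95% threshold are all hypothetical and depend on how the application is instrumented and how pod labels are relabeled in Prometheus.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:                              # filled in by the Rollout's analysis args
  - name: service-name
  - name: stable-hash                # declared so the Rollout can pass it; unused in this sketch
  - name: canary-hash
  metrics:
  - name: success-rate
    interval: 1m                     # take a measurement every minute
    failureLimit: 3                  # tolerate up to three failed measurements
    successCondition: result[0] >= 0.95
    provider:
      prometheus:
        address: http://prometheus.example.com:9090   # hypothetical address
        # Hypothetical query: share of non-5xx requests served by the canary pods
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}",rollouts_pod_template_hash="{{args.canary-hash}}",status!~"5.."}[5m]))
          /
          sum(rate(http_requests_total{service="{{args.service-name}}",rollouts_pod_template_hash="{{args.canary-hash}}"}[5m]))
```

When the analysis fails, the Rollout aborts and traffic returns to the stable revision, which is the automated rollback behavior described earlier.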


Conclusion

Argo Rollouts provides powerful deployment strategies that are easy to adopt for application rollouts. Multi-service and dependent-service deployments can be addressed with best practices and dependency management strategies.
