Automated Canary Deployment with Post-Deployment Verification on GCP CloudRun using Google Cloud Deploy for continuous delivery

Varsha Kashyap
Published in Searce
11 min read · May 3, 2023

Hello everyone 👋, in our previous blog post we discussed the Google Cloud Deploy service and provided a guide on configuring multiple targets in a pipeline for Cloud Run services.

In this post, we will explore the Canary Deployment strategies offered by Google Cloud Deploy, including their strengths, limitations, and optimal use cases. Our aim is to showcase how these strategies can be utilised for environment-based testing (Dev/Prod) and Canary deployment patterns, while also presenting a comprehensive framework for multi-stage Canary deployment on the Google Cloud Platform.

What is Canary Deployment?

A canary deployment gives you a chance to partially release your application, so you can ensure the new version is reliable before you deliver it to all users. In other words, a canary deployment is a progressive rollout that splits traffic between an already-deployed version and a new version, releasing the new version to a subset of users before rolling it out fully.

Delivery Pipeline with Canary Deployment Progress

Canary Release

This deployment strategy is widely used because of the range of advantages it offers developers. The following sections highlight its benefits and limitations.

Best Use Cases

For companies that release major updates at specific intervals, having a testing platform that can handle a variety of tests in a production environment is crucial. This enables developers to identify and resolve any issues that may negatively impact users before the updates are released. Canary deployment is a highly effective strategy for frequent rollouts, as it does not require costly infrastructure and can be implemented seamlessly. It is also ideal for testing new features before deciding whether to deploy them to all users. The ability to target a subset of users based on specific criteria allows for the collection of more meaningful data, leading to deeper insights.

Deployment strategy in Google Cloud Deploy

Google Cloud Deploy supports the following deployment strategies:

  1. Standard deployment
    The standard deployment strategy involves deploying an application to one or more target runtimes without any progressive rollout or splitting between the old and new versions of the application. This deployment approach offers the ability to perform rollbacks, verify the deployment, and deploy to multiple targets simultaneously.
  2. Canary deployment (preview)
    Canary deployment is a gradual rollout strategy where an application is first deployed to a portion of the infrastructure, allowing for thorough testing before proceeding with the wider rollout. For instance, a 50% canary deployment to Cloud Run directs half of the traffic to the new revision and the other half to the old revision. Upon successful testing, the rollout progresses to 100%. Google Cloud Deploy lets you specify any progression percentage as long as it is a whole number: fractional percentages such as 20.5% are not supported.
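To make the percentage rule concrete, here is a minimal Python sketch (an illustration only, not Cloud Deploy code). The function names validate_percentages and traffic_split are hypothetical, and the extra checks (values strictly between 0 and 100, listed in ascending order) are assumptions of this sketch rather than rules stated in this post.

```python
def validate_percentages(percentages):
    """Return the canary increments as a list if they look valid, else raise ValueError.

    Assumptions of this sketch: each value is a whole number strictly
    between 0 and 100, and the values are in ascending order.
    """
    previous = 0
    for value in percentages:
        # Reject fractional values such as 20.5 (and booleans, which are ints in Python).
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"percentage must be a whole number, got {value!r}")
        if not 0 < value < 100:
            raise ValueError(f"percentage must be between 1 and 99, got {value}")
        if value <= previous:
            raise ValueError("percentages must be in ascending order")
        previous = value
    return list(percentages)


def traffic_split(canary_percent):
    """Describe how traffic is shared between revisions at one canary step."""
    validate_percentages([canary_percent])
    return {
        "new_revision": canary_percent,
        "old_revision": 100 - canary_percent,
    }


# A 50% canary sends half of the traffic to each revision.
print(traffic_split(50))
```

Calling traffic_split(20.5) raises a ValueError in this sketch, which mirrors the restriction on fractional percentages.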

Both the standard deployment strategy and the canary deployment strategy are compatible with all runtime environments supported by Google Cloud Deploy. Additionally, both deployment strategies support rollbacks, canceling rollouts, and parallel deployment, which involves deploying to multiple targets at the same time.

  • To implement the above-mentioned strategy alongside a standard deployment strategy, we can deploy the new code to the canary group first, in an incremental fashion.
  • By using a combination of standard and canary deployment strategies (as shown in the file below), we can ensure that our applications are updated and improved in a controlled and efficient manner, while minimizing the risk of potential issues. It is important to choose the appropriate deployment strategy for each situation, depending on factors such as the size and complexity of the application, the level of risk involved, and the desired level of control and flexibility.

NOTE: When the deployment strategy includes the parameter verify: true, the post-deployment verify step described below is executed. If the parameter is absent or set to false, post-deployment verification is skipped.

Google Cloud Deploy Post Deployment Verification

Google Cloud Deploy is introducing a new feature called deployment verification. With this feature, developers and operators can orchestrate and execute post-deployment testing without having to undertake a more extensive testing integration.

Deployment verification relies on a new Skaffold phase named ‘verify’. This phase allows developers and operators to add a list of test containers that are run after deployment and monitored for success or failure.

We have added the new verify phase to our skaffold.yaml.

Post-deployment verification can be configured differently for different targets by using Skaffold profiles. In our case, we have configured both targets (dev-env and prod-env) to have deployment verification.
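The success/failure rule of the verify phase can be sketched in a few lines of Python. This is only a local analogy: real verification runs test containers through Skaffold, whereas this sketch runs local commands and treats exit code 0 as a pass. The function name run_verify and the test names are hypothetical.

```python
import subprocess
import sys


def run_verify(tests):
    """Run each test command and report overall success.

    `tests` maps a test name to an argv list. Each command stands in for
    one test container of the verify phase (an assumption of this sketch).
    """
    results = {}
    for name, argv in tests.items():
        completed = subprocess.run(argv, capture_output=True)
        # A test passes only when its process exits with code 0.
        results[name] = completed.returncode == 0
    # Verification succeeds only if every test passed.
    return all(results.values()), results


# Two stand-in tests: one exits with 0, the other with a non-zero code.
ok, results = run_verify({
    "always-passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "always-fails": [sys.executable, "-c", "raise SystemExit(1)"],
})
print(ok, results)  # ok is False because one test failed
```

The point of the sketch is the aggregation rule: a single failing test is enough to mark the whole verification as failed.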

Types of canary

Google Cloud Deploy supports various canary deployment configurations.

  1. Automated
    Automated canary deployment on Google Cloud Deploy involves configuring progressive deployment percentages, and letting the service handle traffic allocation between old and new versions. In our demo, we’ll use Google Cloud Deploy’s automated canary deployment with configurable progressive deployment percentages, but it can be customised as needed.
  2. Custom-automated
    To create a custom-automated canary on Google Cloud Deploy, you need to specify the phase name, percentage goal, Skaffold profile, and whether to include a verify job. Traffic balancing information is not required, as Google Cloud Deploy creates the necessary resources as described here.
  3. Custom
    A fully custom canary on Google Cloud Deploy involves configuring each phase separately, including phase name, percentage goal, Skaffold profile, and whether to include a verify job. In addition, you need to provide all traffic-balancing configuration for the canary, as described here.

What is Canary Release?

Canary release is a deployment rollout strategy that aims at minimizing new software risks by directing a small percentage of users to the new version of the application. After verifying that the new application works as intended, the traffic directed to it is gradually increased and eventually, all traffic is directed to it. In case an issue is detected at any point throughout the deployment, a rollback is performed with ease because both the current and the new versions are in the production environment.

Credits: VARSHA KASHYAP

High Level Design

The following diagram provides a high-level overview of how the various components described in this blog post are connected to one another.

Credits: VARSHA KASHYAP

We are going to need several resources in our Google Cloud setup. We use Artifact Registry to store the container image. Google Cloud Deploy controls how traffic is routed between the different versions of the application, and it provides a managed continuous delivery pipeline that deploys the application to the various environments.

Environment setup

To follow the instructions in this blog post, you need gcloud and skaffold installed on your local system. Alternatively, you can use Cloud Shell, which comes with these tools pre-installed.

Before you begin

A. If you’re new to Google Cloud, create an account or use the existing one.

B. Google’s best practices advise against running applications with the default compute service account. Instead, create a new service account and assign it the minimum level of access required.

C. To comprehend the Demo Setup, it is necessary to familiarize oneself with all of the concepts listed below.

Setup

  1. Prerequisites & Setup: Follow the steps in the Master Readme of the git repo (link).
  2. The setup is now complete. To look at the execution steps performed by the Cloud Build trigger, follow steps 3 to 7 of the setup in my previous blog.
    NOTE: In this blog post we use Artifact Registry in place of Google Container Registry.
  3. Go to the delivery pipeline page, then search for and click the pipeline named demo-canary-pipeline. This page shows a graphical representation of your delivery pipeline’s progress. The Promote release dialog is also shown, with a pending (Approval Required) dialog box in between; it shows the details of the target you’re promoting to. We have created two targets, dev-env and prod-env, as shown in the screenshot below. The dev-env target uses the standard strategy and the prod-env target uses the canary strategy.

To check the dev-env deployment/service in Cloud Run, follow steps 5 to 7 of my previous blog.

We should also check the post-deployment verification details (a new feature in Google Cloud Deploy) by clicking on ROLLOUTS. The logs are available through the associated logs link, as shown below.

4. Next, click Promote.

When you create a release for a canary deployment, the rollout is created with one phase for each canary increment, plus a final stable phase for 100%.

For example, if you configure a canary for 25% and 50% increments, the rollout will have the following phases:

  • canary-25
  • canary-50
  • stable
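The naming pattern in the list above can be sketched in Python. This is a hedged illustration, not Cloud Deploy code: the function names rollout_phases and starting_phase are hypothetical, and the first-deployment behaviour is modelled as an assumption (the canary phases are skipped because there is no earlier version to split traffic with).

```python
def rollout_phases(percentages):
    """Build phase names: one canary phase per increment, then a stable phase."""
    phases = [f"canary-{percent}" for percent in percentages]
    # The final phase always represents the full 100% deployment.
    phases.append("stable")
    return phases


def starting_phase(percentages, first_deploy):
    """Pick the phase a rollout effectively starts in.

    Assumption of this sketch: on a first deployment to a target the
    canary phases are skipped, so the rollout goes straight to stable.
    """
    if first_deploy:
        return "stable"
    return rollout_phases(percentages)[0]


print(rollout_phases([25, 50]))
print(starting_phase([25, 50], first_deploy=True))
```

With increments of 25 and 50 this yields canary-25, canary-50 and stable, matching the phases listed above.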

Note that if you deploy an application to a given target for the first time and you use a canary deployment strategy, the rollout might skip the canary phase or phases. So let’s select canary-25 as the phase. We can also give the rollout a meaningful description, then click PROMOTE.

5. Once you click Promote, the Review button becomes visible below the pending dialog box. Within the Cloud Deploy delivery pipeline, this is a feature called “Required Approval”: a manual approval step that can be added to the deployment process to help ensure that changes are reviewed and approved before they are deployed to production. It lets you control the deployment of new code to production by requiring an authorised user to manually approve the deployment.

When this feature is enabled in any target, the deployment pipeline will pause at the approval step after the new code has been built and tested. The deployment will remain on hold until an authorized user reviews the changes and approves the deployment.

NOTE: This can help ensure that only authorised users are allowed to deploy code to production, and that all changes are reviewed and approved before they are deployed. However, the “Required Approval” parameter adds time to the deployment process and can slow down how quickly changes are deployed. Weigh the benefits and drawbacks of this feature and use it only when your use case needs it. Also make sure that the authorised users responsible for approving deployments are trained and familiar with the deployment process, to avoid unnecessary delays or errors.

Afterward, select the Review button from either the notification pop-up or the pending dialog box, as displayed above. A confirmation prompt is then displayed; select Review again to proceed.

Click on APPROVE. Next, go back to your Delivery Pipeline page in the GCP console.

6. Next select the ROLLOUTS tab and click on Pending advance (latest) Release.

Here we can see that Google Cloud Deploy skips to the stable phase. Selecting the “Advance Rollout” option initiates the stable phase, which fully deploys the application to the target.

The application has now been deployed to the stable phase.

NOTE: After deployment, the post-deployment verification step is executed: Google Cloud Deploy calls Skaffold to verify that the application we have deployed to each target is working correctly. Verification is done using our own testing image, and we have configured Google Cloud Deploy and Skaffold to run those tests after the deployment finishes.

7. Go back to your Delivery Pipeline page in the GCP console again.

At this point, a canary test can be conducted with any further update. To do so, modify the index.html file and push the changes to the image. The pipeline can then be re-run to deploy the new release.
NOTE: When the deployment pipeline is re-executed for a new release, the application is automatically deployed to the first target in the progression.

8. To PROMOTE the new version in the canary phase, repeat steps 4 and 5 above.

Likewise, we can advance to canary-50.

NOTE: Verify the canary deployment of a revision by inspecting the Cloud Run service page.

Summary

To minimize downtime and performance issues when deploying software updates, it is crucial to use rollout strategies that allow for immediate rollbacks in case of any issues. Canary releases are a recommended approach for this purpose, as they facilitate gradual deployment while enabling a switch back to the previous version if problems occur. They are particularly useful for frequent updates since each canary increment is deployed in a distinct phase, leading to a final stable phase that reaches 100% deployment. The update also includes post-deployment verification, which allows developers and operators to test the deployment after its completion without requiring extensive integration testing. These features simplify the advanced deployment process for Cloud Run, GKE, and Anthos, providing production-ready solutions for smoother implementation.

Questions?

If you have any questions, I’ll be happy to read them in the comments.

Follow me on Medium and LinkedIn.

Happy Learning !! Thank you for reading 👍🏻

Happy Clouds☁☁️
