Keeping Bots outside your edge

Published in Google Cloud - Community · 8 min read · Jan 6, 2023

How to protect your apps from malicious bots with Google Cloud Armor and reCAPTCHA Enterprise

In an ever-changing threat landscape, it is already quite a task to keep track of all the threats and vulnerabilities your business is exposed to. As threat actors move to more sophisticated and automated attack vectors, more CISOs are being kept up at night. In a recent study on bot management, 60% of businesses reported detecting bot attacks on APIs and 39% reported detecting attacks on mobile apps (up from 46% and 23% respectively in 2021). Around 97% report that customer satisfaction has been affected by bot attacks.

So, as more and more businesses look to invest in bot mitigation tools, Google reCAPTCHA offers a great tool to fight the bot army. reCAPTCHA has helped thousands of businesses, small and large, protect themselves from billions of bot attacks globally.

Well, this is great, but it comes at a cost!

Let me explain. reCAPTCHA successfully tells humans from bots using an advanced risk analysis engine and adaptive challenges, and it has mitigated many bot attacks. Using it on its own, however, has certain disadvantages:

Increased load at the web layer
Custom app integration needed for each application
Increased traffic through network security devices at the perimeter

Okay, so how do I use my existing reCAPTCHA investments and integrate them at the edge layer?

In comes Google Cloud Armor’s native reCAPTCHA Enterprise integration!

Google Cloud Armor is a WAF and DDoS protection service that runs at the edge of Google Cloud's global network and mitigates DDoS and application-layer attacks. When integrated with reCAPTCHA Enterprise, it applies bot management at that same edge, across Google’s global network.

This means that no matter where the bots are attacking from, their traffic never reaches your datacenter; it is dropped at whichever of the 176 network PoPs is nearest to the threat actor.

Also, since the policies are applied at a global level, you configure them once and they apply to every application that needs them.

So how do I integrate reCAPTCHA Enterprise with Cloud Armor?

Let me walk you through the integration process in GCP. Below I include the steps for both the console and the command line.

Note: The steps below assume familiarity with the key concepts of Cloud Armor and reCAPTCHA. If you are not familiar with these technologies, I encourage you to go through the documentation for Cloud Armor and reCAPTCHA before going through the next steps.

Pre-requisites

A GCP project with a valid billing account, or sign up for a GCP trial
Enable all necessary GCP APIs on the project

gcloud services enable compute.googleapis.com
gcloud services enable logging.googleapis.com
gcloud services enable monitoring.googleapis.com
gcloud services enable recaptchaenterprise.googleapis.com

A web server fronted by a GCP HTTP(S) Load Balancer, with the website configuration needed to allow connectivity through a public frontend IP. Please make sure the necessary firewall rules are in place for connectivity to the backend servers and for health checks. Here is a how-to guide you can refer to

reCAPTCHA Assessment Types

There are two major ways in which reCAPTCHA assesses and protects at the edge layer:

1. Challenge page assessment

Here reCAPTCHA presents a challenge page to the end user, requiring user interaction. We have all seen this at some point while browsing.

The diagram below shows the workflow:

2. Frictionless assessment

This is a score-based automated assessment where end users are not prompted with anything, resulting in a smoother UX. The integration workflow looks like this:

Score-based frictionless assessment is the better approach in most use cases, as it provides a smoother UX and better accuracy of assessments.

Configuring reCAPTCHA Keys

In this guide we will go through the configuration of both the score-based and the challenge-based assessment approaches, using session-token and challenge-page keys respectively.

Create reCAPTCHA Session key

In GCP Console, select the correct project
From the hamburger menu browse to Security -> reCAPTCHA Enterprise. Click on Create Key
In the creation page, fill in the following parameters

Please note that “Disable Domain verification” should be enabled only if you don't have a valid domain for your website (for initial testing). For production environments it is highly recommended to add your domains, to prevent misuse of your reCAPTCHA keys.

Also, for demo purposes we have enabled “This is a testing key” to simulate a bot score of 0.5.

The CLI command is as below:

#Set your project context
gcloud config set project <projectID>

#Create session key
gcloud recaptcha keys create --display-name=test-session-key \
   --web --allow-all-domains --integration-type=score --testing-score=0.5 \
   --waf-feature=session-token --waf-service=ca

Once the key is created you should see a “Key ID” which will be used in a later step
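
If you are working from the CLI and did not note the ID when the key was created, listing the keys in the project should show it. This is a minimal sketch, assuming the project context has already been set as above; in the output, the Key ID is the last segment of each key's resource name.

```shell
# List the reCAPTCHA Enterprise keys in the active project.
# Each resource name has the form projects/<project>/keys/<KEY_ID>.
gcloud recaptcha keys list
```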

Create reCAPTCHA Challenge-Page key

For the challenge-page key there are two changes: we select “Challenge Feature” as the WAF Feature, and we don’t need to enable the testing key parameter, since we will simulate the challenge page redirection using a security policy rule in Cloud Armor.

CLI:

#Create challenge-page key
gcloud recaptcha keys create --display-name=challenge-key \
   --web --allow-all-domains --integration-type=invisible \
   --waf-feature=challenge-page --waf-service=ca

Please keep a note of the key ID for later usage.

Integrating session key to the website

The reCAPTCHA JavaScript sets a reCAPTCHA session-token as a cookie on the end-user’s browser after the assessment. The end-user’s browser attaches the cookie and refreshes the cookie as long as the reCAPTCHA JavaScript remains active. Session-tokens are valid for 30 minutes by default. However, if the user stays on the page where you implemented the session-token, reCAPTCHA Enterprise refreshes the session-token periodically to prevent it from expiring.

The session-token site key can be integrated into your webpage by adding the following JavaScript block. Make sure to replace <SESSION_TOKEN_SITE_KEY> with the session key ID we generated in the previous step.

<script src="https://www.google.com/recaptcha/enterprise.js?render=<SESSION_TOKEN_SITE_KEY>&waf=session" async defer></script>

Configure Cloud Armor

Configuring Cloud Armor requires the following high-level steps:

Configure Security Policies -> Configure Rules -> Attach policy to Load balancer backend service

In GCP Console, select the correct project
From the hamburger menu browse to Network Security -> Cloud Armor. Click on Create Policy
Configure the following policy parameters

Please note that the policy type should be “Backend security policy”, since the advanced bot management capabilities are only available in this type. To understand the differences between policy types, please refer to the Cloud Armor documentation.

The rules defined below are based on two conditions: URL path and session score. Since we already set a testing score of 0.5 when we created the session key, every request is scored 0.5, and comparing that value against the threshold in each rule triggers the Allow, Deny or Redirect action.

Add the following rules to the policy

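Allow good traffic: priority 3000, allow requests to safescore.html with a session score above 0.4
Deny access: priority 2000, deny (403) requests to unsafescore.html with a session score below 0.6
Redirect to challenge page: priority 1000, redirect requests to thresholdscore.html with a session score equal to 0.5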

After configuring all rules you should see 4 rules (including the default rule), as below:

Apply the Policy to the load balancer backend as a target

Click on Create Policy

CLI:


#Add Security Policy
gcloud compute security-policies create test-botmgmt-policy \
    --description "policy for bot management"

#Add challenge-page key to policy
gcloud compute security-policies update test-botmgmt-policy \
   --recaptcha-redirect-site-key "<CHALLENGE-KEY>"

#Add Security rule to allow access
gcloud compute security-policies rules create 3000 \
     --security-policy test-botmgmt-policy \
     --expression "request.path.matches('safescore.html') && token.recaptcha_session.score > 0.4" \
     --action allow

#Add Security rule to deny access
gcloud compute security-policies rules create 2000 \
     --security-policy test-botmgmt-policy \
     --expression "request.path.matches('unsafescore.html') && token.recaptcha_session.score < 0.6" \
     --action "deny-403"

#Add Security rule to redirect to challenge-page
gcloud compute security-policies rules create 1000 \
     --security-policy test-botmgmt-policy \
     --expression "request.path.matches('thresholdscore.html') && token.recaptcha_session.score == 0.5" \
     --action redirect \
     --redirect-type google-recaptcha

#Attach policy to target
gcloud compute backend-services update http-backend \
    --security-policy test-botmgmt-policy --global
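
Before moving on to testing, it can help to check that the policy and its attachment look the way you expect. A small sketch, assuming the policy name test-botmgmt-policy and the backend service name http-backend used in the commands above:

```shell
# Show the policy, including the priority, expression and action of every rule.
gcloud compute security-policies describe test-botmgmt-policy

# Print the security policy attached to the backend service;
# it should reference test-botmgmt-policy.
gcloud compute backend-services describe http-backend --global \
    --format="value(securityPolicy)"
```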

Testing reCAPTCHA on your website

For the purpose of the demo, I have created a very simple website running on Apache on an Ubuntu machine, with index.html as the main page and 3 more web pages linked from it. I am attaching the sample HTML for anyone wanting to recreate a similar setup. Do remember to replace the placeholder with your session-token key ID:

#index.html

<!doctype html>
<html>
  <head>
    <title>ReCAPTCHA protected page</title>
    <script src="https://www.google.com/recaptcha/enterprise.js?render=<REPLACE_TOKEN_HERE>&waf=session" async defer></script>
  </head>
  <body>
    <h1>Main Page</h1>
    <p><a href="/safescore.html">Visit allowed link</a></p>
    <p><a href="/unsafescore.html">Visit blocked link</a></p>
    <p><a href="/thresholdscore.html">Visit redirected link</a></p>
  </body>
</html>

It should look like this.

Notice the reCAPTCHA icon at the bottom right corner of the page, indicating that the page is protected by it.

Now, based on the rules we created, the first and second links use session scores for frictionless assessment. The allowed link accepts the request since its rule requires a session score above 0.4 (we set a testing score of 0.5 previously).

For the second link, we have set a rule to block anything below a score of 0.6, so a request scored 0.5 is denied.

The third link redirects the user to a manual reCAPTCHA challenge page, and the user needs to solve the challenge presented as below:

Please note that you will not be presented with the picture challenge every time. reCAPTCHA uses its algorithm to decide whether the connection requires it; otherwise you will simply see a quick redirection page and be automatically redirected back to the web page.
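
To make the three outcomes above easier to reason about, here is a toy model of the rule evaluation in plain shell and awk. It is only a sketch and not how Cloud Armor is implemented: the function name evaluate_request is made up for illustration, and it assumes that rules are checked in ascending priority order with the first match deciding the action, that `matches` does an unanchored (partial) regex match, and that the default rule at priority 2147483647 is left at allow.

```shell
#!/bin/sh
# Toy model of the three rules from this article plus the default rule.
# NOT the Cloud Armor implementation; it only mimics first-match-wins
# evaluation in ascending priority order (an assumption stated above).

# evaluate_request PATH SCORE -> prints "<priority> <action>"
evaluate_request() {
  awk -v path="$1" -v score="$2" 'BEGIN {
    score += 0  # force a numeric comparison
    # Priority 1000: redirect to the reCAPTCHA challenge page.
    if (path ~ /thresholdscore.html/ && score == 0.5) { print "1000 redirect"; exit }
    # Priority 2000: deny with a 403.
    if (path ~ /unsafescore.html/ && score < 0.6) { print "2000 deny-403"; exit }
    # Priority 3000: allow.
    if (path ~ /safescore.html/ && score > 0.4) { print "3000 allow"; exit }
    # Default rule, assumed to be left at allow.
    print "2147483647 allow"
  }'
}

# The three demo pages and the main page, all with the testing score of 0.5.
evaluate_request /safescore.html 0.5
evaluate_request /unsafescore.html 0.5
evaluate_request /thresholdscore.html 0.5
evaluate_request /index.html 0.5
```

With the testing score of 0.5 this prints `3000 allow`, `2000 deny-403`, `1000 redirect` and `2147483647 allow`. Under the partial-match assumption, the pattern safescore.html also matches /unsafescore.html, so in this model it is the lower priority number of the deny rule that keeps the blocked page blocked.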

Limitations

reCAPTCHA Enterprise for WAF with Google Cloud Armor is only supported with the global external HTTP(S) load balancer (classic). It is not yet supported with the global external HTTP(S) load balancer with advanced traffic management capability. For more information, see the External HTTP(S) load balancer overview: Modes of operation.

Conclusion

If you have stuck around this far, I hope this helped you understand more about the bot management capabilities of reCAPTCHA when integrated at the edge layer using Cloud Armor.

That's it for now :)

Please leave a comment if you have found this useful or have any questions.

Till next time…