Best practices for architecting cost-effective and scalable Apigee X PayG Organisation in GCP

Ayush Gupta
Google Cloud - Community
8 min read · Dec 20, 2022


Apigee X is a SaaS-based API Gateway running on Google Cloud Platform (GCP). Beyond acting as a reverse proxy, Apigee X provides API Publishing, Analytics, and Monetisation capabilities. As an API Management Platform, it is valued for its quick deployment, scalability, and wide range of API Proxy operations.

Architecting a resilient yet flexible Apigee X Organisation can be a daunting task: every customer has unique requirements, and there is no one-size-fits-all approach. An Apigee X Organisation needs to align with the corresponding GCP Organisation to deliver robust API Management capabilities. Scalability and cost-effectiveness go hand in hand, as Apigee X is one of the costlier resources in GCP. The Apigee X Organisation design should be consistent, yet flexible enough to scale with the API consumption rate, so that it drives maximum value.

A well-planned Apigee X setup allows for easy scalability and future-proof, cost-effective API operations. Below is a checklist of the key decision areas that need to be planned before provisioning an Apigee X Organisation.

  • Selecting Apigee X Billing Type
  • Alignment with GCP Resource Hierarchy
  • Network Design Requirements
  • Provisioning Roles and Permissions for Apigee X
  • Apigee X Organization Hierarchy

Let’s dive deep into each step.

1. Selecting the Apigee X Billing Plan

Apigee X offers three billing plans: Evaluation (Eval), Pay-As-You-Go (PayG), and Subscription (License Based).

An Apigee X Organisation of the Eval type is provisioned for 60 days only. Apart from the short duration, the Eval plan is restricted to a single Instance (single region), so it cannot provide a multi-region setup for Business Continuity Planning (BCP) / Disaster Recovery (DR). It is therefore not a recommended type for business organisations.

PayG and Subscription Types are the recommended choices for organisations.

PayG is the recently introduced billing type for Apigee X in Google Cloud (the PayG billing model doesn't apply to Apigee Hybrid). Customers pay only for the Apigee X infrastructure required to support current API consumption. The infrastructure handles a predetermined minimum volume of API calls and can be scaled manually or automatically to support increasing API traffic.

PayG is a good choice in the following scenarios; more than one can apply.

  • Customers are in the early phases of Application Modernisation/Migration to cloud. New services are being developed and continuously added to support core features of the application
  • Backend applications or microservices are in the Continuous Integration/Continuous Development phase of their DevOps cycle
  • The production API request rate/volume (TPS) cannot yet be gauged, and the application still runs in non-production or similar environments
  • Business Continuity Planning (BCP) / Disaster Recovery (DR) takes a backseat, as the primary requirement is developing the backend services
  • Advanced API features such as API Monetisation, Application Integration, and Advanced API Operations are not needed
  • PayG is also suitable for digital natives who are experimenting with a cloud-based API Management Platform and want to keep the associated cost under control

Contrast that with the Subscription billing type, where a billing plan is purchased before starting with Apigee as an API Gateway. The Subscription plan is priced on the annual number of API requests (call volume), analytics data retention, and other network-related pricing. There are 3 standard Subscription offerings: Standard, Enterprise, and Enterprise Plus. A Subscription proves cost-effective once the annual API rate is established, because GCP network costs (egress/ingress, premium/standard tier) are bundled with other related Apigee services in the overall plan.

A Subscription plan is preferred under the following circumstances; more than one can apply.

  • Customers have estimated their annual API call volume or API rate (TPS)
  • Business Continuity Planning (BCP) / Disaster Recovery (DR) is a requirement, as the backend application and related cloud infrastructure are stable and mature enough for a production environment
  • Continuous Integration/Continuous Deployment is the CI/CD phase that best describes the application lifecycle
  • Additional features such as API Monetisation, Application Integration, and Advanced API Operations are needed by customers

The rest of the article focuses on PayG as the billing plan, as it is widely used and offers scalability and low-cost provisioning when starting out with Application Modernisation/Migration.

2. Compatibility with GCP resource hierarchy

As a rule of thumb, 1 GCP project maps to exactly 1 Apigee X Organisation. The GCP resource hierarchy can be arranged primarily by Software Development Life Cycle (SDLC) environment. It is recommended to keep at least 2 separate Apigee Organisations, one for Production and one for Non-Production, rather than a single Apigee Organisation shared across the entire SDLC. This approach has several advantages:

  • Separate Apigee Organisations give the flexibility of High Availability (HA) / Disaster Recovery (DR) planning for only required SDLC environments thus saving overall costs.
  • Accidental deployment/alteration of production resources by developer and testing teams is avoided. Separation of Apigee Organisations ensures access is given to relevant teams only.
  • Certain regional regulations and company codes require production and non-production environment data to be kept separate. Having different Apigee Organisations for each environment ensures that access to relevant personnel is granted separately using GCP IAM Policies.
  • A production-specific Apigee Organisation can have an advanced billing plan, such as a Subscription plan with premium add-ons for API Monetisation, Integration, and Advanced API Operations
  • Developer Portals, API keys, etc. can be kept separate for production and non-production environments. This is consistent with certain regulatory security requirements to keep API keys separate for the two environments.
  • Each Organisation can use a different developer portal, for example an advanced Drupal-based portal for Production and the Apigee Integrated Developer Portal for Non-production.
  • An Apigee Organisation has product limits of 10 Instances (one per unique region), 85 Environments, 85 Environment Groups, etc. Detailed product limits are listed in the Apigee documentation. Keeping Production and Non-production as separate Apigee Organisations ensures product limits are not breached.
  • Production Apigee X Organisation can be provisioned later when the Application is finally ready for moving to production.

Apigee X - GCP Resource Hierarchy Reference
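
The product-limit argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the Dec 2022 limits quoted in this article; the planned resource counts and the helper names (`limit_headroom`, `over_limit`) are hypothetical and only illustrate the comparison between one shared organisation and two separate ones.

```python
# Limits as quoted in this article (Dec 2022); check the current product
# limits before relying on these numbers.
ORG_LIMITS = {"instances": 10, "environments": 85, "environment_groups": 85}


def limit_headroom(planned: dict, limits: dict = ORG_LIMITS) -> dict:
    """Return remaining headroom per resource; a negative value means over the limit."""
    return {name: limits[name] - planned.get(name, 0) for name in limits}


def over_limit(planned: dict) -> list:
    """Return the sorted names of resources whose planned count exceeds the limit."""
    return sorted(name for name, left in limit_headroom(planned).items() if left < 0)


# Hypothetical plans: one organisation shared across the SDLC vs. a split setup.
shared_org = {"instances": 4, "environments": 90, "environment_groups": 30}
non_prod_org = {"instances": 2, "environments": 60, "environment_groups": 20}
prod_org = {"instances": 2, "environments": 30, "environment_groups": 10}

print(over_limit(shared_org))    # ['environments']
print(over_limit(non_prod_org))  # []
print(over_limit(prod_org))      # []
```

In this made-up plan the shared organisation would exceed the environment limit, while the same environments split across two organisations stay within it.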

3. Network Design Requirements

Service Networking is used to provision networking for SaaS-based offerings on Google Cloud. Notable offerings provisioned this way include Apigee X, Cloud IDS, and Cloud SQL. Service Networking uses network peering in GCP to connect the host VPC with Apigee X (which runs in a tenant project in GCP) for ingress and egress. Private Service Access is used to reserve IP address ranges within the host VPC for use by the peered SaaS project, such as Apigee X.

VPC Planning

As a best practice, there should be at least 2 VPCs, one each for the Production and Non-Production phases of the SDLC. Each VPC is used to provision networking for the Apigee X Organisation of the corresponding SDLC phase. Separate VPCs help keep production and non-production traffic apart and meet regulatory requirements.

VPC is a global resource. Network Administrators can provision and reserve IP ranges for particular use cases such as subnet planning, static IPs, Private Service Access etc. Always provision a VPC with custom subnets and delete VPCs with auto-mode subnets, if any.

Each Apigee X Runtime Instance requires at minimum a /22 range (for the Apigee runtime) and a /28 range (for troubleshooting). The troubleshooting range can be selected manually or allocated automatically from the peered Apigee range. For automatic allocation, however, some IP ranges should be left uninitialised so that Service Networking can provision the troubleshooting ranges.

For example, suppose the GCP Network Administrators provide 10.224.0.0/16 to provision the Apigee X Runtime.

10.224.0.0/16 => 00001010.11100000.00000000.00000000 (Binary Representation)

This range can be divided as follows.

Instances

Apigee X Instance 1: 10.224.0.0/22 => 00001010.11100000.00000000.00000000

Apigee X Instance 2: 10.224.4.0/22 => 00001010.11100000.00000100.00000000

Apigee X Instance 10: 10.224.36.0/22 => 00001010.11100000.00100100.00000000

Instances can be added as needed. In this example, a total of 10 Instances can be provisioned for the Apigee Organisation, which leaves room to scale. If more Instances are needed, a separate Apigee Organisation must be added. As of this writing (Dec 2022), the limit is 10 Instances per Apigee Organisation.

Apigee Troubleshooting Ranges / Other Resources (not initialised):

10.224.128.0/17 => 00001010.11100000.10000000.00000000
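
The same split can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch of the example above, not a provisioning script: it assumes the 10.224.0.0/16 range from the example, takes the first ten /22 blocks for runtime Instances, and leaves the upper /17 untouched. The variable names are illustrative.

```python
import ipaddress
from itertools import islice

# Range from the article's example; treat it as illustrative, not prescriptive.
peered_range = ipaddress.ip_network("10.224.0.0/16")

# Split the /16 into two /17 halves: the lower half holds the runtime ranges,
# the upper half (10.224.128.0/17) stays uninitialised for troubleshooting
# ranges and other resources.
runtime_pool, uninitialised = peered_range.subnets(prefixlen_diff=1)

MAX_INSTANCES = 10  # per-organisation limit quoted in the article (Dec 2022)

# Take the first ten /22 blocks from the runtime pool, one per Instance.
runtime_ranges = list(islice(runtime_pool.subnets(new_prefix=22), MAX_INSTANCES))

for number, block in enumerate(runtime_ranges, start=1):
    print(f"Apigee X Instance {number}: {block} ({block.num_addresses} addresses)")
print(f"Left uninitialised: {uninitialised}")
```

Each /22 holds 1024 addresses, and the tenth block comes out as 10.224.36.0/22, matching the listing above.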

Apigee X Networking Reference

Best Practices For IP Address Allocation for Apigee X

  • The address range used can be a private IPv4 range (RFC 1918)
  • If the planned private IPv4 address space is insufficient or likely to be exhausted early, it is recommended to use the shared address space (RFC 6598)
  • Select an address space large enough to hold subnets for frontend resources, Apigee X runtime instances (PayG), backend resources, and other GCP resources, according to the organisation's needs
  • IP ranges selected for GCP resources should not conflict with on-prem or other cloud network subnets reachable via Cloud Interconnect(s) and Cloud VPN(s)
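
The address-space and conflict checks in this list are easy to automate before a range is handed to Service Networking. The sketch below uses only the standard `ipaddress` module; the function names and the `in_use` inventory of on-prem and other-cloud ranges are hypothetical and stand in for whatever your network team actually tracks.

```python
import ipaddress

RFC1918_BLOCKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
RFC6598_BLOCK = ipaddress.ip_network("100.64.0.0/10")


def address_space(cidr: str) -> str:
    """Classify an IPv4 range as RFC 1918, RFC 6598, or other."""
    network = ipaddress.ip_network(cidr)
    if any(network.subnet_of(block) for block in RFC1918_BLOCKS):
        return "RFC 1918"
    if network.subnet_of(RFC6598_BLOCK):
        return "RFC 6598"
    return "other"


def find_conflicts(candidate: str, in_use: dict) -> list:
    """Return the names of already-used ranges that overlap the candidate range."""
    network = ipaddress.ip_network(candidate)
    return [
        name
        for name, cidr in in_use.items()
        if network.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical inventory of ranges reachable over Interconnect or VPN.
in_use = {
    "on-prem-dc": "10.0.0.0/12",
    "vpn-branch": "192.168.10.0/24",
    "other-cloud": "10.224.32.0/20",
}

print(address_space("10.224.0.0/16"), find_conflicts("10.224.0.0/16", in_use))
print(address_space("100.64.0.0/16"), find_conflicts("100.64.0.0/16", in_use))
```

With this made-up inventory, 10.224.0.0/16 collides with the `other-cloud` range, while a block from the RFC 6598 shared address space has no conflicts.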

4. Provisioning Roles and Permissions for Apigee X

For an Apigee Organisation, permissions can be granted by assigning the required roles to IAM groups within the GCP Organisation.

IAM groups are the recommended way of provisioning roles:

  • Permissions for Apigee X users can be managed at scale by assigning roles to groups rather than assigning to individual users.
  • In case of revoking a user’s access for specific Apigee permission, the user can be removed from the corresponding IAM group.
  • Custom Roles for Apigee X are also available in GCP IAM. Custom Roles provide fine-grained access and IAM groups can be provisioned according to custom roles.
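
To make the group-based approach concrete, here is a minimal sketch that turns a group-to-role mapping into IAM policy bindings of the form `{"role": ..., "members": ["group:..."]}`. The group addresses are hypothetical, and the role IDs are the predefined Apigee roles as I understand them; confirm both against the IAM roles reference for your project before use. The helper `build_bindings` is illustrative and does not call any Google Cloud API.

```python
from collections import defaultdict

# Hypothetical groups; role IDs are assumed to be predefined Apigee roles
# and should be verified in the IAM documentation.
GROUP_ROLES = {
    "apigee-org-admins@example.com": ["roles/apigee.admin"],
    "apigee-env-admins@example.com": ["roles/apigee.environmentAdmin"],
    "apigee-viewers@example.com": [
        "roles/apigee.readOnlyAdmin",
        "roles/apigee.analyticsViewer",
    ],
}


def build_bindings(group_roles: dict) -> list:
    """Invert a group -> roles mapping into a list of IAM policy bindings."""
    members_by_role = defaultdict(set)
    for group, roles in group_roles.items():
        for role in roles:
            members_by_role[role].add(f"group:{group}")
    return [
        {"role": role, "members": sorted(members)}
        for role, members in sorted(members_by_role.items())
    ]


for binding in build_bindings(GROUP_ROLES):
    print(binding["role"], "->", ", ".join(binding["members"]))
```

Because only groups appear as members, revoking a user's access is a group-membership change and the policy itself stays untouched.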

The following IAM groups and permissions are recommended to begin with. New groups can be added according to custom business requirements and the need for fine-grained Apigee X permissions.

Apigee X Recommended Roles and IAM Groups

5. Apigee X Organization Hierarchy

An Apigee X Organisation differs from a GCP Organisation in terms of resource hierarchy. An Apigee X Organisation comprises the following components:

  • Instances: Regional resources where different environments are provisioned
  • Environment: Execution context where the API proxies are deployed
  • Gateway Node: Unit of an environment that processes API traffic. As traffic increases, more nodes are required for the environment
  • Environment Groups: Logical grouping and mapping of Apigee Environments to hostnames. Each group can have 1 or more hostnames assigned, which route incoming API requests to the appropriate environment
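
The relationship between hostnames, Environment Groups, and Environments can be pictured as a simple lookup. The sketch below is a simplified model, not Apigee's routing implementation: the class, group, hostname, and environment names are all hypothetical, and in a real organisation the final choice of environment within a group also depends on the base paths of the deployed proxies.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentGroup:
    """A named set of hostnames mapped to one or more environments."""
    name: str
    hostnames: list
    environments: list = field(default_factory=list)


def environments_for_host(host: str, groups: list) -> list:
    """Return the environments attached to the group that owns this hostname."""
    wanted = host.lower()
    for group in groups:
        if wanted in (name.lower() for name in group.hostnames):
            return group.environments
    return []  # no group claims this hostname


# Hypothetical non-production layout.
groups = [
    EnvironmentGroup("dev-group", ["api-dev.example.com"], ["dev"]),
    EnvironmentGroup("test-group", ["api-test.example.com"], ["sit", "uat"]),
]

print(environments_for_host("api-test.example.com", groups))  # ['sit', 'uat']
print(environments_for_host("unknown.example.com", groups))   # []
```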

Recommendations while setting up Apigee X hierarchy

  • An Apigee X Instance is a regional resource. To achieve an SLA of 99.99% in disaster recovery setups, more than one Instance should be provisioned, in different regions.
  • Environments for Production or similar SDLC phases should be attached to Instances in multiple regions for fault tolerance and disaster recovery
  • An Apigee Environment is implemented using Nodes that act as the Apigee Runtime. Each Instance provides a minimum of 2 nodes per Apigee Environment and a maximum of 1000 nodes across all Apigee Environments.
  • The number of Gateway nodes is one of the components of PayG billing. By default, each environment starts with the minimum of 2 nodes. The maximum number of nodes can be set manually according to requirements, or the nodes can be autoscaled.
  • Each Apigee Node can process 300 TPS (Transactions Per Second) under ideal test conditions. Real-world performance of a Node may vary depending on API Proxy performance and network configuration. To save cost, estimate the maximum number of nodes required for an environment using 300 TPS per Node (this assumes ideal lab conditions; adjust the TPS figure for your own estimate). Setting a manual maximum limit on nodes is a good way to keep costs in check.

Apigee X Organisation Structure Reference
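
The node estimate described in the recommendations above is a ceiling division with a floor of 2 nodes. The following sketch assumes the figures quoted in this article (300 TPS per node under ideal conditions, minimum 2 nodes per environment); the function name and the sample traffic numbers are illustrative, and the per-node TPS should be replaced with a value measured for your own proxies.

```python
import math

MIN_NODES_PER_ENVIRONMENT = 2  # minimum quoted in the article
IDEAL_TPS_PER_NODE = 300       # ideal lab figure quoted in the article


def estimate_max_nodes(peak_tps: float, tps_per_node: float = IDEAL_TPS_PER_NODE) -> int:
    """Estimate the node ceiling for one environment from its expected peak TPS."""
    if peak_tps < 0 or tps_per_node <= 0:
        raise ValueError("peak_tps must be >= 0 and tps_per_node must be > 0")
    return max(MIN_NODES_PER_ENVIRONMENT, math.ceil(peak_tps / tps_per_node))


print(estimate_max_nodes(100))        # 2 (the minimum still applies)
print(estimate_max_nodes(1000))       # 4 at the ideal 300 TPS per node
print(estimate_max_nodes(1000, 200))  # 5 with a more conservative per-node figure
```

Using a conservative per-node figure raises the ceiling, so it is worth rerunning the estimate once real proxy latency is known.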

After planning the inputs above (billing plan, GCP resource hierarchy, network design, roles and permissions, and Apigee X resource hierarchy), Apigee can be provisioned via the Apigee Console or using Terraform Apigee modules.
