# Multi-Zone Prerequisites for Karpenter IBM Provider

This document outlines the VPC networking configuration required for multi-zone node provisioning with Karpenter IBM Provider.

## Overview

For Karpenter to provision nodes across multiple availability zones, the VPC must allow cross-subnet communication between zones. This requires both security group rules and VPC routing configuration.

## Required VPC Configuration

### 1. Security Group Rules

The security group used by Karpenter nodes must include rules for **ALL** subnets where nodes will be provisioned, including the API server subnet. In the commands below, replace `<SECURITY_GROUP_ID>` and `<API_SERVER_SUBNET>` with the values for your VPC.

**Required outbound rules** (from worker nodes to API server):

```bash
# API server access (port 6443)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> outbound tcp --port-min 6443 --port-max 6443 --remote <API_SERVER_SUBNET>

# Kubelet access (port 10250)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> outbound tcp --port-min 10250 --port-max 10250 --remote <API_SERVER_SUBNET>

# Calico BGP (port 179)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> outbound tcp --port-min 179 --port-max 179 --remote <API_SERVER_SUBNET>
```

**Required inbound rules** (from API server to worker nodes):

```bash
# API server access (port 6443)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> inbound tcp --port-min 6443 --port-max 6443 --remote <API_SERVER_SUBNET>

# Kubelet access (port 10250)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> inbound tcp --port-min 10250 --port-max 10250 --remote <API_SERVER_SUBNET>

# Calico BGP (port 179)
ibmcloud is security-group-rule-add <SECURITY_GROUP_ID> inbound tcp --port-min 179 --port-max 179 --remote <API_SERVER_SUBNET>
```

### 2. VPC Routing Rules

VPC routing rules are required to enable communication between subnets in different zones.

**Example configuration for 3-zone setup** (API server in `eu-de-2`; replace the angle-bracket placeholders with the subnet CIDRs and gateway IPs for your VPC):

```bash
# Get VPC routing table ID
VPC_NAME="your-vpc-name"
ROUTING_TABLE_ID=$(ibmcloud is vpc $VPC_NAME --output json | jq -r '.default_routing_table.id')

# Route from Zone 1 to API server zone
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-zone1-to-api \
  --zone eu-de-1 \
  --destination <API_SERVER_SUBNET> \
  --action deliver \
  --next-hop <ZONE1_GATEWAY_IP>

# Route from Zone 3 to API server zone
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-zone3-to-api \
  --zone eu-de-3 \
  --destination <API_SERVER_SUBNET> \
  --action deliver \
  --next-hop <ZONE3_GATEWAY_IP>

# Return routes from API server zone to worker zones
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-api-to-zone1 \
  --zone eu-de-2 \
  --destination <ZONE1_SUBNET> \
  --action deliver \
  --next-hop <API_SERVER_GATEWAY_IP>

ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-api-to-zone3 \
  --zone eu-de-2 \
  --destination <ZONE3_SUBNET> \
  --action deliver \
  --next-hop <API_SERVER_GATEWAY_IP>
```

### 3. Gateway IP Conventions

IBM VPC uses the first usable IP address in each subnet (the network address plus one) as the gateway:

- Subnet `10.243.0.0/24` → Gateway `10.243.0.1`
- Subnet `10.243.65.0/24` → Gateway `10.243.65.1`
- Subnet `10.243.128.0/18` → Gateway `10.243.128.1`

## Real Example Configuration

Based on our `eu-de-default-vpc` setup:

### Subnet Layout

- **eu-de-1**: `10.243.0.0/24` (worker nodes)
- **eu-de-2**: `10.243.65.0/24` (API server) + `10.243.64.0/24` (other workloads)
- **eu-de-3**: `10.243.128.0/18` (worker nodes)

### Applied Security Group Rules

```bash
SECURITY_GROUP_ID="r010-36f045e2-86a1-4af8-917e-b17a41f8abe3"
API_SERVER_SUBNET="10.243.65.0/24"

# Outbound rules
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID outbound tcp --port-min 6443 --port-max 6443 --remote $API_SERVER_SUBNET
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID outbound tcp --port-min 10250 --port-max 10250 --remote $API_SERVER_SUBNET
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID outbound tcp --port-min 179 --port-max 179 --remote $API_SERVER_SUBNET

# Inbound rules
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID inbound tcp --port-min 6443 --port-max 6443 --remote $API_SERVER_SUBNET
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID inbound tcp --port-min 10250 --port-max 10250 --remote $API_SERVER_SUBNET
ibmcloud is security-group-rule-add $SECURITY_GROUP_ID inbound tcp --port-min 179 --port-max 179 --remote $API_SERVER_SUBNET
```

### Applied VPC Routes

```bash
VPC_NAME="eu-de-default-vpc"
ROUTING_TABLE_ID="r010-a05acc7b-d314-4aae-9362-290a77d6b653"

# Zone 1 to API server
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-to-api-server \
  --zone eu-de-1 \
  --destination 10.243.65.0/24 \
  --action deliver \
  --next-hop 10.243.0.1

# Zone 3 to API server
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-to-api-server-from-zone3 \
  --zone eu-de-3 \
  --destination 10.243.65.0/24 \
  --action deliver \
  --next-hop 10.243.128.1

# API server to Zone 1
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-from-api-server \
  --zone eu-de-2 \
  --destination 10.243.0.0/24 \
  --action deliver \
  --next-hop 10.243.65.1

# API server to Zone 3
ibmcloud is vpc-routing-table-route-create $VPC_NAME $ROUTING_TABLE_ID \
  --name route-from-api-server-to-zone3 \
  --zone eu-de-2 \
  --destination 10.243.128.0/18 \
  --action deliver \
  --next-hop 10.243.65.1
```

## Troubleshooting

### Common Issues

1. **Nodes fail to join cluster**: Check that security group rules allow ports 6443 and 10250 between all subnets
2. **Connection timeouts**: Verify that VPC routing rules exist for bidirectional communication
3. **Networking inconsistency**: Ensure routes are created for both directions (to and from the API server)

### Verification Steps

1. **Test connectivity from worker node to API server**:

   ```bash
   # SSH to worker node
   ssh -i ~/.ssh/key root@<WORKER_NODE_IP>

   # Test API server connectivity
   ping <API_SERVER_IP>
   nc -zv <API_SERVER_IP> 6443
   ```

2. **Check kubelet logs**:

   ```bash
   systemctl status kubelet
   journalctl -u kubelet -f
   ```

3. **Verify routes are active**:

   ```bash
   ibmcloud is vpc-routing-table-routes <ROUTING_TABLE_ID> --vpc <VPC_NAME>
   ```

## Prerequisites Summary

✅ **Security Groups**: Configure rules for all ports (6443, 10250, 179) between all subnets

✅ **VPC Routes**: Add bidirectional routes between worker node subnets and API server subnet

✅ **Network ACLs**: Ensure allow-all rules (default in most setups)

✅ **Subnet Configuration**: All subnets use the same VPC routing table

Once these prerequisites are met, Karpenter can provision nodes across multiple availability zones, and workloads can be scheduled on the multi-zone infrastructure.

## Implementation Status (2025-09-28)

- ✅ Security group rules added
- ✅ VPC routing rules created
- 🔄 Testing connectivity (in progress)
- ⏳ Node registration validation (pending)
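
The `--next-hop` values used in the routes follow the gateway convention from section 3 (network address plus one). A minimal sketch of that derivation is shown here, assuming each CIDR is already aligned to its network address, as all subnets in this document are. `gateway_for_cidr` is a hypothetical helper written for illustration; it is not part of the `ibmcloud` CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the conventional gateway IP for a subnet CIDR.
# Assumption: the CIDR starts at its network address (e.g. 10.243.0.0/24),
# so the gateway is simply the last octet plus one.
gateway_for_cidr() {
  local cidr="$1"
  local network="${cidr%/*}"   # drop the prefix length, e.g. 10.243.0.0
  local o1 o2 o3 o4
  IFS=. read -r o1 o2 o3 o4 <<< "$network"
  echo "${o1}.${o2}.${o3}.$((o4 + 1))"
}

# Print the gateway for each subnet listed in the Subnet Layout section.
for cidr in 10.243.0.0/24 10.243.65.0/24 10.243.128.0/18; do
  echo "${cidr} -> $(gateway_for_cidr "$cidr")"
done
```

The result can be passed to `--next-hop` instead of a hard-coded address, for example `--next-hop "$(gateway_for_cidr 10.243.0.0/24)"`.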