# Getting Started with Karpenter IBM Cloud Provider

This guide walks you through setting up the Karpenter IBM Cloud Provider, from installation to your first auto-scaled workload.

## Prerequisites

Before starting, ensure you have:

### Required Access

- **IBM Cloud Account** with VPC Infrastructure Services access
- **Kubernetes Cluster** (IKS or self-managed on IBM Cloud VPC)
- **kubectl** configured for your cluster
- **Helm 3** for installation

### Required Tools

```bash
# Verify tools are available
kubectl version --client
helm version
ibmcloud version

# Install IBM Cloud CLI if needed
curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
ibmcloud plugin install vpc-infrastructure
```

## IBM Cloud Setup

### Step 1: Create Service ID and API Keys

For production environments, use Service IDs for better security:

```bash
# Log in to IBM Cloud
ibmcloud login

# Create a Service ID for Karpenter
ibmcloud iam service-id-create karpenter-provider \
  --description "Service ID for Karpenter IBM Cloud Provider"

# Get the Service ID
SERVICE_ID=$(ibmcloud iam service-ids --output json | jq -r '.[] | select(.name=="karpenter-provider") | .id')

# Assign the VPC Infrastructure Services role
ibmcloud iam service-policy-create $SERVICE_ID \
  --roles "VPC Infrastructure Services" \
  --service-name is

# Create API keys
ibmcloud iam service-api-key-create karpenter-general $SERVICE_ID \
  --description "General IBM Cloud API access for Karpenter"
ibmcloud iam service-api-key-create karpenter-vpc $SERVICE_ID \
  --description "VPC-specific API access for Karpenter"
```

**Save the API keys securely - they won't be shown again!**

### Step 2: Gather Required Resource Information

```bash
# Set your target region
export REGION=us-south
ibmcloud target -r $REGION

# List available VPCs
ibmcloud is vpcs --output json

# Choose your VPC and list its subnets
export VPC_ID="your-vpc-id"
ibmcloud is subnets --vpc $VPC_ID --output json

# List security groups in your VPC
ibmcloud is security-groups --vpc $VPC_ID --output json
```
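The JSON output can be filtered with `jq` to pick out the exact IDs you need. A minimal sketch, assuming the response shape shown below (a trimmed sample; the real `ibmcloud is subnets --output json` response has many more fields):

```shell
# Trimmed sample of the JSON shape returned by `ibmcloud is subnets --output json`
cat > /tmp/subnets.json <<'EOF'
[
  {"id": "subnet-12345678", "name": "subnet-a", "zone": {"name": "us-south-1"}},
  {"id": "subnet-87654321", "name": "subnet-b", "zone": {"name": "us-south-2"}}
]
EOF

# Select the subnet in your target zone and capture its ID
SUBNET_ID=$(jq -r '.[] | select(.zone.name == "us-south-1") | .id' /tmp/subnets.json)
echo "$SUBNET_ID"
```

In practice, pipe the `ibmcloud is subnets --vpc $VPC_ID --output json` output straight into the same `jq` filter instead of using a sample file.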
```bash
# List available images
ibmcloud is images --visibility public --status available | grep ubuntu
```

**Collect the following information:**

- **VPC ID**: `vpc-12345678` (from the VPC list)
- **Subnet ID**: `subnet-12345678` (choose one per zone you want to use)
- **Security Group ID**: `sg-12345678` (existing, or create a new one)
- **Image ID**: `r006-12345678` (Ubuntu 20.04 recommended)
- **Region**: `us-south` (or your preferred region)
- **Zone**: `us-south-1` (the subnet's availability zone)

## Installation

### Step 1: Create Kubernetes Secrets

Store your API keys securely in Kubernetes:

```bash
# Create the namespace
kubectl create namespace karpenter

# Create a secret with the API keys
kubectl create secret generic karpenter-ibm-credentials \
  --from-literal=api-key="your-general-api-key" \
  --from-literal=vpc-api-key="your-vpc-api-key" \
  --namespace karpenter

# Verify secret creation
kubectl get secret karpenter-ibm-credentials -n karpenter
```

### Step 2: Install Karpenter IBM Cloud Provider

**Option A: Using Helm (Recommended)**

```bash
# Add the Helm repository
helm repo add karpenter-ibm https://pfeifferj.github.io/karpenter-provider-ibm-cloud
helm repo update

# Install with environment variables
helm install karpenter karpenter-ibm/karpenter \
  --namespace karpenter \
  --create-namespace \
  --set controller.env.IBM_REGION="us-south" \
  --set controller.env.IBM_API_KEY.valueFrom.secretKeyRef.name="karpenter-ibm-credentials" \
  --set controller.env.IBM_API_KEY.valueFrom.secretKeyRef.key="api-key" \
  --set controller.env.VPC_API_KEY.valueFrom.secretKeyRef.name="karpenter-ibm-credentials" \
  --set controller.env.VPC_API_KEY.valueFrom.secretKeyRef.key="vpc-api-key"
```

**Option B: Using kubectl and manifests**

```bash
# Download the manifests
curl -O https://raw.githubusercontent.com/pfeifferj/karpenter-provider-ibm-cloud/main/charts/karpenter/templates/deployment.yaml

# Apply with your values
kubectl apply -f deployment.yaml
```
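Instead of a long chain of `--set` flags, the same values can be kept in a `values.yaml` file. The fragment below is a sketch that mirrors the Option A flags one-for-one; check the chart's documented values for the exact layout:

```yaml
# values.yaml -- mirrors the --set flags from Option A
controller:
  env:
    IBM_REGION: "us-south"
    IBM_API_KEY:
      valueFrom:
        secretKeyRef:
          name: karpenter-ibm-credentials
          key: api-key
    VPC_API_KEY:
      valueFrom:
        secretKeyRef:
          name: karpenter-ibm-credentials
          key: vpc-api-key
```

Then install with `helm install karpenter karpenter-ibm/karpenter --namespace karpenter --create-namespace -f values.yaml`.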
### Step 3: Verify Installation

```bash
# Check that the pods are running
kubectl get pods -n karpenter

# Check the controller logs for startup
kubectl logs -n karpenter deployment/karpenter -f

# Look for successful controller startup messages
# Expected: "Starting Controller" for the nodepool, nodeclaim, and nodeclass controllers
```

**✅ Expected Output:**

```
Starting Controller {"controller": "nodepool", "controllerGroup": "karpenter.sh"}
Starting Controller {"controller": "nodeclaim", "controllerGroup": "karpenter.sh"}
Starting Controller {"controller": "nodeclass.hash", "controllerGroup": "karpenter.ibm.sh"}
Starting Controller {"controller": "pricing", "controllerGroup": "karpenter.ibm.sh"}
```

### Step 4: Create Your First NodeClass

Choose the configuration that matches your environment.

**📝 Replace the placeholder values with your actual resource IDs:**

```bash
# Create a basic NodeClass
# (the manifest below is a sketch reconstructed from the resource IDs
# gathered earlier; check the CRD reference for the exact field names)
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.ibm.sh/v1alpha1
kind: IBMNodeClass
metadata:
  name: default-nodeclass
spec:
  region: us-south
  zone: us-south-1
  vpc: vpc-12345678
  subnet: subnet-12345678
  securityGroups:
    - sg-12345678
  image: r006-12345678
EOF
```

Custom user data can also be supplied, for example to persist environment variables on the node:

```bash
# Illustrative user-data snippet
echo "MY_VAR=value" >> /etc/environment
# The bootstrap script is automatically appended
```

## Troubleshooting Quick Fixes

### NodeClass Not Ready

```bash
# Check for validation errors
kubectl describe ibmnodeclass default-nodeclass

# Look for validation messages in the status
kubectl get ibmnodeclass default-nodeclass -o yaml

# Common issues and solutions:
# ❌ Invalid VPC ID → Verify the VPC exists in the region
# ❌ Invalid subnet ID → Check the subnet exists in the specified zone
# ❌ Invalid security group → Ensure the security group exists in the VPC
# ❌ Invalid image ID → Use a valid IBM Cloud image ID
# ❌ Missing API permissions → Check the Service ID roles
```

### Nodes Not Provisioning

```bash
# Check NodeClaim status and events
kubectl get nodeclaims -o wide
kubectl describe nodeclaim

# Check the controller logs for specific errors
kubectl logs -n karpenter deployment/karpenter --tail=100 | grep ERROR

# Common issues:
# ❌ Quota exceeded → Contact IBM Cloud support for a quota increase
# ❌ Instance type unavailable → Try different instance types in the NodePool
# ❌ Subnet has no available IPs → Use a larger subnet or multiple subnets
```
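When nodes fail to provision, the error strings in the controller log usually name the cause directly (quota, capacity, subnet exhaustion). A small sketch of the kind of filtering that helps; the log lines below are illustrative, not actual controller output, and in practice you would pipe `kubectl logs -n karpenter deployment/karpenter` instead of a sample file:

```shell
# Illustrative log lines (not real controller output)
cat > /tmp/karpenter.log <<'EOF'
INFO  launching nodeclaim {"nodeclaim": "default-abc12"}
ERROR failed to create instance {"error": "quota exceeded"}
INFO  registered node {"node": "worker-1"}
EOF

# Surface only the errors, e.g. quota or capacity failures
grep ERROR /tmp/karpenter.log
```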
### Pods Still Pending

```bash
# Check pod resource requirements
kubectl describe pod <pod-name>

# Verify the NodePool can satisfy the requirements
kubectl describe nodepool

# Check whether the instance types can handle the workload
kubectl get nodepools -o yaml | grep -A 10 requirements

# Common solutions:
# ✅ Increase NodePool limits (cpu/memory)
# ✅ Add more instance types to the requirements
# ✅ Check that pod resource requests are reasonable
# ✅ Verify anti-affinity rules aren't too restrictive
```

### Bootstrap Failures

```bash
# Check instance bootstrap logs (requires SSH access)
ssh ubuntu@<node-ip> "sudo journalctl -u cloud-final"
ssh ubuntu@<node-ip> "sudo tail -f /var/log/cloud-init-output.log"

# Check kubelet status on the node
ssh ubuntu@<node-ip> "sudo systemctl status kubelet"

# Common issues:
# ❌ Network connectivity → Check security groups allow cluster communication
# ❌ Bootstrap token expired → Check controller logs; automatic renewal should handle this
# ❌ DNS resolution → Verify VPC DNS settings
```

## Next Steps

### Scale Testing

Test your setup with larger workloads:

```bash
# Scale up the test deployment
kubectl scale deployment auto-scaling-test --replicas=10

# Create varying workload sizes
kubectl apply -f - <
```