# Custom Resources Configuration Guide

This guide explains how to configure Karpenter IBM Cloud Provider Custom Resources (NodePools and IBMNodeClasses) using the Helm chart's configurable CR system.

## Overview

The Helm chart now supports automatic creation and configuration of:

- **IBMNodeClass**: IBM-specific node configuration (VPC settings, IKS worker pools, images, etc.)
- **NodePool**: Karpenter node pool definitions with multiple pools support
- **Mode-based configuration**: Different settings for IKS vs VPC deployments

## Configuration Modes

### VPC Mode

For self-managed Kubernetes clusters running on IBM Cloud VPC infrastructure.

**Key Features**:

- Direct VPC instance provisioning
- Flexible subnet and security group selection
- Custom boot volume configuration
- Advanced networking options
- Direct kubelet bootstrap

### IKS Mode

For IBM Kubernetes Service (IKS) managed clusters with worker pool integration.

**Key Features**:

- IKS worker pool management
- IBM-managed infrastructure
- Integrated cluster lifecycle
- IKS API-based provisioning

## Basic Configuration

### Enable Custom Resources

```yaml
customResources:
  enabled: true
  mode: "vpc"  # or "iks"
```

### Mode Selection

#### VPC Mode

```yaml
customResources:
  mode: "vpc"
  nodeClass:
    vpc:
      vpcId: "r010-12345678-1234-5678-9abc-def012345678"
```

#### IKS Mode

```yaml
customResources:
  mode: "iks"

# Also required at root level
iksClusterID: "bng6n48d0t6vj7b33kag"
```

## Global Configuration

Configure defaults that apply to all NodePools:

```yaml
customResources:
  global:
    # Default instance types
    instanceTypes:
      - "bx2-2x8"
      - "bx2-4x16"
      - "bx2-8x32"

    # Default capacity type
    capacityType: "on-demand"  # "on-demand" or "spot"

    # Default architecture
    architecture: "amd64"  # "amd64" or "arm64"

    # Default disruption settings
    disruption:
      consolidationPolicy: WhenUnderutilized
      consolidateAfter: 30s
      expireAfter: 720h

    # Default resource limits
    limits:
      cpu: 1000
      memory: 1000Gi

    # Default taints for all nodes
    taints:
      - key: "example.com/special-nodes"
        value: "true"
        effect: "NoSchedule"
```

## NodeClass Configuration

### VPC NodeClass

```yaml
customResources:
  nodeClass:
    enabled: true
    name: "vpc-nodeclass"

    vpc:
      # Required: VPC ID
      vpcId: "r010-12345678-1234-5678-9abc-def012345678"

      # Subnet selection
      subnetSelection:
        strategy: "zone-balanced"  # "all", "zone-balanced", "cost-optimized"
        subnetIds:  # Optional: specific subnets
          - "subnet-123"
          - "subnet-456"
        zones:  # Optional: specific zones
          - "us-south-1"
          - "us-south-2"

      # Security groups
      securityGroups:
        strategy: "auto"  # "auto" or "manual"
        groupIds:  # For manual strategy
          - "sg-123"
          - "sg-456"

      # Boot volume
      bootVolume:
        volumeType: "general-purpose"  # "5iops-tier", "10iops-tier", "custom"
        size: 100  # GB
        encrypted: true
        iops: 3000  # For custom volume type

      # Network interface
      networkInterface:
        allowIpSpoofing: false
        primaryInterfaceSubnetStrategy: "zone-balanced"

    common:
      # Image selection
      image:
        strategy: "latest"  # "latest", "specific", "family"
        imageId: "r010-12345678-1234-5678-9abc-def012345678"  # For specific
        family: "ubuntu-minimal"  # For family
        operatingSystem: "ubuntu"
        architecture: "amd64"

      # Custom user data
      userData: |
        #!/bin/bash
        echo "Custom initialization script"

      # Resource tags
      tags:
        Environment: "production"
        Team: "platform"
        ManagedBy: "karpenter"
```

### IKS NodeClass

```yaml
customResources:
  nodeClass:
    enabled: true
    name: "iks-nodeclass"

    iks:
      workerPoolTemplate:
        machineType: "bx2.4x16"
        diskEncryption: true
        operatingSystem: "UBUNTU_20_64"

    common:
      image:
        strategy: "family"
        family: "ubuntu"
        operatingSystem: "ubuntu"
        architecture: "amd64"

      tags:
        Environment: "production"
        IKSCluster: "true"
```

## Multiple NodePools

Configure multiple NodePools for different workload types:

```yaml
customResources:
  nodePools:
    # General purpose pool
    - name: "general-purpose"
      enabled: true
      requirements:
        instanceTypes:
          - "bx2-2x8"
          - "bx2-4x16"
        capacityType: "on-demand"
        architecture: "amd64"
        zones:
          - "us-south-1"
          - "us-south-2"
      limits:
        cpu: 100
        memory: 100Gi
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: 30s
        expireAfter: 720h
      labels:
        nodepool: "general-purpose"
        workload-type: "general"

    # Spot instances pool
    - name: "spot-instances"
      enabled: true
      requirements:
        instanceTypes:
          - "bx2-4x16"
          - "bx2-8x32"
        capacityType: "spot"
        architecture: "amd64"
      limits:
        cpu: 200
        memory: 200Gi
      disruption:
        consolidationPolicy: WhenUnderutilized
        consolidateAfter: 10s
        expireAfter: 12h
      taints:
        - key: "node.kubernetes.io/instance-type"
          value: "spot"
          effect: "NoSchedule"
      labels:
        nodepool: "spot-instances"
        workload-type: "batch"

    # GPU pool for AI/ML
    - name: "gpu-pool"
      enabled: false  # Enable when needed
      requirements:
        instanceTypes:
          - "gx2-8x64x1v100"
          - "gx2-16x128x2v100"
        capacityType: "on-demand"
        architecture: "amd64"
      limits:
        cpu: 20
        memory: 100Gi
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: 300s
        expireAfter: 24h
      taints:
        - key: "nvidia.com/gpu"
          value: "true"
          effect: "NoSchedule"
      labels:
        nodepool: "gpu-pool"
        accelerator: "nvidia-v100"
```

## Deployment Examples

### VPC Deployment

```bash
helm install karpenter-ibm ./charts \
  --set ibmApiKey="YOUR_API_KEY" \
  --set region="us-south" \
  --set vpcApiKey="YOUR_VPC_API_KEY" \
  --set customResources.mode="vpc" \
  --set customResources.nodeClass.vpc.vpcId="r010-your-vpc-id" \
  --set clusterName="my-vpc-cluster"
```

### IKS Deployment

```bash
helm install karpenter-ibm ./charts \
  --set ibmApiKey="YOUR_API_KEY" \
  --set region="us-south" \
  --set vpcApiKey="YOUR_VPC_API_KEY" \
  --set customResources.mode="iks" \
  --set iksClusterID="your-cluster-id" \
  --set clusterName="my-iks-cluster"
```

### Using Values Files

```bash
# VPC deployment
helm install karpenter-ibm ./charts -f charts/examples/vpc-example-values.yaml

# IKS deployment
helm install karpenter-ibm ./charts -f charts/examples/iks-example-values.yaml
```

## Configuration Reference

### Required Values by Mode

#### VPC Mode Requirements

- `customResources.nodeClass.vpc.vpcId` - Your VPC ID
- `ibmApiKey` - IBM Cloud API key
- `vpcApiKey` - VPC API key
- `region` - IBM Cloud region

#### IKS Mode Requirements

- `iksClusterID` - IKS cluster ID
- `ibmApiKey` - IBM Cloud API key
- `region` - IBM Cloud region

### Instance Type Reference

#### General Purpose

- `bx2-2x8` - 2 vCPU, 8 GB RAM
- `bx2-4x16` - 4 vCPU, 16 GB RAM
- `bx2-8x32` - 8 vCPU, 32 GB RAM

#### Memory Optimized

- `mx2-8x64` - 8 vCPU, 64 GB RAM
- `mx2-16x128` - 16 vCPU, 128 GB RAM

#### GPU Instances

- `gx2-8x64x1v100` - 8 vCPU, 64 GB RAM, 1x V100
- `gx2-16x128x2v100` - 16 vCPU, 128 GB RAM, 2x V100

### Capacity Types

- `on-demand` - Regular pricing, guaranteed availability
- `spot` - Discounted pricing, may be interrupted

### Consolidation Policies

- `WhenEmpty` - Consolidate when nodes become empty
- `WhenUnderutilized` - Consolidate when nodes are underutilized

## Validation

The chart includes comprehensive validation:

```yaml
# This will fail validation
customResources:
  enabled: true
  mode: "vpc"
  # Missing: nodeClass.vpc.vpcId

# This will fail validation
customResources:
  enabled: true
  mode: "iks"
# Missing: iksClusterID at root level
```

## Monitoring and Observability

Enable monitoring for your custom resources:

```yaml
metrics:
  serviceMonitor:
    enabled: true
    additionalLabels:
      monitoring: "prometheus"
    interval: 30s
```

## Troubleshooting

### Common Issues

1. **Missing VPC ID**: Ensure `customResources.nodeClass.vpc.vpcId` is set for VPC mode
2. **Missing IKS Cluster ID**: Ensure `iksClusterID` is set at root level for IKS mode
3. **Invalid Instance Types**: Verify instance types are available in your region
4. **Subnet Selection**: Ensure subnets exist and have proper tags for discovery

### Debug Configuration

Check generated resources:

```bash
# View generated NodePools
kubectl get nodepools

# View generated IBMNodeClass
kubectl get ibmnodeclasses

# Check configuration
kubectl get configmap karpenter-ibm-config -o yaml
```

## Migration Guide

### From Manual CR Management

1. **Backup existing resources**:

   ```bash
   kubectl get nodepools -o yaml > existing-nodepools.yaml
   kubectl get ibmnodeclasses -o yaml > existing-nodeclasses.yaml
   ```

2. **Configure chart values** to match existing resources

3. **Enable CR management**:

   ```yaml
   customResources:
     enabled: true
   ```

4. **Deploy and verify** resources are correctly managed

### From Previous Chart Versions

1. **Update values.yaml** with new `customResources` section
2. **Set mode** based on your deployment type
3. **Configure NodePools** to match your requirements
4. **Upgrade chart** with new configuration

## Best Practices

1. **Start with default pools** and enable additional pools as needed
2. **Use spot instances** for batch and fault-tolerant workloads
3. **Configure appropriate taints** for specialized workloads
4. **Set reasonable resource limits** to control costs
5. **Use discovery tags** for automatic resource selection
6. **Monitor resource usage** and adjust limits accordingly

## Related Documentation

- [VPC Integration Guide](vpc-integration.md)
- [IKS Integration Guide](iks-integration.md)
- [Troubleshooting Guide](troubleshooting.md)
- [Port Conflict Analysis](troubleshooting/port-conflict-analysis.md)
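
As a recap of the required values listed under "Required Values by Mode", a minimal VPC-mode values file might look like the sketch below. It only combines keys that already appear in this guide. The file name is hypothetical, every ID and API key is a placeholder, and setting `clusterName` in the file (instead of passing it with `--set` as in the deployment examples) is an assumption about how you choose to supply it.

```yaml
# minimal-vpc-values.yaml (hypothetical file name)

# Root-level values listed as required for VPC mode
ibmApiKey: "YOUR_API_KEY"        # placeholder
vpcApiKey: "YOUR_VPC_API_KEY"    # placeholder
region: "us-south"

# Passed via --set in the deployment examples; assumed here to live in the file
clusterName: "my-vpc-cluster"

customResources:
  enabled: true
  mode: "vpc"
  nodeClass:
    vpc:
      vpcId: "r010-your-vpc-id"  # placeholder, replace with your VPC ID
```

For IKS mode, the equivalent sketch would set `mode: "iks"` and a root-level `iksClusterID` in place of `customResources.nodeClass.vpc.vpcId`. Either file can be supplied with `-f`, in the same way as the example values files under "Using Values Files".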