Scale Nodes
Scaling allows your KubeRaya cluster to adapt to workload demand by adjusting the number of nodes available to run applications.
CloudRaya provides controlled and predictable scaling mechanisms that balance flexibility, stability, and cost awareness without exposing unnecessary infrastructure complexity.
This page explains how node scaling works in KubeRaya, when to use each scaling mode, and the operational constraints you should understand before scaling.
Scaling Overview
In KubeRaya, scaling applies to cluster infrastructure, not individual workloads.
You can scale:
- Worker nodes – to increase or decrease workload capacity
- Master (control plane) nodes – only when High Availability (HA) is enabled
Scaling does not automatically modify your Kubernetes deployments or pods.
It only changes the available compute capacity.
Node Types and Responsibilities
Before scaling, it’s important to understand node roles.
Master Node (Control Plane)
- Runs Kubernetes control plane components
- Manages cluster state, scheduling, and API access
- Does not run application workloads
Worker Node
- Runs application pods and services
- Provides CPU, memory, and storage for workloads
- Primary target for scaling operations
Manual Scaling
Manual scaling allows you to explicitly set the number of worker nodes.
How Manual Scaling Works
- You define a fixed number of worker nodes
- The cluster maintains that exact node count
- No automatic scaling occurs based on load
When to Use Manual Scaling
- Predictable or stable workloads
- Development and testing environments
- Cost-sensitive workloads with known capacity needs
- Environments where scaling must be tightly controlled
Manual scaling provides maximum predictability at the cost of flexibility.
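The manual-scaling semantics above can be sketched in a few lines. This is an illustrative model only; `ClusterSpec` and `manual_scale` are assumed names for the sketch, not part of the KubeRaya API.

```python
# Hypothetical sketch of manual scaling: you set an exact worker count
# and the cluster maintains it, with no load-based adjustment.
from dataclasses import dataclass

@dataclass
class ClusterSpec:
    master_nodes: int   # fixed at 1 unless HA is enabled
    worker_nodes: int   # explicitly chosen in manual mode

def manual_scale(spec: ClusterSpec, desired_workers: int) -> ClusterSpec:
    """Return a spec pinned to the requested worker count."""
    return ClusterSpec(master_nodes=spec.master_nodes,
                       worker_nodes=desired_workers)

spec = manual_scale(ClusterSpec(master_nodes=1, worker_nodes=2), 3)
print(spec.worker_nodes)  # 3
```

The key property is that the count only changes when you change it.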
Auto Scaling
Auto scaling allows KubeRaya to dynamically adjust the number of worker nodes within a defined range.
Key Characteristics
- Applies to worker nodes only
- Master nodes are not auto-scaled
- Scaling occurs within user-defined minimum and maximum limits
- Existing nodes are preserved when switching modes
When to Use Auto Scaling
- Variable or bursty workloads
- Production environments with fluctuating traffic
- Applications with unpredictable demand patterns
Auto scaling improves availability and performance while reducing manual intervention.
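The core of the behavior described above is that any load-driven target is clamped to the user-defined range. A minimal sketch (the function name and signature are illustrative, not KubeRaya's API):

```python
def autoscale_target(desired: int, min_workers: int, max_workers: int) -> int:
    """Clamp a load-driven worker target to the configured range.

    Master nodes are never touched by auto scaling; only the
    worker count moves, and only within [min_workers, max_workers].
    """
    return max(min_workers, min(desired, max_workers))

print(autoscale_target(7, 1, 4))  # 4 (capped at the maximum)
print(autoscale_target(0, 1, 4))  # 1 (floored at the minimum)
print(autoscale_target(3, 1, 4))  # 3 (within range, used as-is)
```

However demand fluctuates, the cluster never grows past the maximum or shrinks below the minimum you set.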
Switching Scaling Behavior
KubeRaya allows switching between Manual Scale and Auto Scale modes.
Manual → Auto Scale
When switching to auto scale:
- Existing nodes are preserved
- You define:
  - Minimum worker nodes
  - Maximum worker nodes
- Kubernetes workloads continue running without disruption
Auto Scale → Manual
When switching back to manual scale:
- Existing nodes are preserved
- Auto scaling is disabled
- The cluster stabilizes at the current worker node count
Switching modes does not restart the cluster and does not affect running workloads.
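The two transitions above can be modeled as simple state changes. This is a conceptual sketch; `ScalingConfig`, `to_auto`, and `to_manual` are assumed names, not CloudRaya identifiers.

```python
# Mode switches preserve the current worker count in both directions:
# Manual -> Auto records a min/max range; Auto -> Manual drops it and
# stabilizes at whatever count the cluster currently has.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScalingConfig:
    mode: str                        # "manual" or "auto"
    workers: int                     # current worker count (preserved)
    min_workers: Optional[int] = None
    max_workers: Optional[int] = None

def to_auto(cfg: ScalingConfig, lo: int, hi: int) -> ScalingConfig:
    """Manual -> Auto: keep existing nodes, record the scaling range."""
    return ScalingConfig("auto", cfg.workers, lo, hi)

def to_manual(cfg: ScalingConfig) -> ScalingConfig:
    """Auto -> Manual: disable auto scaling at the current count."""
    return ScalingConfig("manual", cfg.workers)

cfg = to_auto(ScalingConfig("manual", 2), lo=1, hi=4)
cfg = to_manual(cfg)
print(cfg.mode, cfg.workers)  # manual 2
```

In neither direction is a node created, deleted, or restarted by the switch itself.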
Limits & Constraints
To ensure stability and predictable performance, KubeRaya enforces the following constraints:
Maximum Node Limit
- Maximum total nodes per cluster: 5
- This limit includes:
  - Master nodes
  - Worker nodes
Master Node Constraints
- Master node count is fixed by default
- Master nodes can only be adjusted when High Availability (HA) is enabled
- Without HA:
  - Master node count remains at 1
  - Master nodes cannot be scaled independently
Auto Scaling Scope
- Auto scaling applies to worker nodes only
- Master nodes are never auto-scaled
- Worker nodes scale within defined min/max limits
These constraints are designed to protect control plane stability.
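Taken together, the constraints amount to two checks on any scaling request. A sketch of that validation, using the limits documented above (`validate_scale` is an illustrative helper, not a CloudRaya function):

```python
MAX_TOTAL_NODES = 5  # documented cluster limit: masters + workers

def validate_scale(masters: int, workers: int, ha_enabled: bool) -> None:
    """Reject requests that violate the documented constraints."""
    if not ha_enabled and masters != 1:
        raise ValueError("Without HA, the master node count is fixed at 1")
    if masters + workers > MAX_TOTAL_NODES:
        raise ValueError(
            f"Total nodes ({masters + workers}) exceeds the "
            f"{MAX_TOTAL_NODES}-node cluster limit")

validate_scale(masters=1, workers=4, ha_enabled=False)  # OK: 5 total
```

Note that because the 5-node limit counts masters, an HA control plane leaves fewer slots available for workers.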
Cost Awareness
Scaling directly affects cluster cost.
Cost Behavior
- Each additional node increases hourly cost
- Auto scaling may increase cost during peak usage
- Manual scaling provides fixed cost predictability
Cost Visibility
Before confirming a scaling change, CloudRaya displays:
- Estimated hourly cost
- Estimated monthly cost
This allows you to make informed scaling decisions.
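The arithmetic behind those estimates is straightforward. The per-node rate and the 730-hour month below are illustrative assumptions, not CloudRaya's published pricing:

```python
HOURS_PER_MONTH = 730  # common cloud-billing approximation (24 * 365 / 12)

def estimate_cost(node_count: int, hourly_rate_per_node: float):
    """Return (hourly, monthly) cost for a given node count."""
    hourly = node_count * hourly_rate_per_node
    return hourly, hourly * HOURS_PER_MONTH

hourly, monthly = estimate_cost(node_count=3, hourly_rate_per_node=0.05)
print(f"${hourly:.2f}/hr, ${monthly:.2f}/mo")  # $0.15/hr, $109.50/mo
```

Because cost scales linearly with node count, comparing the estimate before and after a change shows exactly what each added node costs.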
Best Practices
To scale safely and efficiently:
- Start with manual scaling for new clusters
- Enable auto scaling only when workload patterns are understood
- Keep master nodes minimal unless HA is required
- Monitor workload resource usage before increasing node count
- Treat scaling as an infrastructure decision, not a performance shortcut
Scaling should be intentional, measured, and workload-driven.