CloudRaya Documentation

Scale Nodes

Scaling allows your KubeRaya cluster to adapt to workload demand by adjusting the number of nodes available to run applications.

CloudRaya provides controlled and predictable scaling mechanisms that balance flexibility, stability, and cost awareness without exposing unnecessary infrastructure complexity.

This page explains how node scaling works in KubeRaya, when to use each scaling mode, and the operational constraints you should understand before scaling.

Scaling Overview

In KubeRaya, scaling applies to cluster infrastructure, not individual workloads.

You can scale:

  • Worker nodes – to increase or decrease workload capacity
  • Master (control plane) nodes – only when High Availability (HA) is enabled

Scaling does not automatically modify your Kubernetes deployments or pods; it only changes the compute capacity available to them.

Node Types and Responsibilities

Before scaling, it’s important to understand node roles.

Master Node (Control Plane)

  • Runs Kubernetes control plane components
  • Manages cluster state, scheduling, and API access
  • Does not run application workloads

Worker Node

  • Runs application pods and services
  • Provides CPU, memory, and storage for workloads
  • Primary target for scaling operations

Manual Scaling

Manual scaling allows you to explicitly set the number of worker nodes.

How Manual Scaling Works

  • You define a fixed number of worker nodes
  • The cluster maintains that exact node count
  • No automatic scaling occurs based on load
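The fixed-count behavior above can be sketched as a simple reconcile step. This is a conceptual model for illustration, not CloudRaya's actual implementation; the function and parameter names are invented:

```python
def reconcile_manual(current_workers: int, desired_workers: int) -> int:
    """Return the number of nodes to add (positive) or remove (negative)
    so the cluster reaches the fixed desired count.

    Note there is no load-based input: manual scaling converges on the
    exact count you set, regardless of workload demand.
    """
    return desired_workers - current_workers
```

Because the target is a fixed number, the result is fully predictable: `reconcile_manual(2, 4)` adds two nodes, and `reconcile_manual(4, 4)` changes nothing.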

When to Use Manual Scaling

  • Predictable or stable workloads
  • Development and testing environments
  • Cost-sensitive workloads with known capacity needs
  • Environments where scaling must be tightly controlled

Manual scaling provides maximum predictability at the cost of flexibility.

Auto Scaling

Auto scaling allows KubeRaya to dynamically adjust the number of worker nodes within a defined range.

Key Characteristics

  • Applies to worker nodes only
  • Master nodes are not auto-scaled
  • Scaling occurs within user-defined minimum and maximum limits
  • Existing nodes are preserved when switching modes
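Conceptually, auto scaling picks a worker count that covers demand and then clamps it to the user-defined range. The sketch below is a hypothetical sizing heuristic, not KubeRaya's actual algorithm; all names are illustrative:

```python
def autoscale_target(extra_nodes_needed: int, nodes_in_use: int,
                     min_workers: int, max_workers: int) -> int:
    """Choose a worker count covering current demand, clamped to the
    user-defined [min_workers, max_workers] range."""
    desired = nodes_in_use + extra_nodes_needed
    return max(min_workers, min(desired, max_workers))
```

The clamp is what keeps auto scaling predictable: even a large demand spike cannot push the cluster past `max_workers`, and idle periods never shrink it below `min_workers`.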

When to Use Auto Scaling

  • Variable or bursty workloads
  • Production environments with fluctuating traffic
  • Applications with unpredictable demand patterns

Auto scaling improves availability and performance while reducing manual intervention.

Switching Scaling Behavior

KubeRaya allows switching between Manual Scale and Auto Scale modes.

Manual → Auto Scale

When switching to auto scale:

  • Existing nodes are preserved
  • You define:
    • Minimum worker nodes
    • Maximum worker nodes
  • Kubernetes workloads continue running without disruption

Auto Scale → Manual

When switching back to manual scale:

  • Existing nodes are preserved
  • Auto scaling is disabled
  • The cluster stabilizes at the current worker node count
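The two mode switches described above can be modeled as simple state transitions. This is a hedged sketch of the documented behavior (nodes preserved, no disruption); the data shapes are assumptions, not CloudRaya's API:

```python
def switch_to_auto(current_workers: int, min_w: int, max_w: int) -> dict:
    """Manual -> Auto: existing nodes are preserved; future scaling
    happens within the user-defined [min_w, max_w] range."""
    return {"mode": "auto", "workers": current_workers,
            "min": min_w, "max": max_w}

def switch_to_manual(state: dict) -> dict:
    """Auto -> Manual: auto scaling is disabled and the cluster
    stabilizes at its current worker count."""
    return {"mode": "manual", "workers": state["workers"]}
```

In both directions the `workers` value carries over unchanged, which is why switching modes never restarts the cluster or disturbs running workloads.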

Switching modes does not restart the cluster and does not affect running workloads.

Limits & Constraints

To ensure stability and predictable performance, KubeRaya enforces the following constraints:

Maximum Node Limit

  • Maximum total nodes per cluster: 5
  • This limit includes:
    • Master nodes
    • Worker nodes

Master Node Constraints

  • Master node count is fixed by default
  • Master nodes can only be adjusted when:
    • High Availability (HA) is enabled
  • Without HA:
    • Master node count remains at 1
    • Cannot be scaled independently

Auto Scaling Scope

  • Auto scaling applies to worker nodes only
  • Master nodes are never auto-scaled
  • Worker nodes scale within defined min/max limits

These constraints are designed to protect control plane stability.
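Taken together, the constraints above amount to a small set of validation rules. A minimal sketch, assuming a requested cluster shape is checked before any scaling change is applied (the function is illustrative, not part of CloudRaya's tooling):

```python
MAX_TOTAL_NODES = 5  # per-cluster limit, masters + workers combined

def validate_cluster(masters: int, workers: int, ha_enabled: bool) -> list:
    """Collect constraint violations for a requested cluster shape."""
    errors = []
    if masters + workers > MAX_TOTAL_NODES:
        errors.append("total node count exceeds the 5-node limit")
    if not ha_enabled and masters != 1:
        errors.append("without HA, the master node count must stay at 1")
    return errors
```

For example, one master and four workers is valid without HA, while three masters require HA and still count against the five-node total.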

Cost Awareness

Scaling directly affects cluster cost.

Cost Behavior

  • Each additional node increases hourly cost
  • Auto scaling may increase cost during peak usage
  • Manual scaling provides fixed cost predictability

Cost Visibility

Before confirming a scaling change, CloudRaya displays:

  • Estimated hourly cost
  • Estimated monthly cost

This allows you to make informed scaling decisions.
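The hourly-to-monthly relationship is straightforward to reason about yourself. A sketch using an assumed per-node hourly rate and the common 730-hours-per-month approximation (actual CloudRaya pricing and rounding may differ):

```python
def estimate_cost(node_count: int, hourly_rate_per_node: float,
                  hours_per_month: float = 730.0) -> tuple:
    """Return (hourly, monthly) cost estimates for a given node count.

    hours_per_month defaults to 730, a common approximation
    (365 days * 24 hours / 12 months); real billing may differ.
    """
    hourly = node_count * hourly_rate_per_node
    return hourly, hourly * hours_per_month
```

For instance, three nodes at a hypothetical $0.05/hour come to $0.15/hour, or roughly $109.50/month, which makes the cost impact of each added node easy to see before confirming a change.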

Best Practices

To scale safely and efficiently:

  • Start with manual scaling for new clusters
  • Enable auto scaling only when workload patterns are understood
  • Keep master nodes minimal unless HA is required
  • Monitor workload resource usage before increasing node count
  • Treat scaling as an infrastructure decision, not a performance shortcut

Scaling should be intentional, measured, and workload-driven.


© 2026 CloudRaya Product Team. All rights reserved.
