Installing Helix on Kubernetes with Helm

This page describes how to install Helix on Kubernetes.

Requirements

  • The Control Plane consists of the Helix API, web interface, and PostgreSQL database. It requires:

    • Linux, macOS or Windows
    • Docker
    • 4 CPUs, 8GB RAM and 50GB+ free disk space
  • Inference Provider requires one of:

    • An NVIDIA GPU if you want to use private Helix Runners (example), or
    • Ollama running locally on macOS, Linux or Windows (example), or
    • An OpenAI-compatible API provider, such as TogetherAI (example). We like TogetherAI because its API serves the same open source models that you can run locally on Helix GPU Runners, but any OpenAI-compatible API works (e.g. vLLM, Azure OpenAI, Gemini, etc.)
  • Private Helix Runners require:

    • As much system memory as you have GPU memory
    • Min 8GB GPU for small models (Llama3-8B, Phi3-Mini), 24GB for Mixtral/SDXL, 40GB for Llama3-70B
    • Min 24GB GPU for fine-tuning (text or image)
    • 2x24GB GPUs recommended for running e.g. text and image inference in parallel
    • NVIDIA 3090s and A6000s typically offer good price/performance
    • 100GB+ of free disk space
  • A fast internet connection (llamaindex container is about 11GB, small runner image is 23GB)
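The GPU minimums above can be turned into a quick check before you size a runner. The following is a minimal shell sketch, not part of Helix: the helper name helix_gpu_fit is hypothetical, and the thresholds are simply the ones from the list above.

```shell
# Hypothetical helper (not part of Helix): report which workloads fit a GPU
# with the given memory in GB, using the minimums from the list above.
helix_gpu_fit() {
  gb=$1
  if [ "$gb" -ge 40 ]; then
    echo "small models, Mixtral/SDXL, fine-tuning, Llama3-70B"
  elif [ "$gb" -ge 24 ]; then
    echo "small models, Mixtral/SDXL, fine-tuning"
  elif [ "$gb" -ge 8 ]; then
    echo "small models (Llama3-8B, Phi3-Mini)"
  else
    echo "below the 8GB minimum"
  fi
}

# Example usage on a machine with an NVIDIA driver installed. nvidia-smi
# reports MiB and often slightly under the nominal size, so round to the
# nearest GB before comparing:
#   mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
#   helix_gpu_fit $(( (mib + 512) / 1024 ))
```

The function only encodes the documented minimums; actual fit also depends on the model variant and on the system memory requirement listed above.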

Deploying the Control Plane

This section details how to install the Helix control plane.

1. Install Keycloak

Helix uses Keycloak for authentication. If you already have a Keycloak instance, you can skip this step. Otherwise, you can install one through Helm (chart info, repo).

For example:

helm upgrade --install keycloak oci://registry-1.docker.io/bitnamicharts/keycloak \
  --set auth.adminUser=admin \
  --set auth.adminPassword=oh-hallo-insecure-password \
  --set httpRelativePath="/auth/"
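The example above hard-codes an intentionally insecure admin password. For anything beyond a local test you would probably want a random one. The sketch below shows one way to generate it; the helper name gen_password is hypothetical, and it assumes head, base64 and tr are available on your machine.

```shell
# Hypothetical helper: generate a random admin password instead of
# hard-coding one in the helm command.
gen_password() {
  # 24 random bytes encode to exactly 32 base64 characters (no padding).
  # '/' and '+' are mapped to '_' and '-' so the value contains only
  # letters, digits, '_' and '-', which is safe to pass through --set.
  head -c 24 /dev/urandom | base64 | tr '/+' '_-'
}

# Example usage (store the password somewhere safe, it is not printed again):
#   KEYCLOAK_ADMIN_PASSWORD=$(gen_password)
#   helm upgrade --install keycloak oci://registry-1.docker.io/bitnamicharts/keycloak \
#     --set auth.adminUser=admin \
#     --set auth.adminPassword="$KEYCLOAK_ADMIN_PASSWORD" \
#     --set httpRelativePath="/auth/"
```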

By default, the chart creates only a ClusterIP service. To reach Keycloak from outside the cluster, you can either port-forward or, if you are on k3s or minikube, create a load balancer:

kubectl expose pod keycloak-0 --port 8888 --target-port 8080 --name keycloak-ext --type=LoadBalancer

Alternatively, on k3s you can set the service type to LoadBalancer at install time:

helm upgrade --install keycloak oci://registry-1.docker.io/bitnamicharts/keycloak \
  --set auth.adminUser=admin \
  --set auth.adminPassword=oh-hallo-insecure-password \
  --set httpRelativePath="/auth/" \
  --set service.type=LoadBalancer \
  --set service.ports.http=8888
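If you prefer the port-forward route mentioned above, something like the following should work with the ClusterIP install. This is a sketch under two assumptions: the release is named keycloak (so the service is also called keycloak), and the chart's default HTTP service port of 80 has not been changed. The function name is hypothetical.

```shell
# Hypothetical helper: forward a local port to the Keycloak service.
# Assumes the service is named "keycloak" and listens on port 80,
# which is an assumption about the chart defaults, not a Helix setting.
keycloak_port_forward() {
  local_port=${1:-8888}
  kubectl port-forward svc/keycloak "${local_port}:80"
}

# Example usage:
#   keycloak_port_forward        # then open http://localhost:8888/auth/
#   keycloak_port_forward 9000   # use a different local port
```

The command runs in the foreground; stop it with Ctrl-C when you no longer need access.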

2. Install the Helm Repository

helm repo add helix https://charts.helix.ml 
helm repo update

3. Apply the Chart

Copy values-example.yaml to values-your-env.yaml and update the values as needed. Then run the following command, replacing values-example.yaml with your own file:

export LATEST_RELEASE=$(curl -s https://get.helix.ml/repos/helixml/helix/releases/latest | sed -n 's/.*"tag_name": "\(.*\)".*/\1/p')
helm upgrade --install my-helix-controlplane helix/helix-controlplane \
  -f helix-controlplane/values.yaml \
  -f helix-controlplane/values-example.yaml \
  --set image.tag="${LATEST_RELEASE}"
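If the release lookup fails (for example because of a network error), LATEST_RELEASE ends up empty and the chart is installed with an empty image tag. The sketch below wraps the same steps with a check for that case. The function names extract_tag and install_controlplane are hypothetical, and the default values file path is the values-your-env.yaml copy described above.

```shell
# Extract the "tag_name" value from GitHub-style release JSON on stdin.
extract_tag() {
  sed -n 's/.*"tag_name": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical wrapper: resolve the latest tag, stop if the lookup came
# back empty, then install the chart with your own values file.
install_controlplane() {
  values_file=${1:-helix-controlplane/values-your-env.yaml}
  tag=$(curl -fsS https://get.helix.ml/repos/helixml/helix/releases/latest | extract_tag)
  if [ -z "$tag" ]; then
    echo "could not determine the latest Helix release" >&2
    return 1
  fi
  helm upgrade --install my-helix-controlplane helix/helix-controlplane \
    -f helix-controlplane/values.yaml \
    -f "$values_file" \
    --set image.tag="$tag"
}

# Example usage:
#   install_controlplane helix-controlplane/values-your-env.yaml
```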

Use kubectl port-forward to access the service.

Deploying a Runner

This section describes how to install a Helix runner on Kubernetes.

1. Install the Helm Repository

helm repo add helix https://charts.helix.ml 
helm repo update

2. Apply the Chart

Install the runner, replacing <host> and <token> with the values for your environment:

helm upgrade --install my-helix-runner helix/helix-runner \
  --set runner.host="<host>" \
  --set runner.token="<token>" \
  --set runner.memory=24GB \
  --set replicaCount=4 \
  --set nodeSelector."nvidia\.com/gpu\.product"="NVIDIA-GeForce-RTX-3090-Ti"
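The backslashes in the nodeSelector flag are there because Helm's --set treats unescaped dots as nesting separators, so the dots inside the label key nvidia.com/gpu.product have to be escaped. The sketch below shows a small helper that does the escaping; the name helm_escape_key is hypothetical. It also shows how you might look up the label value on your nodes, assuming the label is populated in your cluster (typically by NVIDIA GPU feature discovery).

```shell
# Hypothetical helper: escape dots in a label key so Helm's --set treats
# it as a single key rather than a nested path.
helm_escape_key() {
  printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Example usage. List the GPU product label per node first, so the
# selector value matches exactly what your nodes report:
#   kubectl get nodes -L nvidia.com/gpu.product
# Then pass the escaped key (other flags as in the command above):
#   helm upgrade --install my-helix-runner helix/helix-runner \
#     --set "nodeSelector.$(helm_escape_key nvidia.com/gpu.product)=NVIDIA-GeForce-RTX-3090-Ti"
```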