Helm-only Install¶
This guide covers installing InferenceHub directly with Helm, without the inferencehub CLI. This is the recommended approach for GitOps workflows (ArgoCD, Flux) or any environment where you want declarative, version-controlled Helm values rather than a CLI-driven install.
When to use Helm-only vs CLI¶
| CLI install | Helm-only | |
|---|---|---|
| Recommended for | First-time setup, local dev, quick deploys | GitOps, ArgoCD, Flux, CI/CD pipelines |
| Values generation | Automatic (CLI generates all required overrides) | Manual (you provide values.yaml) |
| Secret wiring | Automatic | Manual (follow this guide) |
| Upgrade | inferencehub upgrade |
helm upgrade |
Prerequisites¶
Before installing, run the prerequisites script to install the required cluster components (Gateway API CRDs, cert-manager, Envoy Gateway, and the AWS Load Balancer Controller on EKS):
python3 scripts/setup-prerequisites.py \
--cluster-name my-cluster \
--domain inference.example.com \
--tls-email admin@example.com \
--aws-lb-role-arn arn:aws:iam::123456789012:role/AWSLoadBalancerControllerRole
Point your domain's DNS to the NLB address before proceeding:
kubectl get gateway inferencehub-gateway -n envoy-gateway-system \
-o jsonpath='{.status.addresses[0].value}'
Installation¶
Install InferenceHub from the OCI registry without cloning the repository:
helm install inferencehub oci://ghcr.io/vinay-venkatesh/inferencehub \
--version 0.2.1 \
--namespace inferencehub \
--create-namespace \
-f values.yaml
Required values¶
The CLI auto-generates the following wiring values. When installing with Helm directly, you must provide them manually in your values.yaml.
Networking¶
networking:
gatewayAPI:
hostname: inference.example.com
gatewayRef:
name: inferencehub-gateway
namespace: envoy-gateway-system
tls:
issuerRef: letsencrypt-prod # or letsencrypt-staging
LiteLLM wiring¶
litellm:
# Master key — must match the LITELLM_MASTER_KEY env var in the litellm-env secret
masterKey: "sk-your-master-key-here"
masterkeySecretName: inferencehub-litellm-secret
masterkeySecretKey: master-key
# Disable bundled database (use InferenceHub's PostgreSQL)
db:
deployStandalone: false
useExisting: true
endpoint: inferencehub-postgresql
database: litellm
secret:
name: inferencehub-postgresql-secret
usernameKey: postgres-user
passwordKey: postgres-password
# Disable bundled Redis (InferenceHub wires Redis via environmentSecrets)
redis:
enabled: false
# Required wiring secret that provides DATABASE_URL and REDIS_* to LiteLLM
environmentSecrets:
- inferencehub-litellm-env
# Model routing configuration
proxy_config:
model_list:
- model_name: claude-3-sonnet
litellm_params:
model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
aws_region_name: us-east-1
general_settings:
master_key: "os.environ/LITELLM_MASTER_KEY"
OpenWebUI wiring¶
openwebui:
# Always disable — InferenceHub routes through LiteLLM, not Ollama subchart
ollama:
enabled: false
# Always disable — InferenceHub provides a dedicated Redis for websockets
websocket:
redis:
enabled: false
url: redis://inferencehub-redis-openwebui:6379/0
# Point OpenWebUI at the LiteLLM gateway
openaiBaseApiUrl: http://inferencehub-litellm:4000/v1
# Wire DATABASE_URL and OPENAI_API_KEY from secrets
extraEnvVars:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: inferencehub-postgresql-secret
key: openwebui-database-url
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: inferencehub-litellm-secret
key: master-key
Example values.yaml¶
Complete working example for AWS EKS with Bedrock models and IRSA:
networking:
gatewayAPI:
hostname: inference.example.com
gatewayRef:
name: inferencehub-gateway
namespace: envoy-gateway-system
tls:
issuerRef: letsencrypt-prod
litellm:
masterKey: "sk-your-master-key-here"
masterkeySecretName: inferencehub-litellm-secret
masterkeySecretKey: master-key
db:
deployStandalone: false
useExisting: true
endpoint: inferencehub-postgresql
database: litellm
secret:
name: inferencehub-postgresql-secret
usernameKey: postgres-user
passwordKey: postgres-password
redis:
enabled: false
environmentSecrets:
- inferencehub-litellm-env
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/LiteLLMBedrockRole
proxy_config:
model_list:
- model_name: claude-3-sonnet
litellm_params:
model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
aws_region_name: us-east-1
- model_name: claude-3-haiku
litellm_params:
model: bedrock/anthropic.claude-3-haiku-20240307-v1:0
aws_region_name: us-east-1
general_settings:
master_key: "os.environ/LITELLM_MASTER_KEY"
openwebui:
ollama:
enabled: false
websocket:
redis:
enabled: false
url: redis://inferencehub-redis-openwebui:6379/0
openaiBaseApiUrl: http://inferencehub-litellm:4000/v1
extraEnvVars:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: inferencehub-postgresql-secret
key: openwebui-database-url
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: inferencehub-litellm-secret
key: master-key
postgresql:
auth:
password: "your-postgres-password"
ArgoCD Application¶
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: inferencehub
namespace: argocd
spec:
project: default
source:
repoURL: ghcr.io/vinay-venkatesh
chart: inferencehub
targetRevision: 0.2.1
helm:
valueFiles:
- values.yaml
destination:
server: https://kubernetes.default.svc
namespace: inferencehub
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
Flux HelmRelease¶
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
name: inferencehub
namespace: flux-system
spec:
type: oci
interval: 12h
url: oci://ghcr.io/vinay-venkatesh
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: inferencehub
namespace: inferencehub
spec:
interval: 1h
chart:
spec:
chart: inferencehub
version: 0.2.1
sourceRef:
kind: HelmRepository
name: inferencehub
namespace: flux-system
values:
networking:
gatewayAPI:
hostname: inference.example.com
gatewayRef:
name: inferencehub-gateway
namespace: envoy-gateway-system
tls:
issuerRef: letsencrypt-prod
# ... remaining values as shown in the example above
Upgrading¶
To upgrade to a new chart version:
helm upgrade inferencehub oci://ghcr.io/vinay-venkatesh/inferencehub \
--version 0.2.1 \
--namespace inferencehub \
-f values.yaml
Check the CHANGELOG before upgrading. Breaking changes (if any) are listed there along with migration steps.