Prerequisites¶

InferenceHub provisioner requires several cluster-wide components that are installed once per cluster. These are separate from the application itself because they are shared across namespaces and must exist before the Helm chart is installed.

What gets installed¶

Component	Purpose	Namespace
Gateway API CRDs	Kubernetes networking primitives	cluster-scoped
cert-manager	Automatic TLS certificate provisioning	`cert-manager`
Let's Encrypt ClusterIssuer	Certificate issuer configuration	cluster-scoped
Envoy Gateway	Implements the Gateway API	`envoy-gateway-system`
GatewayClass	Declares Envoy as the gateway controller	cluster-scoped
Gateway	The actual load balancer / entry point	`envoy-gateway-system`
AWS LB Controller	Creates NLB for the Gateway (AWS only)	`kube-system`

Setup script¶

./scripts/setup-prerequisites.py [OPTIONS]

Required flags¶

Flag	Description
`--cluster-name <name>`	Name of your EKS/GKE/AKS cluster
`--domain <domain>`	Domain where InferenceHub will be served (e.g., `inferencehub.ai`)

Optional flags¶

Flag	Default	Description
`--environment <env>`	`staging`	`production` or `prod` for Let's Encrypt prod issuer; anything else uses staging
`--tls-email <email>`	—	Email for Let's Encrypt certificate notifications
`--aws-lb-role-arn <arn>`	—	IAM role ARN for AWS Load Balancer Controller (AWS only)
`--gateway-name <name>`	`inferencehub-gateway`	Name of the Gateway resource
`--gateway-namespace <ns>`	`envoy-gateway-system`	Namespace for the Gateway resource
`--dry-run`	—	Print what would be installed without making changes
`--skip-gateway-crds`	—	Skip Gateway API CRD installation
`--skip-cert-manager`	—	Skip cert-manager installation
`--skip-cluster-issuer`	—	Skip ClusterIssuer creation
`--skip-envoy-gateway`	—	Skip Envoy Gateway installation
`--skip-gateway-resources`	—	Skip GatewayClass/Gateway/EnvoyProxy creation
`--skip-aws-lb-controller`	—	Skip AWS Load Balancer Controller installation

Examples¶

Staging environment (Let's Encrypt test certs):

./scripts/setup-prerequisites.py \
  --cluster-name my-cluster \
  --domain inferencehub.ai \
  --environment staging \
  --tls-email admin@inferencehub.ai

Production environment:

./scripts/setup-prerequisites.py \
  --cluster-name my-cluster \
  --domain inferencehub.ai \
  --environment production \
  --tls-email admin@inferencehub.ai

AWS EKS with Network Load Balancer:

./scripts/setup-prerequisites.py \
  --cluster-name my-cluster \
  --domain inferencehub.ai \
  --environment staging \
  --tls-email admin@inferencehub.ai \
  --aws-lb-role-arn arn:aws:iam::123456789012:role/aws-load-balancer-controller

Versions installed¶

Component	Version
Gateway API CRDs	v1.2.1
cert-manager	v1.16.2
Envoy Gateway	v1.2.1
AWS LB Controller	v1.11.0 (if requested)

Staging vs production TLS¶

The --environment flag determines which Let's Encrypt issuer is used:

Staging (default): Uses letsencrypt-staging. Certificates are valid but not trusted by browsers. Use this for testing.
Production (--environment production or --environment prod): Uses letsencrypt-prod. Certificates are fully trusted. Rate limits apply — do not use for testing.

The config.yaml environment field must match what you set here. InferenceHub's IssuerType() method derives the ClusterIssuer name from this field.

AWS Load Balancer Controller¶

Required for AWS EKS to provision a Network Load Balancer (NLB) for the Envoy Gateway.

Create the IAM role¶

The controller needs an IAM role with permissions to manage AWS load balancers. Create it using eksctl:

eksctl create iamserviceaccount \
  --cluster=<cluster-name> \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --role-name=aws-load-balancer-controller \
  --attach-policy-arn=arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess \
  --approve

Then pass the role ARN to the script:

./scripts/setup-prerequisites.py \
  --aws-lb-role-arn arn:aws:iam::123456789012:role/aws-load-balancer-controller \
  ...

Configure InferenceHub for AWS¶

Set cloudProvider: aws and aws.litellmRoleArn in inferencehub.yaml. The CLI handles everything else automatically.

# inferencehub.yaml
cloudProvider: aws

aws:
  litellmRoleArn: "arn:aws:iam::123456789012:role/litellm-bedrock-role"

Then install with just:

inferencehub install --config inferencehub.yaml

Note on values-aws.yaml: This file is for AWS-specific Helm overrides that apply to your cluster environment (e.g., gp3 storage classes, AWS-specific node selectors, or specialized service annotations). It is auto-selected by the CLI. Do not edit this file to add credentials or ARNs; those belong in your instance-specific inferencehub.yaml.

Verifying prerequisites¶

After running the script, check that everything is ready:

inferencehub verify

Cleanup¶

To remove all prerequisites:

./scripts/uninstall-prerequisites.py --confirm

This removes (in reverse order): Gateway/GatewayClass/EnvoyProxy, Envoy Gateway, ClusterIssuer, cert-manager, Gateway API CRDs.