Kubernetes Deployment
Deploy the open-source CMDOP server on Kubernetes for scale. Run the control plane (api_server, :8000) and the relay (grpc_server, :50051) as separate Deployments backed by managed PostgreSQL and Redis. Expose REST over an Ingress with TLS via cert-manager, and the gRPC relay over a gRPC-aware Ingress. Use /healthz/live and /healthz/ready probes, and autoscale the relay on active gRPC streams rather than CPU.
Kubernetes is the scale tier (L3 in the scale ladder — up to ~50,000 agents). The server is a multi-process Python stack, so run each process as its own Deployment rather than as a single container.
There is no published Helm chart yet. Deploy with the manual manifests below, or adapt the OSS deploy/compose.oss.yml with a Compose-to-Kubernetes tool. At this scale, use managed Postgres and Redis rather than in-cluster single instances.
What are the prerequisites?
- Kubernetes 1.24+
- kubectl configured
- A managed PostgreSQL 16 and Redis 7 (recommended over in-cluster), or operators such as CloudNativePG
- cert-manager (for TLS) and an Ingress controller with gRPC support
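The Ingress manifests later on this page reference a ClusterIssuer named letsencrypt-prod, which has to exist before cert-manager can issue certificates. A minimal sketch, assuming Let's Encrypt with the HTTP-01 solver routed through the nginx Ingress class; the filename, email address, and account-key Secret name are placeholders, not values from the CMDOP project:

```yaml
# cluster-issuer.yaml (hypothetical filename)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod          # name referenced by the Ingress annotations
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com        # placeholder: use a real contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # placeholder Secret for the ACME account key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx   # older cert-manager releases use "class: nginx" instead
```

A ClusterIssuer is cluster-scoped, so it takes no namespace and can be shared by both the REST and gRPC Ingresses.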
How do I create the namespace?
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cmdop
How do I configure secrets?
# secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: cmdop-secrets
  namespace: cmdop
type: Opaque
stringData:
  CMDOP_DATABASE_URL: postgresql+asyncpg://cmdop_app:PASS@postgres:5432/cmdop
  CMDOP_ADMIN_DATABASE_URL: postgresql+asyncpg://cmdop_admin:PASS@postgres:5432/cmdop
  CMDOP_REDIS_URL: redis://redis:6379/0
  CMDOP_INTERNAL_SECRET: "<openssl rand -hex 32>"
How do I configure non-secret settings?
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cmdop-config
  namespace: cmdop
data:
  CMDOP_DB_MODE: standalone
  CMDOP_ENVIRONMENT: prod
  CMDOP_REST_HOST: "0.0.0.0"
  CMDOP_REST_PORT: "8000"
  CMDOP_GRPC_HOST: "0.0.0.0"
  CMDOP_GRPC_PORT: "50051"
  CMDOP_LOG_JSON: "true"
All processes read the same CMDOP_* environment. Mount both the ConfigMap and the Secret via envFrom so that every Deployment shares the same configuration.
How do I run migrations?
Run alembic upgrade head as a Job (or an init container) before rolling out the Deployments, using the same image as api_server:
# migrate-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cmdop-migrate
  namespace: cmdop
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/cmdop/cmdop-api:latest
          command: ["alembic", "upgrade", "head"]
          envFrom:
            - configMapRef: {name: cmdop-config}
            - secretRef: {name: cmdop-secrets}
What do the Deployments look like?
Run the control plane and the relay as separate Deployments.
# api-deployment.yaml: REST control plane
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cmdop-api
  namespace: cmdop
spec:
  replicas: 2
  selector:
    matchLabels: {app: cmdop-api}
  template:
    metadata:
      labels: {app: cmdop-api}
    spec:
      containers:
        - name: api
          image: ghcr.io/cmdop/cmdop-api:latest
          ports:
            - {containerPort: 8000, name: http}
          envFrom:
            - configMapRef: {name: cmdop-config}
            - secretRef: {name: cmdop-secrets}
          livenessProbe:
            httpGet: {path: /healthz/live, port: 8000}
            periodSeconds: 10
          readinessProbe:
            httpGet: {path: /healthz/ready, port: 8000}
            periodSeconds: 5
          resources:
            requests: {cpu: 500m, memory: 1Gi}
            limits: {cpu: "2", memory: 4Gi}
# grpc-deployment.yaml: the relay
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cmdop-grpc
  namespace: cmdop
spec:
  replicas: 2
  selector:
    matchLabels: {app: cmdop-grpc}
  template:
    metadata:
      labels: {app: cmdop-grpc}
    spec:
      containers:
        - name: grpc
          image: ghcr.io/cmdop/cmdop-grpc:latest
          ports:
            - {containerPort: 50051, name: grpc}
          envFrom:
            - configMapRef: {name: cmdop-config}
            - secretRef: {name: cmdop-secrets}
          resources:
            requests: {cpu: 500m, memory: 1Gi}
            limits: {cpu: "2", memory: 4Gi}
For grpc_server, use grpc_health_probe for liveness and readiness rather than an HTTP probe (the gRPC Health-Checking protocol is registered). Run the worker as a third Deployment with no Service and no ports.
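Kubernetes 1.24 and later can also speak the gRPC Health-Checking protocol natively from the kubelet, which avoids shipping the grpc_health_probe binary in the image. A minimal sketch of the probe stanza for the grpc container, assuming the health service is served on the relay port (50051) without TLS; the periods simply mirror the api_server probes and are illustrative:

```yaml
# Fragment of grpc-deployment.yaml: place under the "grpc" container entry.
# Assumption: the gRPC health service listens on the relay port (50051).
livenessProbe:
  grpc:
    port: 50051
  periodSeconds: 10
readinessProbe:
  grpc:
    port: 50051
  periodSeconds: 5
```

The built-in gRPC probe takes no TLS or authentication parameters, so if the relay port requires TLS inside the pod, an exec probe that runs grpc_health_probe remains the safer choice.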
How do I expose the services?
# services.yaml
apiVersion: v1
kind: Service
metadata:
  name: cmdop-api
  namespace: cmdop
  labels: {app: cmdop-api}   # a ServiceMonitor selects Services by label
spec:
  selector: {app: cmdop-api}
  ports:
    - {name: http, port: 80, targetPort: 8000}
---
apiVersion: v1
kind: Service
metadata:
  name: cmdop-grpc
  namespace: cmdop
  labels: {app: cmdop-grpc}
spec:
  selector: {app: cmdop-grpc}
  ports:
    - {name: grpc, port: 50051, targetPort: 50051}
How do I configure Ingress with TLS?
REST control plane:
# api-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cmdop-api
  namespace: cmdop
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - {hosts: [api.example.com], secretName: cmdop-api-tls}
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend: {service: {name: cmdop-api, port: {number: 80}}}
gRPC relay (needs a gRPC-aware backend protocol):
# grpc-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cmdop-grpc
  namespace: cmdop
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
spec:
  ingressClassName: nginx
  tls:
    - {hosts: [grpc.example.com], secretName: cmdop-grpc-tls}
  rules:
    - host: grpc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend: {service: {name: cmdop-grpc, port: {number: 50051}}}
How do I deploy PostgreSQL and Redis?
Prefer managed offerings (RDS / Cloud SQL / Aiven for Postgres; ElastiCache / Memorystore / Upstash for Redis). If you must run them in-cluster, CloudNativePG and a Redis operator are reasonable choices. Point CMDOP_DATABASE_URL, CMDOP_ADMIN_DATABASE_URL, and CMDOP_REDIS_URL at them. With PgBouncer, use transaction mode (RLS uses SET LOCAL, which transaction mode supports).
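If you do run Postgres in-cluster with CloudNativePG, a Cluster resource along these lines could serve as a starting point. This is a sketch under assumptions: the operator is already installed, and the resource name cmdop-pg, the instance count, the storage size, and the image tag are illustrative values rather than project recommendations.

```yaml
# postgres-cluster.yaml (hypothetical filename; requires the CloudNativePG operator)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cmdop-pg                  # illustrative name
  namespace: cmdop
spec:
  instances: 3                    # one primary plus two replicas
  imageName: ghcr.io/cloudnative-pg/postgresql:16   # assumed tag; pin an exact version
  storage:
    size: 20Gi                    # illustrative; size for your workload
  bootstrap:
    initdb:
      database: cmdop
      owner: cmdop_admin          # matches the admin user in CMDOP_ADMIN_DATABASE_URL
```

The operator exposes the primary through a Service named cmdop-pg-rw, so the database URLs would use that host in place of postgres. This manifest does not create the cmdop_app role; provision it separately before running migrations.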
How do I configure autoscaling?
Scale the relay on active gRPC streams, not CPU: a relay can hold thousands of mostly-idle streams while CPU stays low. Export the relay's stream metrics and drive an HPA (via Prometheus Adapter or KEDA) off the average number of active streams per pod. The manifest in this section targets an average of 500 streams per pod.
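An HPA with a Pods metric reads cmdop_grpc_active_streams through the custom metrics API, so something has to publish the metric there. With Prometheus Adapter, a rule along these lines could do it. This is a sketch that assumes Prometheus already scrapes the relay pods and that the series carries namespace and pod labels; neither is confirmed by this page.

```yaml
# Fragment of the Prometheus Adapter rules configuration (assumed setup).
rules:
  - seriesQuery: 'cmdop_grpc_active_streams{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: namespace}   # map the label to the Namespace resource
        pod: {resource: pod}               # map the label to the Pod resource
    name:
      matches: "^(.*)$"
      as: "${1}"                           # expose the metric under its original name
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

With the metric published per pod, the HPA manifest that follows can target its average value.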
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: {name: cmdop-grpc, namespace: cmdop}
spec:
  scaleTargetRef: {apiVersion: apps/v1, kind: Deployment, name: cmdop-grpc}
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric: {name: cmdop_grpc_active_streams}
        target: {type: AverageValue, averageValue: "500"}
How do I apply everything?
kubectl apply -f namespace.yaml
kubectl apply -f secrets.yaml
kubectl apply -f configmap.yaml
kubectl apply -f migrate-job.yaml
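# kubectl wait times out after 30s by default; raise the limit to cover your migration time
kubectl wait --for=condition=complete --timeout=300s job/cmdop-migrate -n cmdop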
kubectl apply -f api-deployment.yaml -f grpc-deployment.yaml -f services.yaml
kubectl apply -f api-ingress.yaml -f grpc-ingress.yaml
How do I verify the deployment?
kubectl get pods -n cmdop
kubectl get svc,ingress -n cmdop
kubectl logs -n cmdop -l app=cmdop-grpc -f
How do I set up monitoring?
The api_server serves /metrics on :8000. Scrape it with a ServiceMonitor (keep it off the public Ingress):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata: {name: cmdop-api, namespace: cmdop}
spec:
  selector:
    matchLabels: {app: cmdop-api}   # matched against labels on the Service, not the pods
  endpoints:
    - {port: http, path: /metrics}
After every rollout, run just audit-rls against the database to confirm RLS coverage.
How do I troubleshoot Kubernetes issues?
kubectl describe pod -n cmdop -l app=cmdop-grpc
kubectl logs -n cmdop -l app=cmdop-grpc --previous
kubectl exec -it -n cmdop deploy/cmdop-api -- sh
What should I read next?
- Docker — the simpler single-node Compose path
- Self-Hosted — config, TLS, and the scale ladder