Kubernetes Deployment

TL;DR

Deploy the open-source CMDOP server on Kubernetes for scale. Run the control plane (api_server, :8000) and the relay (grpc_server, :50051) as separate Deployments backed by managed PostgreSQL and Redis. Expose REST over an Ingress with TLS via cert-manager, and the gRPC relay over a gRPC-aware Ingress. Use /healthz/live and /healthz/ready probes, and autoscale the relay on active gRPC streams rather than CPU.

Kubernetes is the scale tier (L3 in the scale ladder — up to ~50,000 agents). The server is a multi-process Python stack, so run each process as its own Deployment rather than as a single container.

There is no published Helm chart yet. Deploy with the manual manifests below, or adapt the OSS deploy/compose.oss.yml with a Compose-to-Kubernetes tool. At this scale, use managed Postgres and Redis rather than in-cluster single instances.

What are the prerequisites?

  • Kubernetes 1.24+
  • kubectl configured
  • PostgreSQL 16 and Redis 7, preferably managed; if you run them in-cluster, use operators such as CloudNativePG
  • cert-manager (for TLS) and an Ingress controller with gRPC support

How do I create the namespace?

    # namespace.yaml
    apiVersion: v1
    kind: Namespace
    metadata:
      name: cmdop

How do I configure secrets?

    # secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: cmdop-secrets
      namespace: cmdop
    type: Opaque
    stringData:
      CMDOP_DATABASE_URL: postgresql+asyncpg://cmdop_app:PASS@postgres:5432/cmdop
      CMDOP_ADMIN_DATABASE_URL: postgresql+asyncpg://cmdop_admin:PASS@postgres:5432/cmdop
      CMDOP_REDIS_URL: redis://redis:6379/0
      CMDOP_INTERNAL_SECRET: "<openssl rand -hex 32>"

How do I configure non-secret settings?

    # configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cmdop-config
      namespace: cmdop
    data:
      CMDOP_DB_MODE: standalone
      CMDOP_ENVIRONMENT: prod
      CMDOP_REST_HOST: "0.0.0.0"
      CMDOP_REST_PORT: "8000"
      CMDOP_GRPC_HOST: "0.0.0.0"
      CMDOP_GRPC_PORT: "50051"
      CMDOP_LOG_JSON: "true"

All processes read the same CMDOP_* environment. Mount both the ConfigMap and the Secret via envFrom so each Deployment shares the configuration.

How do I run migrations?

Run alembic upgrade head as a Job (or an init container) before rolling out the Deployments, using the same image as api_server:

    # migrate-job.yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: cmdop-migrate
      namespace: cmdop
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: migrate
              image: ghcr.io/cmdop/cmdop-api:latest
              command: ["alembic", "upgrade", "head"]
              envFrom:
                - configMapRef: {name: cmdop-config}
                - secretRef: {name: cmdop-secrets}

What do the Deployments look like?

Run the control plane and the relay as separate Deployments.

    # api-deployment.yaml: REST control plane
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cmdop-api
      namespace: cmdop
    spec:
      replicas: 2
      selector:
        matchLabels: {app: cmdop-api}
      template:
        metadata:
          labels: {app: cmdop-api}
        spec:
          containers:
            - name: api
              image: ghcr.io/cmdop/cmdop-api:latest
              ports:
                - {containerPort: 8000, name: http}
              envFrom:
                - configMapRef: {name: cmdop-config}
                - secretRef: {name: cmdop-secrets}
              livenessProbe:
                httpGet: {path: /healthz/live, port: 8000}
                periodSeconds: 10
              readinessProbe:
                httpGet: {path: /healthz/ready, port: 8000}
                periodSeconds: 5
              resources:
                requests: {cpu: 500m, memory: 1Gi}
                limits: {cpu: "2", memory: 4Gi}

    # grpc-deployment.yaml: the relay
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cmdop-grpc
      namespace: cmdop
    spec:
      replicas: 2
      selector:
        matchLabels: {app: cmdop-grpc}
      template:
        metadata:
          labels: {app: cmdop-grpc}
        spec:
          containers:
            - name: grpc
              image: ghcr.io/cmdop/cmdop-grpc:latest
              ports:
                - {containerPort: 50051, name: grpc}
              envFrom:
                - configMapRef: {name: cmdop-config}
                - secretRef: {name: cmdop-secrets}
              resources:
                requests: {cpu: 500m, memory: 1Gi}
                limits: {cpu: "2", memory: 4Gi}

The grpc_server registers the standard gRPC Health Checking Protocol, so use grpc_health_probe for its liveness and readiness probes rather than an HTTP probe. Run the worker as a third Deployment with no Service and no ports.
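
The relay probes could look like the fragment below. It is a minimal sketch that slots into the grpc container in grpc-deployment.yaml. It assumes the grpc_health_probe binary is shipped in the image at /bin/grpc_health_probe (the path is an assumption, not something the image is known to guarantee), and the timing values are illustrative.

```yaml
# Fragment for the "grpc" container in grpc-deployment.yaml.
# Assumption: grpc_health_probe exists in the image at this path.
livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:50051"]
  initialDelaySeconds: 10   # illustrative; tune to the relay's startup time
  periodSeconds: 10
readinessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:50051"]
  periodSeconds: 5
```

If you would rather not bundle the binary, Kubernetes also has a built-in gRPC probe (grpc: {port: 50051}) that speaks the same health protocol; it is beta and enabled by default in 1.24 and stable from 1.27.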

How do I expose the services?

    # services.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: cmdop-api
      namespace: cmdop
      labels: {app: cmdop-api}    # lets a ServiceMonitor select this Service by label
    spec:
      selector: {app: cmdop-api}
      ports:
        - {name: http, port: 80, targetPort: 8000}
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: cmdop-grpc
      namespace: cmdop
      labels: {app: cmdop-grpc}
    spec:
      selector: {app: cmdop-grpc}
      ports:
        - {name: grpc, port: 50051, targetPort: 50051}

How do I configure Ingress with TLS?

REST control plane:

    # api-ingress.yaml
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: cmdop-api
      namespace: cmdop
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
    spec:
      ingressClassName: nginx
      tls:
        - {hosts: [api.example.com], secretName: cmdop-api-tls}
      rules:
        - host: api.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend: {service: {name: cmdop-api, port: {number: 80}}}

gRPC relay (needs a gRPC-aware backend protocol):

    # grpc-ingress.yaml
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: cmdop-grpc
      namespace: cmdop
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/backend-protocol: GRPC
    spec:
      ingressClassName: nginx
      tls:
        - {hosts: [grpc.example.com], secretName: cmdop-grpc-tls}
      rules:
        - host: grpc.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend: {service: {name: cmdop-grpc, port: {number: 50051}}}
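
Both Ingress manifests reference a ClusterIssuer named letsencrypt-prod, which has to exist before cert-manager can issue the certificates. The following is a minimal sketch of one, assuming Let's Encrypt with the HTTP-01 challenge served through the nginx Ingress class. The contact email and the account-key Secret name are placeholders chosen for illustration.

```yaml
# cluster-issuer.yaml (hypothetical file name)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod            # must match the cert-manager.io/cluster-issuer annotation
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com          # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key   # placeholder; cert-manager creates this Secret
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx  # older cert-manager releases use "class: nginx" instead
```

HTTP-01 requires that port 80 on both hostnames is reachable from the internet; if it is not, a DNS-01 solver is the usual alternative.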

How do I deploy PostgreSQL and Redis?

Prefer managed offerings (RDS / Cloud SQL / Aiven for Postgres; ElastiCache / Memorystore / Upstash for Redis). If you must run them in-cluster, CloudNativePG and a Redis operator are reasonable choices. Point CMDOP_DATABASE_URL, CMDOP_ADMIN_DATABASE_URL, and CMDOP_REDIS_URL at them. With PgBouncer, use transaction mode (RLS uses SET LOCAL, which transaction mode supports).
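
If you put PgBouncer in front of Postgres, the transaction-mode setting could be expressed as a ConfigMap like the one below. This is a sketch under stated assumptions: the ConfigMap name, listen port, pool sizes, and auth file path are illustrative, and the PgBouncer Deployment that mounts it is not shown.

```yaml
# pgbouncer-config.yaml (hypothetical; not part of the OSS manifests)
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbouncer-config
  namespace: cmdop
data:
  pgbouncer.ini: |
    [databases]
    ; "postgres" is the same placeholder host used in secrets.yaml
    cmdop = host=postgres port=5432 dbname=cmdop

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    ; transaction pooling: a server connection is held for one transaction,
    ; which is sufficient for SET LOCAL because it is transaction-scoped
    pool_mode = transaction
    auth_type = scram-sha-256
    auth_file = /etc/pgbouncer/userlist.txt
    ; illustrative sizes; tune to your Postgres connection limit
    max_client_conn = 1000
    default_pool_size = 20
```

With this in place, CMDOP_DATABASE_URL would point at the PgBouncer Service and port rather than at Postgres directly. Drivers that cache prepared statements may need that cache disabled behind transaction pooling, depending on the driver and PgBouncer version, so verify this before rolling out.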

How do I configure autoscaling?

Scale the relay on active gRPC streams, not CPU: a relay can hold thousands of mostly idle streams while CPU stays low. Expose the relay's stream metrics to the HPA (via Prometheus Adapter or KEDA) and target the average number of active streams per pod:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata: {name: cmdop-grpc, namespace: cmdop}
    spec:
      scaleTargetRef: {apiVersion: apps/v1, kind: Deployment, name: cmdop-grpc}
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Pods
          pods:
            metric: {name: cmdop_grpc_active_streams}
            target: {type: AverageValue, averageValue: "500"}
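
The HPA can only read cmdop_grpc_active_streams if something publishes it through the custom metrics API. With Prometheus Adapter, a rule along these lines would do that. It is a sketch that assumes Prometheus already scrapes the relay and that the series is a gauge carrying namespace and pod labels; neither is confirmed here.

```yaml
# Fragment of the Prometheus Adapter configuration (the "rules" list).
# Assumption: cmdop_grpc_active_streams is a gauge with namespace and pod labels.
rules:
  - seriesQuery: 'cmdop_grpc_active_streams{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: namespace}   # map the label to the Kubernetes resource
        pod: {resource: pod}
    name:
      matches: "cmdop_grpc_active_streams"
      as: "cmdop_grpc_active_streams"      # name the HPA refers to
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

Once the adapter is running, kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 should list the metric, and the HPA will then average it across the relay pods against the 500-stream target.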

How do I apply everything?

    kubectl apply -f namespace.yaml
    kubectl apply -f secrets.yaml
    kubectl apply -f configmap.yaml
    kubectl apply -f migrate-job.yaml
    # kubectl wait defaults to a 30s timeout; allow longer for migrations
    kubectl wait --for=condition=complete --timeout=300s job/cmdop-migrate -n cmdop
    kubectl apply -f api-deployment.yaml -f grpc-deployment.yaml -f services.yaml
    kubectl apply -f api-ingress.yaml -f grpc-ingress.yaml

How do I verify the deployment?

    kubectl get pods -n cmdop
    kubectl get svc,ingress -n cmdop
    kubectl logs -n cmdop -l app=cmdop-grpc -f

How do I set up monitoring?

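The api_server serves /metrics on :8000. Scrape it with a ServiceMonitor and keep /metrics off the public Ingress. A ServiceMonitor selects Services by label, so the cmdop-api Service must carry the app: cmdop-api label: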

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata: {name: cmdop-api, namespace: cmdop}
    spec:
      selector:
        matchLabels: {app: cmdop-api}
      endpoints:
        - {port: http, path: /metrics}

After every rollout, run just audit-rls against the database to confirm RLS coverage.

How do I troubleshoot Kubernetes issues?

    kubectl describe pod -n cmdop -l app=cmdop-grpc
    kubectl logs -n cmdop -l app=cmdop-grpc --previous
    kubectl exec -it -n cmdop deploy/cmdop-api -- sh

  • Docker — the simpler single-node Compose path
  • Self-Hosted — config, TLS, and the scale ladder