Self-Hosted Deployment

TL;DR

Self-host CMDOP by running the open-source server stack on your own infrastructure. It is a multi-process Python stack (a gRPC relay, a REST control plane, and a background worker) sharing one PostgreSQL and one Redis — brought up with Docker Compose. Configure it with CMDOP_* environment variables, terminate TLS at your own reverse proxy, and point agents and SDKs at your server. This is the shipping deployment path and it is fully functional and free.

Run the whole CMDOP mesh on your own network — air-gapped if you want. The relay is open source, so self-hosting gets you the full product with no enforced caps.

The self-hosted server is not a single binary. It is several small Python processes that share Postgres and Redis. Docker Compose runs them together; you do not start them by hand. There is no cmdop-server binary and no single config.yaml — configuration is by environment variables.

What are the prerequisites?

  • A Linux host (Ubuntu 22.04+ recommended)
  • Docker 24+ and Docker Compose v2 (docker compose version must work)
  • PostgreSQL 16 and Redis 7 — the OSS Compose file ships both as containers, or point at managed instances
  • A reverse proxy for TLS (Caddy / Traefik / Nginx) in front of the stack for any public deployment

What processes make up the stack?

Process | Port | Role
grpc_server | 50051 | The live relay: the bidi agent stream that routes terminal I/O, files, and tunnels
api_server | 8000 | REST control plane: auth, fleets, members, schedules, tunnels, API keys, session metadata; serves /metrics and /healthz
worker | none | arq queue: cleanup crons and the schedule executor (talks to Redis, no port)
postgres | 5432 | Shared state, RLS-enforced
redis | 6379 | Token cache, brute-force counters, arq queue, tunnel registry, PTY output buffers

The full product can also run a tunnel_server (port-forwarding edge), a ws_gateway (browser realtime push), and an optional jarvis_server (the server-side AI agent). The minimal OSS stack above is enough to register machines and run sessions.

How do I get started quickly?

# Clone the open-source server
git clone https://github.com/cmdop/cmdop-server
cd cmdop-server

# Copy the env template and fill in the two required secrets
cp .env.example .env

Set strong values for the two required secrets in .env:

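# Required: generate strong values.
# Compose does not execute $(...) inside .env, so run each openssl command
# in a shell and paste its output as the value.
POSTGRES_PASSWORD=<output of: openssl rand -hex 16>
INTERNAL_SECRET=<output of: openssl rand -hex 32>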
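If you would rather generate the two secrets from a script, the following is a minimal sketch using Python's standard secrets module. The function names are illustrative and not part of CMDOP; the output has the same shape as openssl rand -hex 16 and openssl rand -hex 32.

```python
import secrets


def generate_oss_secrets() -> dict[str, str]:
    """Return fresh hex values for the two required Compose variables."""
    return {
        # 16 random bytes -> 32 hex characters, like `openssl rand -hex 16`
        "POSTGRES_PASSWORD": secrets.token_hex(16),
        # 32 random bytes -> 64 hex characters, like `openssl rand -hex 32`
        "INTERNAL_SECRET": secrets.token_hex(32),
    }


def render_env_lines(values: dict[str, str]) -> str:
    """Format the values as KEY=value lines suitable for pasting into .env."""
    return "\n".join(f"{key}={value}" for key, value in values.items())


# Usage: print(render_env_lines(generate_oss_secrets()))
```

Paste the printed lines into .env in place of the empty template entries, and store them somewhere safe: regenerating POSTGRES_PASSWORD after the database volume exists will not change the password already set inside Postgres.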

Bring the stack up. A one-shot migrate container runs alembic upgrade head before api, grpc, and worker start:

docker compose -f deploy/compose.oss.yml up --build

Once you see api_server.bind in the logs, create the first user:

docker compose -f deploy/compose.oss.yml exec api \
  cmdop-admin user create --email [email protected]

The command prints a bootstrap API key. Save it now — it is shown only once.

How do I configure the server?

All server settings are environment variables read by core.config.settings.Settings (pydantic-settings), prefixed CMDOP_. .env is auto-loaded if present. The most important ones:

# Database: the RLS-applying app role
CMDOP_DATABASE_URL=postgresql+asyncpg://cmdop_app:cmdop@postgres:5432/cmdop
# BYPASSRLS role (alembic, auth lookups, cleanup, health probe)
CMDOP_ADMIN_DATABASE_URL=postgresql+asyncpg://cmdop_admin:cmdop@postgres:5432/cmdop
CMDOP_DB_MODE=standalone  # alembic owns DDL for OSS / fresh installs

# Redis
CMDOP_REDIS_URL=redis://redis:6379/0

# Network binds
CMDOP_REST_HOST=0.0.0.0
CMDOP_REST_PORT=8000
CMDOP_GRPC_HOST=0.0.0.0
CMDOP_GRPC_PORT=50051

# Auth: loopback admin pipe secret (required)
CMDOP_INTERNAL_SECRET=...  # openssl rand -hex 32

# Observability
CMDOP_ENVIRONMENT=oss
CMDOP_LOG_JSON=true

The OSS Compose file additionally reads two docker-compose substitution variables from .env (these are not Settings fields): POSTGRES_PASSWORD and INTERNAL_SECRET, both required. API_PORT and GRPC_PORT override the published host ports (default 8000 / 50051).
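A quick preflight check can catch a missing or malformed setting before the containers start. The sketch below is not part of CMDOP and does not reproduce the server's own validation. It only inspects the variable names shown above. Treating the three URLs as mandatory is an assumption of this example, since the page marks only CMDOP_INTERNAL_SECRET as required.

```python
from urllib.parse import urlsplit

# Assumption: the expected URL schemes are taken from the example values on
# this page; the real Settings class may accept other forms.
EXPECTED_SCHEMES = {
    "CMDOP_DATABASE_URL": "postgresql+asyncpg",
    "CMDOP_ADMIN_DATABASE_URL": "postgresql+asyncpg",
    "CMDOP_REDIS_URL": "redis",
}


def check_env(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    if not env.get("CMDOP_INTERNAL_SECRET"):
        problems.append("CMDOP_INTERNAL_SECRET is missing or empty")
    for name, scheme in EXPECTED_SCHEMES.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is missing or empty")
        elif urlsplit(value).scheme != scheme:
            problems.append(f"{name} should start with {scheme}://")
    return problems


# Usage: import os; print(check_env(dict(os.environ)) or "environment looks sane")
```

Returning a list rather than raising on the first error lets an operator see every problem in one pass.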

See the open-source server’s docs/configuration.md for every CMDOP_* field (SMTP for OTP delivery, Centrifugo fan-out, OpenTelemetry tracing, worker tuning, shutdown grace).

How do I run migrations?

Migrations are handled by the migrate sidecar, which runs alembic upgrade head on every up. To run it explicitly:

docker compose -f deploy/compose.oss.yml up -d migrate

After every upgrade, verify multi-tenant isolation is intact:

just audit-rls

If audit-rls exits non-zero, an expected row-level-security policy is missing. Do not run that build in production.
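In an automated deploy, the exit code can be used as a hard gate. The following is a minimal sketch, assuming just is on PATH and the script runs from the repository root. The helper names are hypothetical; only the just audit-rls command comes from this page.

```python
import subprocess
import sys


def command_passes(cmd: list[str]) -> bool:
    """Run cmd and report whether it exited with status 0."""
    return subprocess.run(cmd, check=False).returncode == 0


def gate_on_rls_audit(audit_cmd: list[str]) -> int:
    """Return a process exit code: 0 if the audit passed, 1 otherwise."""
    if command_passes(audit_cmd):
        print("RLS audit passed")
        return 0
    print("RLS audit failed: do not promote this build", file=sys.stderr)
    return 1


# Usage in a deploy script:
#   sys.exit(gate_on_rls_audit(["just", "audit-rls"]))
```

Passing the command in as a list keeps the gate testable with a stand-in command and avoids shell quoting issues.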

How do I set up TLS?

The OSS Compose ships plain HTTP / h2c by design. Terminate TLS at a reverse proxy in front of the stack — gRPC reaches grpc_server over h2c:// on the upstream while the proxy handles the public TLS:

client ─► Caddy / Traefik (TLS) ─► api_server  :8000  (HTTP)
                                 ─► grpc_server :50051 (h2c)

Traefik example (file provider)

http:
  routers:
    api:
      rule: "Host(`api.example.com`)"
      service: api
      tls: {certResolver: letsencrypt}
    grpc:
      rule: "Host(`grpc.example.com`)"
      service: grpc
      tls: {certResolver: letsencrypt}
  services:
    api:
      loadBalancer:
        servers:
          - url: "http://api:8000"
    grpc:
      loadBalancer:
        servers:
          - url: "h2c://grpc:50051"
        passHostHeader: true

Mutual-TLS between the relay and agents is not built in — the operator terminates TLS at the proxy. The /metrics endpoint on :8000 is unauthenticated by design (Prometheus pull model); keep it off the public internet.

How do I configure agents to connect?

Point agents at your gRPC endpoint (the public host your TLS proxy fronts):

export CMDOP_SERVER_ADDRESS=grpc.example.com:443
cmdop connect

How do I configure the SDK to use my server?

from cmdop import AsyncCMDOPClient

# Point the SDK at your self-hosted relay instead of the managed cloud
client = AsyncCMDOPClient.remote(
    api_key="cmdop_apikey_...",
    server="grpc.example.com:443",
)

How do I check server health?

# REST liveness / readiness (returns db + redis status)
curl https://api.example.com/healthz/live
curl https://api.example.com/healthz/ready
# → {"ok":true,"db":true,"redis":true}

# gRPC health (the gRPC Health-Checking protocol is registered)
# -tls is needed because the public endpoint on :443 is TLS-terminated by the proxy
grpc_health_probe -tls -addr=grpc.example.com:443
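For monitoring scripts, the readiness payload can be checked programmatically. This is a minimal sketch using only the Python standard library. It assumes the response body has the {"ok", "db", "redis"} shape shown above, and it treats any network or HTTP error as "not ready". How the server reports a failing dependency (status code, extra fields) is not documented here, so that part is an assumption.

```python
import json
import urllib.request


def is_ready(body: str) -> bool:
    """True only if the readiness payload reports ok, db, and redis as true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    return all(payload.get(key) is True for key in ("ok", "db", "redis"))


def fetch_ready(base_url: str, timeout: float = 5.0) -> bool:
    """GET <base_url>/healthz/ready and interpret the JSON body."""
    url = base_url.rstrip("/") + "/healthz/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_ready(resp.read().decode("utf-8"))
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False


# Usage: fetch_ready("https://api.example.com")
```

Keeping the parsing in is_ready separate from the network call makes the logic easy to unit-test without a running server.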

How do I back up the database?

# -T disables TTY allocation so the redirected dump is not altered
docker compose -f deploy/compose.oss.yml exec -T postgres \
  pg_dump -U cmdop_admin cmdop > backup.sql

For production, use your Postgres provider’s point-in-time recovery rather than rolling your own. Redis state is recoverable — agents reconnect and counters reset cleanly — so a Redis backup is optional.
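For small installs that do rely on pg_dump, a timestamped file per run avoids overwriting the previous backup. The sketch below wraps the command from this page. The file naming scheme and helper names are illustrative assumptions, and it expects to run from the repository root with Docker available.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

COMPOSE_FILE = "deploy/compose.oss.yml"


def dump_command() -> list[str]:
    """The pg_dump invocation from this page; -T disables TTY allocation."""
    return [
        "docker", "compose", "-f", COMPOSE_FILE,
        "exec", "-T", "postgres",
        "pg_dump", "-U", "cmdop_admin", "cmdop",
    ]


def backup_path(directory: Path, now: datetime) -> Path:
    """Build a timestamped file name such as cmdop-20250102T030405Z.sql."""
    return directory / f"cmdop-{now.strftime('%Y%m%dT%H%M%SZ')}.sql"


def run_backup(directory: Path) -> Path:
    """Stream pg_dump output into a new timestamped file and return its path."""
    target = backup_path(directory, datetime.now(timezone.utc))
    with target.open("wb") as out:
        # check=True raises CalledProcessError if pg_dump exits non-zero.
        subprocess.run(dump_command(), stdout=out, check=True)
    return target


# Usage: run_backup(Path("backups"))  # the directory must already exist
```

If the dump fails, the partially written file is left on disk, so a caller should delete it or treat the raised exception as a failed backup.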

How do I upgrade to a new version?

# Pull / rebuild images
docker compose -f deploy/compose.oss.yml pull
docker compose -f deploy/compose.oss.yml build

# Apply migrations (the migrate sidecar runs on every up)
docker compose -f deploy/compose.oss.yml up -d migrate
docker compose -f deploy/compose.oss.yml up -d

# Verify RLS coverage
just audit-rls

How do I scale beyond one node?

The codebase is single-node-first. Stay on one node until your agent count or load measurements show that you need more:

  • L1 — single node (≤ ~50 agents): the OSS Compose file as-is.
  • L2 — multiple api / grpc replicas behind a reverse proxy (≤ ~5,000 agents): move Postgres to a managed provider, add PgBouncer in transaction mode (RLS uses SET LOCAL, which transaction mode supports).
  • L3 — Kubernetes (≤ ~50,000 agents): managed Postgres, Redis cluster, autoscale on active gRPC streams rather than CPU. See Kubernetes.
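The tiers above can be expressed as a simple lookup for capacity-planning scripts. This is only a sketch: the thresholds on this page are approximate, so treat the boundaries as guidance rather than hard limits. The function name is hypothetical.

```python
def scaling_tier(agent_count: int) -> str:
    """Map an agent count to the approximate L1/L2/L3 tiers listed above."""
    if agent_count < 0:
        raise ValueError("agent_count must be non-negative")
    if agent_count <= 50:
        return "L1"  # single node, OSS Compose file as-is
    if agent_count <= 5_000:
        return "L2"  # multiple api / grpc replicas, managed Postgres, PgBouncer
    if agent_count <= 50_000:
        return "L3"  # Kubernetes, managed Postgres, Redis cluster
    return "beyond L3"  # outside the ranges this page describes
```

For example, scaling_tier(1200) returns "L2", which suggests planning for replicas behind a reverse proxy and PgBouncer in transaction mode.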

How do I troubleshoot common issues?

# Service status and logs
docker compose -f deploy/compose.oss.yml ps
docker compose -f deploy/compose.oss.yml logs -f grpc

# Database connectivity
docker compose -f deploy/compose.oss.yml exec postgres \
  psql -U cmdop_admin -d cmdop -c "SELECT 1"

# Redis connectivity
docker compose -f deploy/compose.oss.yml exec redis redis-cli ping

See Self-Hosted troubleshooting for connection, certificate, and upgrade-failure fixes.
