Self-Hosted Deployment
Self-host CMDOP by running the open-source server stack on your own infrastructure. It is a multi-process Python stack (a gRPC relay, a REST control plane, and a background worker) sharing one PostgreSQL and one Redis — brought up with Docker Compose. Configure it with CMDOP_* environment variables, terminate TLS at your own reverse proxy, and point agents and SDKs at your server. This is the shipping deployment path and it is fully functional and free.
Run the whole CMDOP mesh on your own network — air-gapped if you want. The relay is open source, so self-hosting gets you the full product with no enforced caps.
The self-hosted server is not a single binary. It is several small Python processes that share Postgres and Redis. Docker Compose runs them together; you do not start them by hand. There is no cmdop-server binary and no single config.yaml — configuration is by environment variables.
What are the prerequisites?
- A Linux host (Ubuntu 22.04+ recommended)
- Docker 24+ and Docker Compose v2 (docker compose version must work)
- PostgreSQL 16 and Redis 7: the OSS Compose file ships both as containers, or point at managed instances
- A reverse proxy for TLS (Caddy / Traefik / Nginx) in front of the stack for any public deployment
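The Docker version requirements above can be checked before installing. The following is a minimal sketch, assuming the usual output shapes of `docker --version` and `docker compose version`; the helper names are illustrative and not part of CMDOP.

```python
import re
import shutil
import subprocess
from typing import List, Optional


def major_version(text: str) -> Optional[int]:
    """Return the first major version number found in CLI version output."""
    match = re.search(r"v?(\d+)\.\d+", text)
    return int(match.group(1)) if match else None


def check_prerequisites() -> List[str]:
    """Return a list of problems; an empty list means the host looks ready."""
    if shutil.which("docker") is None:
        return ["docker is not on PATH"]
    problems = []
    checks = [
        (["docker", "--version"], 24, "Docker"),
        (["docker", "compose", "version"], 2, "Docker Compose"),
    ]
    for cmd, minimum, name in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        found = major_version(result.stdout) if result.returncode == 0 else None
        if found is None or found < minimum:
            problems.append(f"{name} {minimum}+ required, found {found}")
    return problems
```

Calling `check_prerequisites()` on the target host returns an empty list when both tools meet the minimums; it does not check PostgreSQL, Redis, or the reverse proxy.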
What processes make up the stack?
| Process | Port | Role |
|---|---|---|
| grpc_server | 50051 | The live relay: the bidirectional agent stream that routes terminal I/O, files, and tunnels |
| api_server | 8000 | REST control plane: auth, fleets, members, schedules, tunnels, API keys, session metadata; serves /metrics and /healthz |
| worker | none | arq queue: cleanup crons and the schedule executor (talks to Redis, no port) |
| postgres | 5432 | Shared state, RLS-enforced |
| redis | 6379 | Token cache, brute-force counters, arq queue, tunnel registry, PTY output buffers |
The full product can also run a tunnel_server (port-forwarding edge), a ws_gateway (browser realtime push), and an optional jarvis_server (the server-side AI agent). The minimal OSS stack above is enough to register machines and run sessions.
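To confirm that the processes in the table are listening, a plain TCP probe is often enough. This is a sketch under assumptions: the port numbers come from the table, but whether postgres and redis are published on the host depends on your Compose file, and the function names are hypothetical.

```python
import socket
from typing import Dict

# Ports taken from the process table; the worker exposes none.
STACK_PORTS = {
    "grpc_server": 50051,
    "api_server": 8000,
    "postgres": 5432,
    "redis": 6379,
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_stack(host: str = "127.0.0.1") -> Dict[str, bool]:
    """Probe every known port and map process name to reachability."""
    return {name: port_open(host, port) for name, port in STACK_PORTS.items()}
```

A successful connection only shows that something accepted the socket; it says nothing about whether the service behind it is healthy.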
How do I get started quickly?
# Clone the open-source server
git clone https://github.com/cmdop/cmdop-server
cd cmdop-server
# Copy the env template and fill in the two required secrets
cp .env.example .env
Set strong values for the two required secrets in .env:
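# Required. Docker Compose does not expand $(...) inside .env, so run each
# openssl command in a shell and paste its output as the value.
POSTGRES_PASSWORD=<output of: openssl rand -hex 16>
INTERNAL_SECRET=<output of: openssl rand -hex 32>
Bring the stack up. A one-shot migrate container runs alembic upgrade head before api, grpc, and worker start: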
docker compose -f deploy/compose.oss.yml up --build
Once you see api_server.bind in the logs, create the first user:
docker compose -f deploy/compose.oss.yml exec api \
  cmdop-admin user create --email [email protected]
The command prints a bootstrap API key. Save it now; it is shown only once.
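If openssl is not available on the host, the two required secrets from the quick start can be generated with the Python standard library. This is a minimal sketch; the function names are illustrative, and the lengths mirror `openssl rand -hex 16` and `openssl rand -hex 32`.

```python
import secrets
from typing import Dict


def generate_env_secrets() -> Dict[str, str]:
    """Generate random hex values for the two required Compose secrets."""
    return {
        "POSTGRES_PASSWORD": secrets.token_hex(16),  # 16 bytes -> 32 hex chars
        "INTERNAL_SECRET": secrets.token_hex(32),    # 32 bytes -> 64 hex chars
    }


def render_env_lines(values: Dict[str, str]) -> str:
    """Format the values as KEY=value lines suitable for a .env file."""
    return "".join(f"{key}={value}\n" for key, value in values.items())
```

Printing `render_env_lines(generate_env_secrets())` yields two lines that can be pasted into .env in place of the template values.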
How do I configure the server?
All server settings are environment variables read by core.config.settings.Settings (pydantic-settings), prefixed CMDOP_. .env is auto-loaded if present. The most important ones:
# Database — the RLS-applying app role
CMDOP_DATABASE_URL=postgresql+asyncpg://cmdop_app:cmdop@postgres:5432/cmdop
# BYPASSRLS role (alembic, auth lookups, cleanup, health probe)
CMDOP_ADMIN_DATABASE_URL=postgresql+asyncpg://cmdop_admin:cmdop@postgres:5432/cmdop
CMDOP_DB_MODE=standalone # alembic owns DDL for OSS / fresh installs
# Redis
CMDOP_REDIS_URL=redis://redis:6379/0
# Network binds
CMDOP_REST_HOST=0.0.0.0
CMDOP_REST_PORT=8000
CMDOP_GRPC_HOST=0.0.0.0
CMDOP_GRPC_PORT=50051
# Auth — loopback admin pipe secret (required)
CMDOP_INTERNAL_SECRET=... # openssl rand -hex 32
# Observability
CMDOP_ENVIRONMENT=oss
CMDOP_LOG_JSON=true
The OSS Compose file additionally reads two docker-compose substitution variables from .env (these are not Settings fields): POSTGRES_PASSWORD and INTERNAL_SECRET, both required. API_PORT and GRPC_PORT override the published host ports (default 8000 / 50051).
See the open-source server’s docs/configuration.md for every CMDOP_* field (SMTP for OTP delivery, Centrifugo fan-out, OpenTelemetry tracing, worker tuning, shutdown grace).
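A quick pre-flight check can catch a missing or mistyped variable before the containers start. The sketch below is not how the server itself validates settings (that is pydantic-settings); it is a standalone helper. Treating all four variables as mandatory is an assumption made here, since only CMDOP_INTERNAL_SECRET is explicitly marked required above, and accepting `rediss` as a scheme is likewise an assumption.

```python
import os
from typing import List, Mapping
from urllib.parse import urlsplit

# Assumption: these four are checked as mandatory for this sketch.
# A value of None means "opaque secret, no URL scheme to verify".
EXPECTED = {
    "CMDOP_DATABASE_URL": ("postgresql+asyncpg",),
    "CMDOP_ADMIN_DATABASE_URL": ("postgresql+asyncpg",),
    "CMDOP_REDIS_URL": ("redis", "rediss"),
    "CMDOP_INTERNAL_SECRET": None,
}


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in the given environment."""
    problems = []
    for name, schemes in EXPECTED.items():
        value = env.get(name, "")
        if not value or value == "...":
            problems.append(f"{name} is not set")
        elif schemes is not None and urlsplit(value).scheme not in schemes:
            problems.append(f"{name} should use one of: {', '.join(schemes)}")
    return problems
```

Running `check_env(os.environ)` inside the container environment returns an empty list when every expected variable is present and uses a recognized scheme.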
How do I run migrations?
Migrations are handled by the migrate sidecar, which runs alembic upgrade head on every up. To run it explicitly:
docker compose -f deploy/compose.oss.yml up -d migrate
After every upgrade, verify multi-tenant isolation is intact:
just audit-rls
If audit-rls exits non-zero, an expected row-level-security policy is missing; do not run that build in production.
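In a deploy script, the exit-code rule above can be turned into a hard gate. This is a sketch, assuming `just` is on PATH and the command is run from the server repository; `audit_passes` and `gate_release` are hypothetical names.

```python
import subprocess
import sys
from typing import Sequence


def audit_passes(command: Sequence[str] = ("just", "audit-rls")) -> bool:
    """Run the audit command and report whether it exited with status 0."""
    try:
        completed = subprocess.run(list(command))
    except FileNotFoundError:
        # The command itself is missing; treat that as a failed audit.
        return False
    return completed.returncode == 0


def gate_release(command: Sequence[str] = ("just", "audit-rls")) -> int:
    """Return a process exit code: 0 to proceed, 1 to block the rollout."""
    if audit_passes(command):
        print("RLS audit passed")
        return 0
    print("RLS audit failed: do not promote this build", file=sys.stderr)
    return 1
```

Ending the deploy script with `sys.exit(gate_release())` stops the pipeline whenever the audit reports a missing policy.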
How do I set up TLS?
The OSS Compose ships plain HTTP / h2c by design. Terminate TLS at a reverse proxy in front of the stack — gRPC reaches grpc_server over h2c:// on the upstream while the proxy handles the public TLS:
client ─► Caddy / Traefik (TLS) ─► api_server :8000 (HTTP)
                                ─► grpc_server :50051 (h2c)
Traefik example (file provider)
http:
  routers:
    api:
      rule: "Host(`api.example.com`)"
      service: api
      tls: {certResolver: letsencrypt}
    grpc:
      rule: "Host(`grpc.example.com`)"
      service: grpc
      tls: {certResolver: letsencrypt}
  services:
    api:
      loadBalancer:
        servers:
          - url: "http://api:8000"
    grpc:
      loadBalancer:
        servers:
          - url: "h2c://grpc:50051"
        passHostHeader: true
Mutual-TLS between the relay and agents is not built in; the operator terminates TLS at the proxy. The /metrics endpoint on :8000 is unauthenticated by design (Prometheus pull model); keep it off the public internet.
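Because TLS lives entirely at the proxy, it can be useful to confirm from outside that the proxy presents a valid, unexpired certificate for each public hostname. The following sketch uses only the Python standard library; the function names are illustrative, and the hostnames you pass are whatever your proxy actually serves.

```python
import socket
import ssl
import time
from typing import Optional


def days_left(not_after: str, now: Optional[float] = None) -> int:
    """Whole days until a certificate's notAfter timestamp (negative if expired)."""
    current = time.time() if now is None else now
    return int((ssl.cert_time_to_seconds(not_after) - current) // 86400)


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Complete a verified TLS handshake and return the certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_left(fetch_not_after("grpc.example.com"))` raises an `ssl.SSLError` if the chain or hostname does not verify, and otherwise returns the remaining validity in days.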
How do I configure agents to connect?
Point agents at your gRPC endpoint (the public host your TLS proxy fronts):
export CMDOP_SERVER_ADDRESS=grpc.example.com:443
cmdop connect
How do I configure the SDK to use my server?
from cmdop import AsyncCMDOPClient
# Point the SDK at your self-hosted relay instead of the managed cloud
client = AsyncCMDOPClient.remote(
    api_key="cmdop_apikey_...",
    server="grpc.example.com:443",
)
How do I check server health?
# REST liveness / readiness (returns db + redis status)
curl https://api.example.com/healthz/live
curl https://api.example.com/healthz/ready
# → {"ok":true,"db":true,"redis":true}
# gRPC health (the gRPC Health-Checking protocol is registered)
grpc_health_probe -tls -addr=grpc.example.com:443
How do I back up the database?
docker compose -f deploy/compose.oss.yml exec -T postgres \
  pg_dump -U cmdop_admin cmdop > backup.sql
For production, use your Postgres provider's point-in-time recovery rather than rolling your own. Redis state is recoverable (agents reconnect and counters reset cleanly), so a Redis backup is optional.
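For small installs that do take their own dumps, a timestamped filename avoids overwriting the previous backup. This is a sketch, not part of CMDOP: it wraps the same pg_dump invocation shown above, and the `cmdop-<timestamp>.sql` naming scheme is an assumption.

```python
import subprocess
from datetime import datetime, timezone
from typing import List

COMPOSE_FILE = "deploy/compose.oss.yml"


def backup_name(now: datetime) -> str:
    """Build a sortable, timestamped dump filename (hypothetical scheme)."""
    return f"cmdop-{now.strftime('%Y%m%dT%H%M%SZ')}.sql"


def dump_command(compose_file: str = COMPOSE_FILE) -> List[str]:
    """The pg_dump invocation from the text, as an argument list."""
    # -T disables TTY allocation so the dump is not altered by a pseudo-terminal.
    return ["docker", "compose", "-f", compose_file, "exec", "-T",
            "postgres", "pg_dump", "-U", "cmdop_admin", "cmdop"]


def run_backup(directory: str = ".") -> str:
    """Write a dump into directory and return its path; raises on failure."""
    path = f"{directory}/{backup_name(datetime.now(timezone.utc))}"
    with open(path, "wb") as out:
        subprocess.run(dump_command(), stdout=out, check=True)
    return path
```

If pg_dump fails, `run_backup` raises `subprocess.CalledProcessError` and leaves a partial file behind, so a caller should delete that path before retrying.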
How do I upgrade to a new version?
# Pull / rebuild images
docker compose -f deploy/compose.oss.yml pull
docker compose -f deploy/compose.oss.yml build
# Apply migrations (the migrate sidecar runs on every up)
docker compose -f deploy/compose.oss.yml up -d migrate
docker compose -f deploy/compose.oss.yml up -d
# Verify RLS coverage
just audit-rls
How do I scale beyond one node?
The codebase ships single-node-first and stays that way until your numbers say otherwise:
- L1 (single node, ≤ ~50 agents): the OSS Compose file as-is.
- L2 (multiple api/grpc replicas behind a reverse proxy, ≤ ~5,000 agents): move Postgres to a managed provider and add PgBouncer in transaction mode (RLS uses SET LOCAL, which transaction mode supports).
- L3 (Kubernetes, ≤ ~50,000 agents): managed Postgres, Redis cluster, autoscale on active gRPC streams rather than CPU. See Kubernetes.
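The tiers above amount to a threshold lookup on agent count. A small sketch makes the boundaries explicit; the ceilings are the approximate figures from the list, and treating them as hard inclusive limits is a simplification.

```python
# Ceilings are the approximate agent counts quoted in the list above.
TIERS = [
    (50, "L1: single node"),
    (5_000, "L2: replicated api/grpc behind a reverse proxy"),
    (50_000, "L3: Kubernetes"),
]


def tier_for(agents: int) -> str:
    """Map an agent count to the smallest documented tier that covers it."""
    if agents < 0:
        raise ValueError("agent count cannot be negative")
    for ceiling, label in TIERS:
        if agents <= ceiling:
            return label
    return "beyond the documented tiers"
```

For instance, `tier_for(1200)` returns the L2 label. In practice the right moment to move up depends on measured load, not only on the agent count.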
How do I troubleshoot common issues?
# Service status and logs
docker compose -f deploy/compose.oss.yml ps
docker compose -f deploy/compose.oss.yml logs -f grpc
# Database connectivity
docker compose -f deploy/compose.oss.yml exec postgres \
psql -U cmdop_admin -d cmdop -c "SELECT 1"
# Redis connectivity
docker compose -f deploy/compose.oss.yml exec redis redis-cli ping
See Self-Hosted troubleshooting for connection, certificate, and upgrade-failure fixes.
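When triaging, the readiness payload from /healthz/ready (shaped like `{"ok":true,"db":true,"redis":true}`) can be parsed to see which dependency is failing. The sketch below assumes a not-ready server returns the same JSON shape with false values, possibly alongside a non-2xx status; that behavior is not confirmed by this page, and the function names are illustrative.

```python
import json
import urllib.error
import urllib.request
from typing import List


def failing_dependencies(payload: str) -> List[str]:
    """List the keys reported as false in a /healthz/ready JSON body."""
    status = json.loads(payload)
    return sorted(key for key, value in status.items() if value is False)


def fetch_ready(base_url: str, timeout: float = 5.0) -> str:
    """GET <base_url>/healthz/ready and return the body, even on HTTP errors."""
    url = f"{base_url}/healthz/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        # Error responses may still carry the JSON status body.
        return err.read().decode()
```

Calling `failing_dependencies(fetch_ready("https://api.example.com"))` returns an empty list when everything is up, and otherwise names the failing checks so you know whether to look at Postgres or Redis first.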
What should I read next?
- Docker — the recommended Compose-based path
- Kubernetes — cluster deployment
- Pricing & editions — self-hosted vs the planned managed cloud