Agent Communication

TL;DR

In the execution-state continuity layer, agents on different machines call each other directly through one funnel, the remoteagent client. From a single chat turn an agent can ask_machine (unary), ask_machine_stream (token events), or ask_machines (parallel fan-out) — answers return into the same conversation and audit trail. The relay routes between operators without inbound ports, and the receiver’s permissions.yaml gates each inbound call.

CMDOP agents on different machines can call each other directly. From a single chat turn your laptop’s agent can ask the prod-1 agent to scan logs, then ask db-1 to validate a schema, then aggregate the answers. This is the server-to-server feature.

Why server-to-server matters

The relay sits between every pair of agents in a fleet, so machine-to-machine calls work without inbound ports, port forwarding, or VPNs. The caller’s chat turn is preserved — the answer comes back into the same conversation, with the same audit trail.

The single funnel: `remoteagent`

Every cross-machine agent call goes through one client (internal/connect/remoteagent/client.go). The flow:

1. Resolve the fleet (CLI flag → env → named fleet → active → legacy → OAuth).
2. Resolve the target machine (UUID → hostname → name → fuzzy prefix).
3. Check that the target is online and abort fast if it is not.
4. Dial the relay and set the target machine ID on the connection.
5. Call AgentService.Run (unary) or AgentService.RunStream (token stream).
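The target-resolution order in the flow above can be sketched as follows. This is an illustrative Python sketch, not the Go code in client.go: the `Machine` record, its field names, the `ResolveError` type, and the choice to match the fuzzy prefix against the hostname are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    # Hypothetical record; the field names are assumptions for this sketch.
    uuid: str
    hostname: str
    name: str


class ResolveError(Exception):
    """Unknown or ambiguous target (the resolve_error class)."""


def resolve_target(query, fleet):
    # Try exact matches in the documented order: UUID, hostname, name.
    for field in ("uuid", "hostname", "name"):
        matches = [m for m in fleet if getattr(m, field) == query]
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            raise ResolveError(f"ambiguous {field}: {query!r}")
    # Last resort: fuzzy prefix, assumed here to apply to the hostname.
    matches = [m for m in fleet if m.hostname.startswith(query)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ResolveError(f"unknown machine: {query!r}")
    raise ResolveError(f"ambiguous prefix: {query!r}")
```

With a fleet containing prod-1 and prod-2, the query `prod-1` resolves by exact hostname, while `prod` matches both by prefix and is rejected as ambiguous.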

Timeouts are clamped to [1 ms, 600 s], default 120 s. The same funnel powers the three agent-facing tools below.
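The clamping rule can be illustrated with a few lines of Python. The function and constant names are hypothetical, and treating a missing value as "use the default" is an assumption; the source only states the bounds and the default.

```python
MIN_TIMEOUT_MS = 1            # 1 ms lower bound
MAX_TIMEOUT_MS = 600_000      # 600 s upper bound
DEFAULT_TIMEOUT_MS = 120_000  # 120 s default


def clamp_timeout_ms(requested_ms=None):
    # Assumption: no value supplied means the default applies.
    if requested_ms is None:
        return DEFAULT_TIMEOUT_MS
    # Out-of-range values are pulled back to the nearest bound.
    return max(MIN_TIMEOUT_MS, min(MAX_TIMEOUT_MS, requested_ms))
```

Under this sketch a request for 30000 ms passes through unchanged, while a request for 900000 ms is reduced to 600000 ms.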

Three agent tools that use it

Tool	Shape	When to use
`ask_machine(hostname, prompt)`	Unary, returns final reply	One target, only the final reply matters
`ask_machine_stream(hostname, prompt)`	Stream, emits tokens + tool events	One target, you want UI updates
`ask_machines(hostnames, prompt, timeout_ms?)`	Fan-out, parallel goroutines	Many targets, compare answers

See report 04 §2 for the registered tool catalogue and report 05 §2.2 for the fan-out implementation.

How a single call flows

Fan-out (`ask_machines`) semantics

- Per-host timeout. Range [1, 300] s, default 120 s.
- Total deadline. Range [1, 600] s, default 240 s.
- Dedup. Hostnames are deduplicated while preserving insertion order.
- Result map. Keyed by hostname, in deterministic order. Each entry is one of:
  - Response: success.
  - RemoteError: the target agent ran but reported an error.
  - Error: our side could not reach the target (resolve, dial, offline, timeout).
- Cancellation. Cancelling the parent context drops all in-flight workers; cancelled hosts appear with TimedOut: true.

Source: internal/agent/builtin/tools/connecttool/ask_machines.go:79–237.
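The semantics above can be approximated in a short Python sketch. It is not the implementation in ask_machines.go: Python threads stand in for goroutines, only the total deadline is modelled (a per-host timeout is assumed to live inside the hypothetical `ask` callable), and the entry field names are borrowed from the example result map in "Example calls".

```python
from concurrent.futures import ThreadPoolExecutor, wait


class RemoteError(Exception):
    """The target agent ran but reported an error."""


def dedup(hostnames):
    # dict keys keep insertion order, so the first occurrence wins.
    return list(dict.fromkeys(hostnames))


def fan_out(hostnames, prompt, ask, total_deadline_s=240.0):
    """ask(host, prompt) is a hypothetical callable: it returns the reply,
    raises RemoteError if the target ran but failed, and raises any other
    exception if the target could not be reached."""
    hosts = dedup(hostnames)
    pool = ThreadPoolExecutor(max_workers=max(1, len(hosts)))
    futures = {host: pool.submit(ask, host, prompt) for host in hosts}
    done, _pending = wait(list(futures.values()), timeout=total_deadline_s)
    results = {}
    for host in hosts:  # iterate hosts, not futures, for a stable order
        fut = futures[host]
        if fut not in done:
            results[host] = {"ok": False, "error": "timeout", "timed_out": True}
            continue
        try:
            results[host] = {"ok": True, "response": fut.result()}
        except RemoteError as exc:
            results[host] = {"ok": False, "remote_error": str(exc)}
        except Exception as exc:
            results[host] = {"ok": False, "error": str(exc)}
    # Do not block on stragglers; Python cannot interrupt a running thread.
    pool.shutdown(wait=False, cancel_futures=True)
    return results
```

Every deduplicated host gets exactly one entry, so a caller can always tell a host that was unreachable from one that answered with an error.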

Error taxonomy

Class	Cause	Where to look
`resolve_error`	unknown or ambiguous hostname	Machine Identity
`offline`	target `is_online=false`	`cmdop agent status` on the target
`dial_error`	network / TLS to relay	local relay logs, `cmdop agent logs -f`
`auth_error`	no API key, OAuth expired	`cmdop login`
`remote_error`	target agent ran but failed	target machine’s logs
`timeout`	per-host or total deadline fired	raise `timeout_ms` or narrow the hostname list

Permission gate fires on the receiver

The caller’s outgoing ask_machine is not gated locally, because the caller is the operator. The receiver’s permissions.yaml decides which inbound tools may execute. See Permissions.

Self-to-self calls (same OAuth identity, verified server-side via CallerHostname) bypass the gate by design. To hard-gate every inbound call, run the receiver under a different account.
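The decision the receiver makes can be sketched in Python as below. This is a simplified model under stated assumptions: the real check happens server-side via CallerHostname, the schema of permissions.yaml is not described here, and the set of allowed tool names is a hypothetical stand-in for it.

```python
def inbound_allowed(caller_identity, receiver_identity, tool, allowed_tools):
    # Self-to-self: the same OAuth identity bypasses the gate by design.
    if caller_identity == receiver_identity:
        return True
    # Otherwise the receiver's own policy decides (hypothetical allowlist).
    return tool in allowed_tools
```

Running the receiver under a different account makes the first branch unreachable, so every inbound call falls through to the policy check.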

Streaming

ask_machine_stream emits typed events:

Event	Meaning
`TOKEN`	Next text fragment from the LLM.
`TOOL_START`	A tool call is about to run on the target.
`TOOL_END`	The tool finished; payload includes result snippet.
`THINKING`	Provider thinking marker (when supported).
`ERROR`	A non-fatal error during the run.
`HANDOFF`	The agent delegated to a subagent.
`CANCELLED`	The stream was cancelled by the caller.
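A consumer of these events might fold them into a summary as in the Python sketch below. The `(event_type, payload)` tuple shape is an assumption; the source lists the event names but not the wire format.

```python
def consume(events):
    """Fold a stream of (event_type, payload) pairs into a summary."""
    text, tools, errors = [], [], []
    cancelled = False
    for kind, payload in events:
        if kind == "TOKEN":
            text.append(payload)      # next text fragment
        elif kind == "TOOL_START":
            tools.append(payload)     # a tool is about to run
        elif kind == "ERROR":
            errors.append(payload)    # non-fatal, so keep reading
        elif kind == "CANCELLED":
            cancelled = True
            break                     # nothing more to read
        # TOOL_END, THINKING and HANDOFF are ignored in this sketch.
    return {
        "text": "".join(text),
        "tools": tools,
        "errors": errors,
        "cancelled": cancelled,
    }
```

Because `ERROR` is documented as non-fatal, the loop records it and continues; only `CANCELLED` ends the read.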

As of 2026-04-26 the daemon ships per-token events. The desktop direct-pipe path used unary Ask in the interim and is being switched back to streaming. See report 05 open-question 1.

Example calls


// ask_machine — single target, unary
{
  "tool": "ask_machine",
  "args": {
    "hostname": "prod-1",
    "prompt": "Show last 50 lines of /var/log/nginx/error.log"
  }
}


// ask_machines — fan-out across three hosts
{
  "tool": "ask_machines",
  "args": {
    "hostnames": ["prod-1", "prod-2", "prod-3"],
    "prompt": "uptime",
    "timeout_ms": 30000
  }
}

A fan-out result map with mixed outcomes (one success, one remote error, one timeout) looks like:


{
  "prod-1": { "ok": true, "response": "up 12 days, load 0.41" },
  "prod-2": { "ok": false, "remote_error": "connect_session: read timeout" },
  "prod-3": { "ok": false, "error": "timeout", "timed_out": true }
}
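A caller will usually want to split such a map by outcome. The Python sketch below does that using only the field names visible in the example above; the bucket names are hypothetical.

```python
def summarize(results):
    """Split a fan-out result map into three buckets by outcome."""
    summary = {"succeeded": [], "remote_failed": [], "unreachable": []}
    for host, entry in results.items():
        if entry.get("ok"):
            summary["succeeded"].append(host)
        elif "remote_error" in entry:
            # The target agent ran but reported an error.
            summary["remote_failed"].append(host)
        else:
            # Our side could not reach the target, or the deadline fired.
            summary["unreachable"].append(host)
    return summary
```

Applied to the example map, prod-1 lands in `succeeded`, prod-2 in `remote_failed`, and prod-3 in `unreachable`.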

When to use which tool

One target, only the final reply matters → ask_machine.
One target, want UI tokens → ask_machine_stream.
Many targets, want to compare answers → ask_machines.
Many targets but you need a sequential pattern (rolling deploy) → loop over ask_machine, not ask_machines. Fan-out is parallel by design.
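The sequential pattern from the last item can be sketched in Python as a plain loop. Here `ask_machine` is a hypothetical callable standing in for the tool, and stopping at the first failure is an assumption that suits a rolling deploy rather than documented behaviour.

```python
def rolling(hostnames, prompt, ask_machine):
    """Call hosts one at a time and stop at the first failure."""
    completed = []
    for host in hostnames:
        try:
            reply = ask_machine(host, prompt)
        except Exception as exc:
            # Later hosts are left untouched so the rollout can be inspected.
            return {"completed": completed, "failed": host, "error": str(exc)}
        completed.append((host, reply))
    return {"completed": completed, "failed": None, "error": None}
```

Unlike the parallel fan-out, a failure on one host here prevents the prompt from ever reaching the hosts after it.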

TAGS: agent-communication, ask_machine, ask_machines, fan-out, server-to-server
DEPENDS_ON: [machine-identity, permissions, daemon]
