Follow-up: device tools in DURABLE (Temporal) runs (a.k.a. "P5")
Status: NOT built (deliberately deferred). Device agents (the per-user Linux machines that announce coding tools) are fully wired for inline chat runs — registry, WS connectivity, the device toolset, the approval/policy gate, and the global tool guard all work and are deployed. This document records the one remaining gap and exactly how to close it, so it can be picked up later.
What works today (inline)
A run started interactively executes inline (an asyncio task in the FastAPI process,
streamed over SSE). There, ToolsetAssembler.assemble() reads chat.run_config["devices"],
fetches the owned + online devices, and appends a build_device_toolset(...) per device
(backend/src/personal_agent/agent/device_toolset.py). Each tool call goes through
agent/device_policy.gate_device_call (autonomous / allow-rule / judge / human approval) and
dispatches over the in-process DeviceGateway (realtime/device_gateway.py).
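
The tiers listed above can be read as an ordered fall-through. The sketch below illustrates that reading only: the evaluation order, the DevicePolicy shape, the glob-style allow rules, and the judge callable are all assumptions made for illustration, not the actual code in agent/device_policy.py.

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatch
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"


@dataclass
class DevicePolicy:
    # Hypothetical shape, for illustration only.
    autonomous: bool = False
    # Assumed rule format: glob patterns matched against "tool:argument".
    allow_rules: list[str] = field(default_factory=list)


def gate_sketch(
    policy: DevicePolicy,
    tool: str,
    argument: str,
    judge: Callable[[str, str], bool],
) -> Decision:
    """Walk the tiers in order; the first tier that approves wins."""
    if policy.autonomous:
        return Decision.ALLOW
    call = f"{tool}:{argument}"
    if any(fnmatch(call, rule) for rule in policy.allow_rules):
        return Decision.ALLOW
    if judge(tool, argument):
        return Decision.ALLOW
    # Nothing approved the call automatically, so a human has to decide.
    return Decision.NEEDS_HUMAN
```

Under these assumptions, a call is escalated to a human only when every cheaper tier declines it.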
What's missing (durable)
A durable run executes as a Temporal workflow in worker/. Per Frozen Contract #6, the
durable path does NOT query live DB state mid-run — toolsets are snapshotted into
RunSpec.toolsets (packages/personal-agent-contracts/.../runspec.py ToolsetSnapshot) at start and
the worker rebuilds them from the snapshot via DynamicToolsets
(worker/src/personal_agent_worker/integration_toolsets.py, registered in agents.py).
Device tools are not in that snapshot, so a durable run with a device selected silently has
no device tools. This only happens when a user sends a chat run "in the background" with a
device selected — triggered workflows/triage never set run_config.devices (and must not: they run over
untrusted content), so there is no functional regression, just an unsupported edge case.
How to build it
- Snapshot. In runspec.py, add DeviceSnapshot(BaseModel, frozen) with device_id: str and announced_tools: dict (frozen JSON schemas, per Contract #6), plus ToolsetSnapshot.devices: tuple[DeviceSnapshot, ...] = (). Fill it in backend/src/personal_agent/api/routers/runs.py::_integration_snapshot, mirroring the integration snapshot: for each owned, online device in cfg["devices"], freeze its announced_tools.
- Deps. In backend/src/personal_agent/agent/deps.py, add device_ids: list[str] = [] to PersonalAgentDeps (set on the durable path alongside integration_entry_ids).
- Worker toolset. In worker/src/personal_agent_worker/integration_toolsets.py, add device_dynamic_toolset() (mirroring integration_dynamic_toolset) whose in-activity _build rebuilds build_device_toolset(...) from the snapshot. Take the tool schemas from the frozen announced_tools, not from a live DB query. build_device_toolset needs a lightweight device-like object carrying id, name, policy_mode, and announced_tools; fetching the device row once in-activity by id is acceptable, since an activity may do I/O, but prefer the snapshot for the tool schemas. Register the toolset in worker/src/personal_agent_worker/agents.py alongside the other dynamic toolsets.
- Worker gateway + gate. The worker holds no device WS connections (those live on the API pods), so it needs its own DeviceGateway(redis, pubsub_redis, pod_id) instance, wired in the worker's resource bootstrap. dispatch will always take the cross-pod path (_dispatch_remote): it publishes on personal_agent:device:<id>:rpc, the API pod holding the WS forwards the call, and the reply comes back on the per-request reply channel. The approval gate (gate_device_call / request_tool_approval) already works from any process: it polls the device_approvals row (shared DB) and pushes the tool_approval frame over Redis user-events, so it works unchanged inside a Temporal activity. The approval wait can last up to 10 minutes, so either heartbeat during the wait or raise the activity's start_to_close_timeout.
- Global guard in durable. GuardToolset (agent/tool_guard.py) is a WrapperToolset; wrap the worker's combined toolset the same way the inline assembler does when GuardConfig is enabled. Either resolve guard enablement once at snapshot time and carry a flag in PersonalAgentDeps, or re-read it in-activity.
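
The snapshot and worker-toolset steps can be sketched roughly as follows. This is a minimal illustration, not the real models: it uses stdlib frozen dataclasses as a stand-in for the frozen Pydantic BaseModel named above, and the shape of the live device records (dicts with id, online, announced_tools) and the helper names snapshot_devices and tool_names_from_snapshot are hypothetical.

```python
import copy
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class DeviceSnapshot:
    device_id: str
    # Tool name -> JSON schema, captured once at run start.
    announced_tools: Mapping[str, Any]


@dataclass(frozen=True)
class ToolsetSnapshot:
    devices: tuple[DeviceSnapshot, ...] = ()


def snapshot_devices(
    selected_ids: list[str],
    live_devices: list[dict[str, Any]],
) -> ToolsetSnapshot:
    """Freeze the announced tools of every selected device that is online."""
    frozen = []
    for device in live_devices:
        if device["id"] in selected_ids and device["online"]:
            # Deep-copy so later changes to the live record cannot leak in,
            # then expose the copy through a read-only mapping.
            tools = MappingProxyType(copy.deepcopy(device["announced_tools"]))
            frozen.append(DeviceSnapshot(device["id"], tools))
    return ToolsetSnapshot(devices=tuple(frozen))


def tool_names_from_snapshot(snapshot: ToolsetSnapshot) -> list[tuple[str, str]]:
    """What the worker would rebuild: (device_id, tool) pairs, no live lookup."""
    return [
        (device.device_id, name)
        for device in snapshot.devices
        for name in device.announced_tools
    ]
```

The point of the deep copy is that a device announcing new tools mid-run does not change what the durable run sees, which is the behaviour the frozen contract asks for.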
Verification when built
test_device_durable (conformance) should show three things: a durable run with a device selected gets the device tools
(inline ≡ durable), a run_command call routes cross-pod to the agent and returns, and the
approval gate works inside the activity. The existing inline device tests must also stay green.
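
The approval check depends on a polling wait that stays alive for up to 10 minutes inside an activity. Below is a minimal sketch of such a wait that a test could drive with fakes instead of a real database or Temporal server. The function name, the status strings, and the injected read_status and heartbeat callables are hypothetical; in a worker built on the Temporal Python SDK, the heartbeat argument would presumably be temporalio.activity.heartbeat.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def wait_for_approval(
    read_status: Callable[[], Awaitable[Optional[str]]],
    heartbeat: Callable[[], None],
    *,
    poll_interval: float = 1.0,
    timeout: float = 600.0,
) -> str:
    """Poll until the approval is decided, heartbeating between polls.

    read_status returns None while the approval is still pending and a
    decision string (for example "approved" or "denied") once it is resolved.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await read_status()
        if status is not None:
            return status
        if loop.time() >= deadline:
            return "timed_out"
        # Signal liveness so a long human wait is not mistaken for a hang.
        heartbeat()
        await asyncio.sleep(poll_interval)
```

Injecting both callables keeps the loop testable: a test can feed a scripted sequence of statuses and count the heartbeats, without waiting in real time.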