ADR-039: Global per-tool availability state 

Status

Accepted

Date

2026-06-30

Authors

Netresearch DTT GmbH

Context 

The tool runtime (ADR-038) gates which tools a single agent run may call through two mechanisms:

  • each ToolInterface declares isEnabledByDefault() — a compile-time default (e.g. read-only tools ship on, mutating ones ship off);
  • every run carries a per-request allow-list (the skill's allowed-tools or the playground selection), so a run only ever sees the subset it asked for.

What was missing was an operator control: an administrator could not globally turn a registered tool off for the whole instance. A tool shipping isEnabledByDefault() === true was callable by every run that allow-listed it, with no site-wide kill switch, and a default-off tool could not be switched on without a code change. Neither the per-tool default nor the per-run allow-list is the right seam for "this instance does not permit get_env at all".

Decision 

Introduce a global, per-tool availability override that sits above the per-tool default and below the per-run allow-list.

  • Storage — a dedicated table tx_nrllm_tool_state (tool_name unique, enabled boolean). It has no TCA and no FormEngine UI: it is operational state toggled from the backend, not editorial content edited as a record. A missing row falls back to the tool's isEnabledByDefault(), so the table only ever holds explicit admin overrides.
  • Repository — ToolStateRepository exposes overrides() (the sparse override map) and setEnabled(name, bool) (upsert one override).
  • Effective-state service — ToolAvailabilityService computes the authoritative "what may run at all" set: for every registered tool the effective state is its admin override when one exists, otherwise its isEnabledByDefault(). enabledNames() returns the enabled subset; states() returns the full name / description / enabled / defaultEnabled rows the backend renders.
  • Runtime enforcement — ToolLoopService intersects every per-run allow-list with enabledNames(), so a globally-disabled tool can never be invoked regardless of what a skill or the playground requested. This is the same defense-in-depth layering as the acting-user RBAC intersection in ADR-038: the allow-list narrows, it never widens.
  • Backend surface — the toggles are rendered and persisted by the dedicated Tools backend module (ToolController), split out from the interactive Playground module so managing availability and running the agent loop are separate admin concerns (see the two-module split). toggleToolAction() is admin-guarded (ADR-037) and writes through ToolStateRepository::setEnabled().
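
The resolution and enforcement rules above can be modelled in a few lines. The following is a minimal sketch in Python, not the extension's PHP code: the Tool dataclass, the function names, and the tool names read_page and write_file are hypothetical stand-ins chosen for illustration (only get_env appears in this ADR), and the real ToolAvailabilityService and ToolLoopService may be structured differently.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for a registered ToolInterface implementation."""
    name: str
    enabled_by_default: bool  # mirrors isEnabledByDefault()


def effective_states(tools: list[Tool], overrides: dict[str, bool]) -> dict[str, bool]:
    """An admin override wins when present; otherwise the shipped default applies."""
    return {t.name: overrides.get(t.name, t.enabled_by_default) for t in tools}


def enabled_names(tools: list[Tool], overrides: dict[str, bool]) -> set[str]:
    """The globally enabled subset (the role enabledNames() plays in the ADR)."""
    return {name for name, on in effective_states(tools, overrides).items() if on}


def runnable_tools(allow_list: Iterable[str], tools: list[Tool],
                   overrides: dict[str, bool]) -> set[str]:
    """Intersect the per-run allow-list with the globally enabled set."""
    return set(allow_list) & enabled_names(tools, overrides)


# Assumed example data: two default-on tools and one default-off tool.
REGISTRY = [
    Tool("read_page", enabled_by_default=True),
    Tool("get_env", enabled_by_default=True),
    Tool("write_file", enabled_by_default=False),
]

# Sparse override map: the admin disabled get_env and enabled write_file.
OVERRIDES = {"get_env": False, "write_file": True}
```

With this data, a run that allow-lists all three tools is left with read_page and write_file: the override removes get_env even though it ships enabled, and it admits write_file even though it ships disabled. A name in the allow-list that is not registered at all is dropped by the same intersection.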

Consequences 

  • Administrators get a site-wide kill switch per tool, independent of code defaults and of any individual run's allow-list.
  • Availability resolves in two steps. First, the effective global state is the admin override when one exists, otherwise the compile-time default, so an override can enable a default-off tool or disable a default-on one: it replaces the default rather than merely narrowing it. Second, the per-run allow-list is intersected with that effective set, so a run can only ever narrow what is globally enabled: a globally-disabled tool can never be called, and no allow-list can re-enable one.
  • The table is deliberately TCA-less: it is a small operational toggle set keyed by tool_name, not a versioned/localisable record, so a bespoke toggle endpoint is a better fit than FormEngine (and avoids exposing an editable "tool" record that implies more than a boolean).
  • Because a missing row falls back to the tool default, shipping a new tool needs no data migration: its isEnabledByDefault() applies until an admin overrides it.
  • Reads go through ToolAvailabilityService on every agent run; the override map is a single small query, cheap relative to the LLM calls it gates.
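
The sparse-table behaviour described above (upsert one override, fall back to the default for a missing row) can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: the column types, the SQLite upsert syntax, the DEFAULTS map, and the tool name new_tool are illustrative only, and the real ToolStateRepository presumably goes through TYPO3's database layer rather than raw SQL.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Assumed column types; the ADR only states "tool_name unique, enabled boolean".
    conn.execute(
        "CREATE TABLE tx_nrllm_tool_state ("
        " tool_name TEXT NOT NULL UNIQUE,"
        " enabled INTEGER NOT NULL)"
    )


def set_enabled(conn: sqlite3.Connection, name: str, enabled: bool) -> None:
    """Upsert one override: insert the row, or update it if the tool already has one."""
    conn.execute(
        "INSERT INTO tx_nrllm_tool_state (tool_name, enabled) VALUES (?, ?) "
        "ON CONFLICT(tool_name) DO UPDATE SET enabled = excluded.enabled",
        (name, int(enabled)),
    )


def overrides(conn: sqlite3.Connection) -> dict[str, bool]:
    """Return the sparse override map in a single small query."""
    rows = conn.execute("SELECT tool_name, enabled FROM tx_nrllm_tool_state")
    return {name: bool(flag) for name, flag in rows}


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Toggling the same tool repeatedly keeps exactly one row.
set_enabled(conn, "get_env", True)
set_enabled(conn, "get_env", False)

# Hypothetical code defaults; new_tool has just shipped and has no row yet.
DEFAULTS = {"get_env": True, "new_tool": True}
override_map = overrides(conn)
effective = {name: override_map.get(name, default) for name, default in DEFAULTS.items()}
```

Here new_tool resolves to its code default without any row being written, which is why shipping a tool needs no data migration, while get_env resolves to the stored override.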

Alternatives considered 

  • Reuse the per-run allow-list only — rejected: the allow-list is authored per skill/run and cannot express an instance-wide policy; a globally-forbidden tool would have to be scrubbed from every skill.
  • Flip isEnabledByDefault() in code — rejected: the default is a ship-time property of the tool, not per-instance operator policy, and changing it requires a release.
  • A TCA-backed tool record — rejected: tools are code-registered, not editable entities; a full record UI would imply create/delete/localise semantics that do not apply to a boolean override keyed by a code identifier.