Package
agent-os-kernel
Description
A pluggable policy backend (ADR-0015) has no way to say "I failed, and that
failure should bind a deny." PolicyEvaluator skips any BackendDecision
whose error is set, and when every backend errors, evaluation falls through
to the configured default — which can be allow. The result: the fail-closed
denies the bundled backends carefully construct are discarded by the evaluator
that consults them.
Both consultation sites (_evaluate_flat and _evaluate_rules in
agent-governance-python/agent-os/src/agent_os/policies/evaluator.py):
for backend in self._backends:
    result = backend.evaluate(context)
    if result.error is None:  # <- error set => decision skipped entirely
        return PolicyDecision(...)
# No rule matched — apply defaults
default_action = PolicyAction.ALLOW
if self.policies:
    default_action = self.policies[0].defaults.action
The bundled backends fail closed on the backend side — and set error while
doing so, so the evaluator un-closes them:
- OPABackend._evaluate_cli on timeout: BackendDecision(allowed=False, action="deny", reason="OPA eval timed out", error="timeout")
- OPABackend._evaluate_remote on a transport error: allowed=False, action="deny", error=str(e)
- CedarBackend on unrecognised CLI output: allowed=False, action="deny", error="unrecognised cedar CLI output"
Each of those is a deliberate deny (_is_strict_true, the missing-result
fail-closed paths, the first-line Cedar token parse all show real care here),
but because error is set the row cannot bind, and a single unreachable OPA
server turns a deny-posture backend into a no-op.
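To make the effect easy to see without a running OPA server, here is a minimal, self-contained model of the "first non-error result" rule. It is a sketch only: the BackendDecision dataclass, the evaluate function, and the unreachable_opa stub are stand-ins written for this issue, not the real agent_os classes, and they carry only the fields quoted above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BackendDecision:
    """Stand-in for the real BackendDecision; only the fields quoted above."""
    allowed: bool
    action: str
    reason: str = ""
    error: Optional[str] = None


def evaluate(backends: list[Callable[[dict], BackendDecision]],
             context: dict,
             default_allowed: bool = True) -> tuple[bool, str]:
    """Model of the 'first non-error result' rule (assumed shape)."""
    for backend in backends:
        result = backend(context)
        if result.error is None:  # error set => decision skipped entirely
            return result.allowed, result.reason
    # Every backend errored (or none is registered): the default binds.
    return default_allowed, "No rules matched; default action applied"


def unreachable_opa(context: dict) -> BackendDecision:
    # Mirrors the shape of the timeout row listed above.
    return BackendDecision(allowed=False, action="deny",
                           reason="OPA eval timed out", error="timeout")


allowed, reason = evaluate([unreachable_opa], {"tool_name": "file_delete"})
print(allowed)  # True
print(reason)   # No rules matched; default action applied
```

The stub returns a deny, yet the model allows the call, because the deny never reaches the return statement inside the loop.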
The contract also conflates two backend states that deserve different handling:
- Abstain — "I have no opinion on this context." Skipping is exactly right.
- Failure — "I tried to evaluate and could not." The rest of the
toolkit's posture (Verdict::runtime_error() → Deny in the ACS core; the
evaluator's own except → fail-closed deny) says the safe answer here is
deny.
Today both must travel through error, and both get the abstain treatment.
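The two states can be told apart once the decision carries an explicit outcome. The sketch below is purely illustrative: Outcome, SketchDecision, and binds are hypothetical names and do not exist in the current BackendDecision schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    """Hypothetical outcome tag; not part of the current schema."""
    DECIDED = "decided"
    ABSTAIN = "abstain"
    FAILURE = "failure"


@dataclass
class SketchDecision:
    allowed: bool
    outcome: Outcome = Outcome.DECIDED
    error: Optional[str] = None


def binds(decision: SketchDecision) -> Optional[bool]:
    """Return the binding verdict, or None when the evaluator should move on."""
    if decision.outcome is Outcome.ABSTAIN:
        return None   # no opinion: skipping is exactly right
    if decision.outcome is Outcome.FAILURE:
        return False  # tried and could not: fail closed
    return decision.allowed


abstained = SketchDecision(allowed=False, outcome=Outcome.ABSTAIN)
failed = SketchDecision(allowed=False, outcome=Outcome.FAILURE, error="timeout")

print(binds(abstained))                     # None
print(binds(failed))                        # False
print(binds(SketchDecision(allowed=True)))  # True
```

With a single error field, the first two decisions look identical to the evaluator; with the tag, only the abstention is skipped.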
Possible directions (happy to PR whichever shape you prefer — ADR-0015
documents the current "first non-error result" rule, so this felt like an
issue to discuss before any code):
- Per-registration severity: add_backend(backend, on_error="skip") with "skip" | "deny". Evaluator-side only, no BackendDecision schema change; the default preserves current semantics exactly.
- Decision-carried channel: an explicit field on BackendDecision so a backend can distinguish abstain from fail-closed-deny itself; the default keeps today's behavior.
- At minimum, a docs/ADR note that backend error means abstain, so backend authors stop constructing deny decisions that cannot bind (the bundled backends' error-path action="deny" rows would then want revisiting).
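A rough sketch of how the per-registration option could behave. SketchEvaluator and failing_backend are hypothetical stand-ins, not the real PolicyEvaluator or OPABackend, and the on_error parameter is the proposal itself rather than an existing argument.

```python
from dataclasses import dataclass
from typing import Callable, Literal, Optional

OnError = Literal["skip", "deny"]


@dataclass
class BackendDecision:
    """Stand-in carrying only the fields quoted in this issue."""
    allowed: bool
    action: str
    reason: str = ""
    error: Optional[str] = None


Backend = Callable[[dict], BackendDecision]


class SketchEvaluator:
    """Hypothetical evaluator; not the real PolicyEvaluator."""

    def __init__(self, default_allowed: bool = True) -> None:
        self._default_allowed = default_allowed
        self._backends: list[tuple[Backend, OnError]] = []

    def add_backend(self, backend: Backend, on_error: OnError = "skip") -> None:
        if on_error not in ("skip", "deny"):
            raise ValueError(f"on_error must be 'skip' or 'deny', got {on_error!r}")
        self._backends.append((backend, on_error))

    def evaluate(self, context: dict) -> tuple[bool, str]:
        for backend, on_error in self._backends:
            result = backend(context)
            if result.error is None:
                return result.allowed, result.reason
            if on_error == "deny":
                # Failure binds as a deny instead of being skipped.
                return False, f"backend error: {result.error}"
            # on_error == "skip": current behaviour, try the next backend.
        return self._default_allowed, "No rules matched; default action applied"


def failing_backend(context: dict) -> BackendDecision:
    return BackendDecision(allowed=False, action="deny",
                           reason="OPA eval timed out", error="timeout")


legacy = SketchEvaluator()
legacy.add_backend(failing_backend)  # default "skip"
strict = SketchEvaluator()
strict.add_backend(failing_backend, on_error="deny")

print(legacy.evaluate({"tool_name": "file_delete"})[0])  # True
print(strict.evaluate({"tool_name": "file_delete"})[0])  # False
```

Because "skip" is the default, existing registrations keep their current behaviour, and only callers that opt in get the binding deny.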
Context: I hit this building an external ExternalPolicyBackend adapter for a
third-party policy engine (dos-kernel — a deterministic verdict layer). The
adapter deliberately uses the skip channel for honest abstention, which is
how the conflation surfaced: abstention and failure are indistinguishable in
the seat.
Steps to Reproduce
from agent_os.policies import PolicyEvaluator
from agent_os.policies.backends import OPABackend
ev = PolicyEvaluator() # no policies loaded -> default ALLOW
# A deny-posture backend whose OPA server is unreachable:
ev.add_backend(OPABackend(mode="remote", opa_url="https://opa.invalid:8181"))
decision = ev.evaluate({"tool_name": "file_delete"})
print(decision.allowed) # True
print(decision.reason) # "No rules matched; default action applied"
The backend returned allowed=False, action="deny", reason="OPA server error: ...", error="..." — a fail-closed deny — but the
error guard skips it and the default allow binds instead.
Environment
- agent_os_kernel 3.7.0 (PyPI wheel): agent_os/policies/evaluator.py, same guard at both consultation sites
- Also confirmed on a current clone (8fd0b61)
- Python 3.13, Windows 11