Open standard · Neutral benchmark
The open standard for AI agent safety & security
One protocol for every agent, every sandbox, every LLM. One benchmark that ranks every vendor. What OpenTelemetry did for observability, OpenGuardrails does for agent safety & security.
Apache-2.0 · Foundation-neutral governance · Detectors compete, you compose
The mess today
Securing an agent is an N×M×L×S integration problem — every agent × every detector × every LLM protocol × every sandbox, wired pairwise. Pick a vendor and you're locked in; switch and you re-integrate everything.
With OpenGuardrails
Collapses to N+M+L+S. Integrate once against the contract. Compose any vendors with deny-wins or quorum. Switch freely. One config across every agent you run.
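To make the arithmetic and the two composition modes concrete, here is a minimal sketch. It assumes each detector returns a simple allow/deny decision; the `Decision` enum and the `deny_wins`, `quorum`, and `integration_counts` helpers are hypothetical names for illustration and are not taken from the OpenGuardrails spec.

```python
from enum import Enum
from typing import Sequence, Tuple


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


def deny_wins(decisions: Sequence[Decision]) -> Decision:
    """Deny if any single detector denies."""
    if any(d is Decision.DENY for d in decisions):
        return Decision.DENY
    return Decision.ALLOW


def quorum(decisions: Sequence[Decision], threshold: int) -> Decision:
    """Deny only when at least `threshold` detectors deny."""
    denies = sum(1 for d in decisions if d is Decision.DENY)
    return Decision.DENY if denies >= threshold else Decision.ALLOW


def integration_counts(n: int, m: int, l: int, s: int) -> Tuple[int, int]:
    """Return (pairwise integrations, integrations against one contract)."""
    return n * m * l * s, n + m + l + s
```

With 3 agents, 4 detectors, 2 LLM protocols, and 2 sandboxes, `integration_counts(3, 4, 2, 2)` gives 48 pairwise integrations against 11 contract integrations. Deny-wins is the stricter choice; quorum trades some strictness for fewer false blocks when one detector is noisy.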
How it works
Standardize the boundary, not the brains
Three altitudes, one decision
Gateway (messages, MCP, skills, tools), agent hook, and sandbox (real exec/network/files) observe one action — correlated by guard_id.
Provenance-first
Trust labels travel with the action, so OpenGuardrails (OGR) catches the dangerous combination (untrusted input → privileged action), not just bad strings.
Safety and security
Harmful content judged at the I/O boundary; system compromise judged on actions and data flow, compilable into the sandbox.
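The correlation and provenance ideas above can be sketched together. This is an illustration only, not the OpenGuardrails schema: the `Observation` fields, the `trust` and `privileged` labels, and the `risky_actions` helper are all assumptions made for this example. It groups observations from the three altitudes by `guard_id`, then flags any action where untrusted input reaches a privileged operation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Observation:
    guard_id: str     # shared id that ties the altitudes together
    altitude: str     # "gateway", "hook", or "sandbox"
    trust: str        # provenance label: "trusted" or "untrusted"
    privileged: bool  # True if this observation is a privileged operation


def correlate(observations: Iterable[Observation]) -> Dict[str, List[Observation]]:
    """Group observations of the same action by guard_id."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.guard_id].append(obs)
    return dict(grouped)


def risky_actions(observations: Iterable[Observation]) -> List[str]:
    """Return guard_ids where untrusted input and a privileged operation co-occur."""
    risky = []
    for guard_id, group in correlate(observations).items():
        has_untrusted = any(o.trust == "untrusted" for o in group)
        has_privileged = any(o.privileged for o in group)
        if has_untrusted and has_privileged:
            risky.append(guard_id)
    return sorted(risky)


# Hypothetical sample: g-1 starts from untrusted input, g-2 does not.
events = [
    Observation("g-1", "gateway", "untrusted", False),
    Observation("g-1", "sandbox", "trusted", True),
    Observation("g-2", "gateway", "trusted", False),
    Observation("g-2", "sandbox", "trusted", True),
]
```

Here `risky_actions(events)` returns `["g-1"]`: neither observation of `g-1` looks dangerous on its own, but the correlated pair does.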
The neutral benchmark · seed-v0
We don't compete. We referee.
A vendor's score is meaningless until it's measured on common data by a common harness. We run that harness. Submit a conformant detector — config or model — and appear on the board. Numbers below are real outputs of reference detectors on the seed suite; we never fabricate a vendor's score.
| Detector | Type | Injection | Malicious-cmd | Exfil | Secret-leak | Macro F1 |
|---|---|---|---|---|---|---|
| keyword-baseline | baseline | 0.400 | 0.800 | 0.667 | 0.667 | 0.634 |
| ogr-compose (config⊕llm) | hybrid | 0.889 | 0.667 | 0.545 | 0.400 | 0.625 |
| block-all | baseline | 0.625 | 0.625 | 0.571 | 0.571 | 0.598 |
| config-rules | config | 0.333 | 0.667 | 0.400 | 0.400 | 0.450 |
| llm-judge (provenance-aware) | model | 0.889 | 0.333 | 0.400 | 0.000 | 0.406 |
| allow-all | baseline | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| LlamaGuard | model | — | — | — | — | — |
| Qwen3Guard | model | — | — | — | — | — |
Seed suite: injection 10 · malicious-command 10 · exfil 8 · secret-leak 8 · shared benign 12.
Reproduce: python3 harness/run.py
openguardrails-bench →
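The baseline rows can be checked by hand. The sketch below assumes that the 12 shared benign cases count as negatives in every category and that Macro F1 is the unweighted mean of the four category scores; both are assumptions about the harness, not statements from it. Under those assumptions, a detector that blocks everything has recall 1 and a precision equal to positives / (positives + 12).

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from raw counts; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Case counts from the seed suite description.
BENIGN = 12
POSITIVES = {"injection": 10, "malicious-cmd": 10, "exfil": 8, "secret-leak": 8}

# block-all flags every case: all positives are caught, all benign cases are false positives.
block_all = {name: f1(tp=n, fp=BENIGN, fn=0) for name, n in POSITIVES.items()}
macro_f1 = sum(block_all.values()) / len(block_all)

# allow-all flags nothing: every positive is a false negative.
allow_all = {name: f1(tp=0, fp=0, fn=n) for name, n in POSITIVES.items()}
```

Rounded to three places, this gives 0.625 for the two 10-case categories, 0.571 for the two 8-case categories, and a macro F1 of 0.598, which agrees with the block-all row.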
For agent & platform builders
Add one hook, get every vendor's coverage. Compose with deny-wins / quorum. One policy across all your agents.
Runnable PoC: Hermes agent + sandbox →
For security & safety vendors
Implement one method — evaluate(GuardEvent) → Verdict — and get ranked distribution to every agent. Compete on detection, not integration.
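As a rough picture of what implementing that one method might look like, here is a minimal sketch. Only the `evaluate(GuardEvent) → Verdict` shape comes from the text above; the fields on `GuardEvent` and `Verdict`, the `Detector` protocol, and the `KeywordDetector` class are assumptions for illustration, so consult the spec for the actual contract.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class GuardEvent:
    guard_id: str  # hypothetical field: correlation id for the action
    content: str   # hypothetical field: text or command under review


@dataclass(frozen=True)
class Verdict:
    allow: bool       # hypothetical field: True lets the action proceed
    reason: str = ""  # hypothetical field: short explanation for a denial


class Detector(Protocol):
    """Anything with a matching evaluate method satisfies this protocol."""

    def evaluate(self, event: GuardEvent) -> Verdict: ...


class KeywordDetector:
    """Toy detector that denies events containing a blocked substring."""

    def __init__(self, blocked: Iterable[str]) -> None:
        self.blocked = tuple(word.lower() for word in blocked)

    def evaluate(self, event: GuardEvent) -> Verdict:
        text = event.content.lower()
        for word in self.blocked:
            if word in text:
                return Verdict(allow=False, reason=f"matched keyword: {word}")
        return Verdict(allow=True)
```

A caller would construct `KeywordDetector(["rm -rf"])` and pass each `GuardEvent` to `evaluate`. Because `Detector` is a structural protocol, a vendor class does not need to inherit from anything to conform in this sketch.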
Read the spec →
Proof it runs