Eval Gates: conditional rules that decide when to escalate
Eval gates combine deterministic and LLM-evaluated conditions to let an agent run autonomously on 95% of cases and escalate only the 5% that need a human.
An eval gate is a rule that decides whether the agent can proceed autonomously or must escalate to a human. It sits on top of Approval Workflows: approvals are the mechanism for pausing; eval gates are the conditions that trigger pauses.
Gates make the difference between an autonomous workflow that's safe-by-default and one that requires constant babysitting. With well-tuned gates, an agent can handle 95% of the cases autonomously and surface only the 5% that need a human.
The short version
- A gate is a rule with `when` (condition) and `then` (what to do)
- Conditions can be deterministic (regex, threshold, field-match) or LLM-evaluated ("does this email request payment?")
- Actions include: `escalate_to_approval`, `decline`, `mark_spam`, `tag_and_continue`
- Gates are workspace-scoped and inheritable from agency-level templates
- Versioned and auditable: every gate change leaves a trail
Why we modeled it this way
Three design constraints shaped this:
1. Most automation is safe; some isn't. An agent reading inbound link-building emails should be able to autonomously decline obvious spam, draft replies to friendly requests, and bring you a relationship update at week's end. But if someone offers $2,000 for a do-follow link, you want eyes on that. Static "approve everything" is too slow; "approve nothing" defeats the purpose. Gates split the cases.
2. The rules can't all be code. Half the rules we need are LLM-evaluable judgments ("is this person legit?", "does this offer feel scammy?", "is the partner's site a content farm?"). A rule language needs both deterministic checks AND LLM-based ones.
3. Different workspaces have different risk tolerances. A solo operator might auto-decline anything under DR 30; an enterprise compliance team might escalate anything that mentions payment. Same rule language; different defaults per workspace.
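Constraint 2 can be pictured as a single predicate signature that both kinds of check satisfy. The sketch below is illustrative only: `keyword_predicate`, `llm_predicate`, and the `judge` callable are hypothetical names, and `stub_judge` stands in for a real model call so the example runs offline.

```python
from typing import Callable

# A predicate takes the message text and answers yes/no.
Predicate = Callable[[str], bool]
# Hypothetical LLM interface: (question, message) -> yes/no.
Judge = Callable[[str, str], bool]


def keyword_predicate(keywords: list[str]) -> Predicate:
    """Deterministic check: true if any keyword appears literally in the message."""
    return lambda message: any(k in message for k in keywords)


def llm_predicate(question: str, judge: Judge) -> Predicate:
    """LLM-evaluated check: delegates the yes/no judgment to `judge`."""
    return lambda message: judge(question, message)


def stub_judge(question: str, message: str) -> bool:
    """Stand-in for a model call; a real judge would send both strings to an LLM."""
    return "sponsored" in message.lower()


# Both kinds of rule live in the same list and are called the same way.
checks: list[Predicate] = [
    keyword_predicate(["payment", "fee"]),
    llm_predicate("Is the sender asking for money?", stub_judge),
]

message = "We offer sponsored placements on our blog."
print([check(message) for check in checks])  # [False, True]
```

The keyword list misses this message because it never says "payment" or "fee", while the judgment-style check catches it. That gap is the reason a rule language benefits from having both.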
Rule shape
- id: gate_payment_request
  name: Payment request escalation
  description: Escalate inbound messages that mention payment
  applies_to: link_building.inbound
  enabled: true
  when:
    any:
      - contains:
          [
            "$",
            "USD",
            "EUR",
            "GBP",
            "payment",
            "fee",
            "rate card",
            "compensation",
          ]
      - llm_check: "Is the sender asking for money or implying a paid arrangement?"
  then:
    action: escalate_to_approval
    reason: Partner mentioned payment terms — manual review required
    tag: payment_request
Each gate has:
- `id`: stable identifier
- `applies_to`: which surface(s) the gate is evaluated against (`link_building.inbound`, `produce.publish`, `produce.evaluate`, `agent.invoke`, ...)
- `when`: the condition, composable with `any` / `all` / `none`
- `then`: what happens when it matches
- `enabled`: toggle without deleting
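To make the shape concrete, here is a minimal sketch of how a client might hold a parsed gate in memory. The `Gate` class and `parse_gate` helper are hypothetical, not part of any SDK, and the choice of which keys are required (`id`, `when`, `then`) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Gate:
    """In-memory form of one gate; field names mirror the YAML keys."""
    id: str
    when: dict[str, Any]
    then: dict[str, Any]
    applies_to: Optional[str] = None
    enabled: bool = True
    name: str = ""
    description: str = ""


def parse_gate(raw: dict[str, Any]) -> Gate:
    """Build a Gate from a parsed YAML/JSON mapping, rejecting unusable rules."""
    # Assumption: these three keys are the minimum a rule needs to be evaluated.
    missing = [key for key in ("id", "when", "then") if key not in raw]
    if missing:
        raise ValueError(f"gate is missing required keys: {missing}")
    if "action" not in raw["then"]:
        raise ValueError("'then' must name an action")
    return Gate(
        id=raw["id"],
        when=raw["when"],
        then=raw["then"],
        applies_to=raw.get("applies_to"),
        enabled=raw.get("enabled", True),
        name=raw.get("name", ""),
        description=raw.get("description", ""),
    )


gate = parse_gate({
    "id": "gate_payment_request",
    "applies_to": "link_building.inbound",
    "enabled": True,
    "when": {"any": [{"contains": ["$", "USD"]}]},
    "then": {"action": "escalate_to_approval", "tag": "payment_request"},
})
print(gate.id, gate.then["action"])  # gate_payment_request escalate_to_approval
```

Validating at load time means a malformed rule fails when it is saved rather than silently never matching.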
Available `when` predicates
| Predicate | Example | Notes |
|---|---|---|
| `contains` | `contains: ["$", "USD"]` | Literal substring match |
| `regex` | `regex: "(?i)\\bpayment\\b"` | Standard regex |
| `field_gt` / `field_lt` / `field_eq` | `field_gt: { senderDR: 50 }` | Threshold against an enriched field |
| `field_in` / `field_not_in` | `field_in: { senderStatus: ["DO_NOT_CONTACT"] }` | Enum match |
| `llm_check` | `llm_check: "Is this a phishing attempt?"` | LLM-evaluated yes/no |
| `llm_score` | `llm_score: { prompt: "How professional is the sender?", min: 0.7 }` | LLM-evaluated 0-1 score with threshold |
| `prior_event` | `prior_event: { type: REJECTED, withinDays: 90 }` | Looks back at the event log |
| `quiet_hours` | `quiet_hours: { start: "22:00", end: "06:00", timezone: "America/New_York" }` | Time of day check |
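The deterministic predicates are simple enough to sketch in a few lines each. The functions below are an illustration of plausible semantics, not the platform's implementation. Several details are assumptions: that a missing field fails a threshold check, that a quiet-hours window includes its start and excludes its end, and the `"ACTIVE"` status value. `field_lt`, `field_eq`, and `field_not_in` would follow the same pattern.

```python
import re
from datetime import datetime, time
from zoneinfo import ZoneInfo


def contains(text: str, needles: list[str]) -> bool:
    """Literal substring match: true if any needle occurs in the text."""
    return any(needle in text for needle in needles)


def regex(text: str, pattern: str) -> bool:
    """True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


def field_gt(fields: dict, spec: dict) -> bool:
    """True if every named field exceeds its threshold (missing fields fail)."""
    return all(name in fields and fields[name] > limit for name, limit in spec.items())


def field_in(fields: dict, spec: dict) -> bool:
    """True if every named field holds one of its listed values."""
    return all(fields.get(name) in allowed for name, allowed in spec.items())


def in_window(local: time, start: str, end: str) -> bool:
    """True if a local wall-clock time is inside [start, end), wrapping past midnight."""
    begin, finish = time.fromisoformat(start), time.fromisoformat(end)
    if begin <= finish:
        return begin <= local < finish
    return local >= begin or local < finish  # e.g. 22:00 -> 06:00


def quiet_hours(now: datetime, start: str, end: str, timezone: str) -> bool:
    """Convert a timezone-aware datetime to the gate's zone, then test the window."""
    return in_window(now.astimezone(ZoneInfo(timezone)).time(), start, end)


fields = {"senderDR": 42, "senderStatus": "ACTIVE"}
print(contains("Our fee is $200", ["$", "USD"]))               # True
print(regex("Payment due on receipt", r"(?i)\bpayment\b"))     # True
print(field_gt(fields, {"senderDR": 50}))                      # False
print(field_in(fields, {"senderStatus": ["DO_NOT_CONTACT"]}))  # False
print(in_window(time(23, 30), "22:00", "06:00"))               # True
```

The wrap-around branch in `in_window` matters for the table's own example: a `22:00` to `06:00` window has a start later than its end, so a plain `start <= t < end` comparison would never match.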
Available `then` actions
| Action | Effect |
|---|---|
| `escalate_to_approval` | Pause invocation at `WAITING_APPROVAL`, surface the preview to the approver |
| `decline` | Auto-decline with a templated response; logs the gate match for audit |
| `mark_spam` | Move to spam; no response sent |
| `tag_and_continue` | Add a tag to the resource and proceed (useful for soft-flag patterns) |
| `route_to` | Route the approval to a specific user / channel / webhook target |
| `notify` | Fire a webhook with the gate-match details without pausing |
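The key distinction in this table is between actions that stop the run and actions that let it continue. A rough sketch of that dispatch follows. The `Invocation` class and the `RUNNING`, `DECLINED`, and `SPAM` status names are assumptions made for illustration; only `WAITING_APPROVAL` comes from the table. `route_to` is left out because the table does not say whether it pauses on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Invocation:
    """Hypothetical record of one agent run being gated."""
    status: str = "RUNNING"
    tags: list[str] = field(default_factory=list)
    log: list[str] = field(default_factory=list)


def apply_action(inv: Invocation, then: dict) -> Invocation:
    """Apply a gate's `then` block to the invocation and return it."""
    action = then["action"]
    if "tag" in then:
        inv.tags.append(then["tag"])       # any action may carry a tag
    if action == "escalate_to_approval":
        inv.status = "WAITING_APPROVAL"    # pause until a human decides
    elif action == "decline":
        inv.status = "DECLINED"            # assumed status name
    elif action == "mark_spam":
        inv.status = "SPAM"                # assumed status name
    elif action in ("tag_and_continue", "notify"):
        pass                               # non-pausing: the run continues
    else:
        raise ValueError(f"unknown action: {action}")
    inv.log.append(action)                 # keep a trail of what fired
    return inv


inv = apply_action(
    Invocation(),
    {"action": "escalate_to_approval", "tag": "payment_request"},
)
print(inv.status, inv.tags)  # WAITING_APPROVAL ['payment_request']
```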
Composition
when:
  all:
    - field_gt: { senderDR: 30 }
    - any:
        - contains: ["guest post", "guest article"]
        - llm_check: "Is the sender pitching a guest post?"
    - none:
        - prior_event: { type: REJECTED, withinDays: 180 }
then:
  action: tag_and_continue
  tag: auto_drafted_guest_post
This means: "if the sender's DR is above 30, AND it looks like a guest-post pitch, AND we haven't rejected them in the last 180 days, then auto-draft a response and tag it."
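Nested `any` / `all` / `none` blocks map naturally onto a small recursive evaluator. The version below is a sketch under assumptions: each node is a mapping with exactly one key, and leaf predicates are resolved by a caller-supplied `leaf` function. The `message` dict, its `rejected_days_ago` field, and the stubbed `llm_check` are all hypothetical.

```python
from typing import Any, Callable

# Leaf evaluator: (predicate_name, argument) -> bool, supplied by the caller.
Leaf = Callable[[str, Any], bool]


def evaluate(condition: dict[str, Any], leaf: Leaf) -> bool:
    """Evaluate a `when` tree built from any/all/none nodes and leaf predicates."""
    if len(condition) != 1:
        raise ValueError("each condition node must have exactly one key")
    (key, value), = condition.items()
    if key == "all":
        return all(evaluate(child, leaf) for child in value)
    if key == "any":
        return any(evaluate(child, leaf) for child in value)
    if key == "none":
        return not any(evaluate(child, leaf) for child in value)
    return leaf(key, value)


message = {"senderDR": 45, "body": "Could we contribute a guest post?", "rejected_days_ago": None}


def leaf(name: str, arg: Any) -> bool:
    """Toy leaf evaluator bound to the sample message above."""
    if name == "field_gt":
        return all(message[f] > limit for f, limit in arg.items())
    if name == "contains":
        return any(s in message["body"] for s in arg)
    if name == "llm_check":
        return False  # stub: a real implementation would ask a model
    if name == "prior_event":
        days = message["rejected_days_ago"]
        return days is not None and days <= arg["withinDays"]
    raise ValueError(f"unknown predicate: {name}")


when = {"all": [
    {"field_gt": {"senderDR": 30}},
    {"any": [
        {"contains": ["guest post", "guest article"]},
        {"llm_check": "Is the sender pitching a guest post?"},
    ]},
    {"none": [{"prior_event": {"type": "REJECTED", "withinDays": 180}}]},
]}
print(evaluate(when, leaf))  # True
```

Because Python's `any` and `all` stop at the first decisive child, this sketch never reaches the `llm_check` once `contains` has matched. Listing cheap deterministic checks before LLM-evaluated ones would save model calls in an evaluator built this way. Whether the production engine short-circuits is not stated here.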
How gates interact with other concepts
| Concept | Relationship |
|---|---|
| Approval Workflows | Gates are the conditional layer above approvals; the escalate_to_approval action raises an Approval |
| Link Building · inbound | The heaviest user of gates — every inbound message is evaluated |
| Production · publish | Gates can block publishing if word count below threshold, missing primary keyword, evaluation score below floor |
| Agent | Gates run as part of the agent invocation lifecycle |
CRUD: managing gates
# List all gates for the workspace
curl -G .../v1/link-building/inbound/eval-gate
# Create a new gate (YAML or JSON)
curl -X POST .../v1/link-building/inbound/eval-gate \
  -d '{
    "name": "Low-DR auto-decline",
    "applies_to": "link_building.inbound",
    "enabled": true,
    "when": { "field_lt": { "senderDR": 20 } },
    "then": { "action": "decline", "reason": "Site quality below threshold" }
  }'
# Update
curl -X PATCH .../v1/link-building/inbound/eval-gate/gate_*** \
  -d '{ "enabled": false }'
# Delete
curl -X DELETE .../v1/link-building/inbound/eval-gate/gate_***
# Test a gate against a historical message
curl -X POST .../v1/link-building/inbound/eval-gate/gate_***/test \
  -d '{ "messageId": "msg_***" }'
Common patterns
1. Sensible defaults for link-building inbound
- id: auto_decline_low_dr
  when: { field_lt: { senderDR: 20 } }
  then: { action: decline }
- id: escalate_payment_requests
  when: { llm_check: "Is the sender asking for money?" }
  then: { action: escalate_to_approval }
- id: auto_draft_guest_posts_from_friends
  when:
    all:
      - field_gt: { senderDR: 30 }
      - prior_event: { type: LINK_PLACED, anytime: true }
  then: { action: tag_and_continue, tag: friend_request }
- id: mark_obvious_spam
  when: { llm_check: "Does this look like template-based spam?" }
  then: { action: mark_spam }
2. Publishing safety net
- id: block_low_quality_publish
  applies_to: produce.publish
  when:
    any:
      - field_lt: { wordCount: 600 }
      - field_lt: { evaluationScore: 70 }
      - field_eq: { hasPrimaryKeyword: false }
  then: { action: escalate_to_approval, reason: "Quality threshold not met" }
3. Quiet hours
- id: no_late_night_outreach
  applies_to: link_building.campaign.send_email
  when:
    quiet_hours: { start: "20:00", end: "07:00", timezone: "America/New_York" }
  then: { action: escalate_to_approval, reason: "Outside business hours" }
4. Agency-level defaults
# Define gates at the agency account level; workspaces inherit
curl -X POST .../v1/agency/eval-gate -d '{ ... }'
# Workspaces can override or disable inherited gates
curl -X PATCH .../v1/link-building/inbound/eval-gate/inherited_*** \
  -d '{ "enabled": false }'
5. A/B testing a gate
Set `mode: shadow` on a gate and it still runs but does not act; it only logs what it would have done. This is useful when you are tuning a rule before letting it actually block work.
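The behavior of shadow mode can be sketched as a thin wrapper around action execution. Everything here is illustrative: `run_gate`, the `act` callback, and the `shadow` flag in the log entry are hypothetical names, and treating a gate without a `mode` key as enforcing is an assumption.

```python
from typing import Callable


def run_gate(gate: dict, matched: bool, act: Callable[[dict], None], log: list[dict]) -> bool:
    """Log a matching gate's verdict; perform the action only outside shadow mode.

    Returns True if the action was actually carried out.
    """
    if not matched:
        return False
    shadow = gate.get("mode") == "shadow"
    log.append({"gateId": gate["id"], "action": gate["then"]["action"], "shadow": shadow})
    if shadow:
        return False  # record what would have happened, but do nothing
    act(gate["then"])
    return True


log: list[dict] = []
performed: list[dict] = []
gate = {"id": "auto_decline_low_dr", "mode": "shadow", "then": {"action": "decline"}}

print(run_gate(gate, True, performed.append, log))  # False
print(performed)                                    # []
print(log)  # [{'gateId': 'auto_decline_low_dr', 'action': 'decline', 'shadow': True}]
```

Comparing the shadow entries in the log against what a human actually did with the same messages gives a rough false-positive rate before the gate is switched to enforcing.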
Auditing
Every gate match is logged with `evaluatedAt`, `gateId`, `gateVersion`, `inputs` (the fields the gate evaluated), `outcome`, and `action`. View the log in the dashboard or query it via:
curl -G .../v1/audit/eval-gates --data-urlencode "gateId=gate_***"
Related
- Concept: Approval Workflows
- Concept: Agent
- API: Link Building · inbound (heavy user)
- API: Production · publish
- Playbook: Auto-respond to link requests
- Playbook: Client-approval gated publishing