hermes-agent/agent
sasquatch9818 020d263ef6 fix(agent): defang untrusted-tool-result delimiter against tag injection
`_maybe_wrap_untrusted` is the architectural defense against indirect
prompt injection. It wraps attacker-controllable tool output
(web_extract, web_search, browser_*, mcp_*) in
`<untrusted_tool_result>...</untrusted_tool_result>` so the model treats
it as data. The content was interpolated verbatim, so the boundary was
forgeable.

Two holes. A poisoned page that embeds `</untrusted_tool_result>` closes
the block early — everything after it reads as trusted instructions. And
the `startswith("<untrusted_tool_result")` re-entrancy guard returned
content that merely started with the opening tag completely unwrapped, so
an attacker just prefixed the tag to drop all data framing.

Fix neutralizes any embedded delimiter token (case-insensitive) before
interpolation and drops the forgeable fast-path, so content is always
sealed in exactly one well-formed block. Re-wrapping an already-wrapped
forward is harmless — it stays framed as data.

## What does this PR do?

Closes an indirect prompt-injection bypass in the untrusted-tool-result
wrapper. Attacker content can no longer break out of, or forge, the
trust boundary.

## Related Issue

N/A

## Type of Change

- [x] 🔒 Security fix

## Changes Made

- `agent/tool_dispatch_helpers.py`: add `_neutralize_delimiters` (case-insensitive defang of the `untrusted_tool_result` token); `_maybe_wrap_untrusted` now always neutralizes then wraps, and the forgeable `startswith` re-entrancy guard is removed.
- `tests/agent/test_tool_dispatch_helpers.py`: replace the double-wrap test (it encoded the bypass) with regression tests for embedded closing tag, leading opening tag, and a cased closing tag.

## How to Test

1. `scripts/run_tests.sh tests/agent/test_tool_dispatch_helpers.py` — 29 pass.
2. Embedded `</untrusted_tool_result>` mid-content: real closing delimiter appears once, at the end; payload trapped inside.
3. Content starting with the opening tag: data framing is applied, not skipped.

## Checklist

### Code

- [x] I've read the Contributing Guide
- [x] My commit messages follow Conventional Commits
- [x] I searched for existing PRs to make sure this isn't a duplicate
- [x] My PR contains only changes related to this fix
- [x] I've run the affected tests and they pass
- [x] I've added tests for my changes
- [x] I've tested on my platform: macOS 15 (Darwin 25.5)

### Documentation & Housekeeping

- [x] I've updated relevant documentation (docstrings) — or N/A
- [x] cli-config.yaml.example — N/A
- [x] CONTRIBUTING.md / AGENTS.md — N/A
- [x] Cross-platform impact — N/A (pure-Python, stdlib `re`)
- [x] Tool descriptions/schemas — N/A
2026-07-01 01:54:45 -07:00
..
lsp feat(lsp): add PowerShellEditorServices language server (#55930) 2026-06-30 16:22:18 -07:00
pet fix(pet): snap kitty frames to whole cells 2026-06-30 15:41:44 -05:00
secret_sources fix: prevent TUI gateway stdin EOF crash across all TUI-context subprocess calls 2026-06-08 22:46:57 -07:00
transports fix(agent): guard against non-dict model_extra in tool call normalization 2026-06-30 03:27:12 -07:00
__init__.py fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
account_usage.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
agent_init.py revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) 2026-07-01 00:20:32 -07:00
agent_runtime_helpers.py revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) 2026-07-01 00:20:32 -07:00
anthropic_adapter.py fix(anthropic): stop SDK auto-retry double-firing and raise Retry-After cap to 600s 2026-06-27 19:23:15 -07:00
async_utils.py fix(async): close unscheduled coroutines in all threadsafe bridges (#26584) 2026-05-15 14:00:01 -07:00
auxiliary_client.py fix(runtime): honor NOUS_INFERENCE_BASE_URL across pool/explicit/aux paths 2026-07-01 01:52:06 -07:00
azure_identity_adapter.py feat(azure-foundry): add Microsoft Entra ID auth 2026-05-18 10:14:38 -07:00
background_review.py fix(bg-review): scope stdout/stderr silencing to the worker thread (#55966) 2026-06-30 17:28:33 -07:00
bedrock_adapter.py fix(bedrock): check boto3 version >= 1.34.59 before using converse_stream 2026-06-15 05:25:17 -07:00
billing_view.py feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449) 2026-06-19 01:53:32 +05:30
browser_provider.py fix(browser): self-review pass — dead-import, log levels, future-proofing 2026-05-17 04:04:15 -07:00
browser_registry.py style: restore PEP8 blank-line separation after dead-code removal 2026-05-29 04:22:27 -07:00
chat_completion_helpers.py fix(anthropic+feishu): model-gate max_tokens fallback; wire Feishu channel_prompt 2026-06-30 17:20:41 -07:00
codex_responses_adapter.py fix(xai): OAuth Responses native web_search, incomplete guard, grok-composer context 2026-06-17 17:33:32 -07:00
codex_runtime.py fix(codex): seed app-server sessions with configured cwd 2026-06-21 16:39:02 -07:00
coding_context.py feat(agent): add configurable coding_instructions 2026-06-30 00:59:59 -05:00
context_breakdown.py feat(desktop): add context usage breakdown popover 2026-06-29 09:18:10 -04:00
context_compressor.py fix(compressor): pin summary role to user when only system prompt is protected (#52160) 2026-07-01 14:24:41 +05:30
context_engine.py fix(context): clamp -1 post-compression sentinel in sibling status paths 2026-07-01 13:36:50 +05:30
context_references.py perf(context-refs): expand @-references concurrently 2026-06-30 00:19:49 -07:00
conversation_compression.py fix(agent): make compression lock-lease refresher tolerate transient DB blips 2026-06-30 13:36:29 +05:30
conversation_loop.py feat(classifier): Anthropic-specific guidance for subscription exhaustion 2026-07-01 01:36:34 -07:00
copilot_acp_client.py fix(agent): stream copilot ACP chat completions 2026-06-28 22:52:51 -07:00
credential_persistence.py fix: avoid persisting borrowed credential secrets (#31416) 2026-05-25 00:32:08 -07:00
credential_pool.py test(credential_pool): cover Anthropic env auth_type classification 2026-06-30 17:29:03 -07:00
credential_sources.py docs(auth): replace stale 'hermes login' references with 'hermes auth add' 2026-05-26 15:41:11 -07:00
credits_tracker.py feat(billing): /credits command — balance + portal top-up handoff (#44776) 2026-06-12 08:51:10 +00:00
curator.py fix(curator): never archive cron-referenced skills + floor use=0 pruning (#54443) 2026-06-28 15:10:21 -07:00
curator_backup.py fix(curator): stop the rollback safety snapshot from pruning its target 2026-06-17 05:40:05 -07:00
display.py feat(display): friendly human-phrased tool labels for built-in tools (#55166) 2026-06-29 20:31:17 -07:00
error_classifier.py fix(classifier): treat Anthropic "out of extra usage" 400 as billing 2026-07-01 01:36:34 -07:00
errors.py fix(agent,gateway,doctor): add SSL CA cert bundle fail-fast guard 2026-06-13 21:14:32 -07:00
file_safety.py fix(file): block credential paths from search results 2026-07-01 01:02:35 -07:00
gemini_native_adapter.py Merge consecutive same-role contents for native Gemini 2026-06-30 11:51:22 -07:00
gemini_schema.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
i18n.py fix(packaging): ship locales/ i18n catalogs in wheel, sdist, and Nix (#38383) 2026-06-03 12:00:27 -07:00
image_gen_provider.py feat(image-gen): add image-to-image / editing to image_generate (#48705) 2026-06-18 22:13:07 -07:00
image_gen_registry.py fix(plugins): filter resolution by is_available() in web + image_gen registries 2026-05-13 22:31:28 -07:00
image_routing.py fix(vision): detect Ollama vision models via /api/show (#54511) 2026-06-28 22:52:59 -07:00
insights.py refactor(insights): drop dead pricing/duration wrappers, call usage_pricing directly (#40618) 2026-06-07 18:33:20 -07:00
iteration_budget.py refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget 2026-05-16 17:59:32 -07:00
jiter_preload.py fix(agent): preload jiter native parser 2026-05-28 00:20:11 -07:00
learn_prompt.py fix(learn): honor requirements mixed with sources in /learn requests (#55956) 2026-06-30 16:56:01 -07:00
learning_graph.py fix(desktop): scope memory graph cache by profile 2026-06-30 03:44:41 -05:00
learning_graph_render.py fix(journey): swap skill/memory inks so drillable rows read as clickable 2026-06-30 11:54:16 -05:00
learning_mutations.py refactor(journey): route memory mutations through MemoryStore atomic I/O 2026-06-30 15:16:21 -05:00
lmstudio_reasoning.py feat(agent): add lmstudio integration 2026-04-28 12:27:36 -07:00
manual_compression_feedback.py fix(compression): include system prompt + tool schemas in token estimates (#18265) 2026-04-30 23:03:54 -07:00
markdown_tables.py fix(cli): vertical fallback for markdown tables wider than terminal (#23948) 2026-05-11 16:49:13 -07:00
memory_manager.py fix(agent): validate context/memory tool schemas before wrapping 2026-06-25 02:17:29 +05:30
memory_provider.py fix(backup): capture memory-provider state stored outside HERMES_HOME (#50325) 2026-06-21 12:03:46 -07:00
message_content.py fix(openviking): preserve structured sync attribution 2026-06-19 15:23:41 +08:00
message_sanitization.py fix(agent): close tool-call sequence on all interrupt aborts, not just finalize_turn 2026-06-25 12:24:34 -05:00
moa_loop.py feat(moa): opt-in full-turn trace persistence to JSONL (#56101) 2026-07-01 00:09:42 -07:00
moa_trace.py feat(moa): opt-in full-turn trace persistence to JSONL (#56101) 2026-07-01 00:09:42 -07:00
model_metadata.py fix(copilot): recognize enterprise subdomains in host checks 2026-06-30 03:27:41 -07:00
models_dev.py remove Vercel AI Gateway and Vercel Sandbox (#33067) 2026-05-27 00:43:32 -07:00
moonshot_schema.py fix(moonshot): handle union type arrays in tool schemas 2026-06-13 05:51:41 -07:00
nous_rate_guard.py codebase: add encoding='utf-8' to all bare open() calls (PLW1514) 2026-05-08 14:27:40 -07:00
onboarding.py feat(onboarding): opt-in structured profile-build path on first contact (#41114) 2026-06-07 08:36:48 -07:00
oneshot.py feat(agent): one-shot LLM helper + llm.oneshot gateway RPC (#51261) 2026-06-23 08:01:50 +00:00
plugin_llm.py feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194) 2026-05-10 07:09:28 -07:00
portal_tags.py feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779) 2026-05-12 20:49:20 -07:00
process_bootstrap.py fix(auxiliary): use env-only proxy policy for OpenAI SDK clients (#53702) 2026-06-27 21:22:49 -07:00
prompt_builder.py fix(agent): limit .hermes.md parent walk to git repos only 2026-06-28 20:46:32 -07:00
prompt_caching.py fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778) 2026-05-12 20:46:04 -07:00
rate_limit_tracker.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
reasoning_timeouts.py fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice 2026-06-25 19:00:48 -07:00
redact.py fix(browser): close remaining CDP-URL leak paths in supervisor (review) 2026-07-01 13:43:58 +05:30
replay_cleanup.py fix(tui): sanitize replay history on WebUI/TUI session resume (#29086) (#53939) 2026-06-27 20:56:49 -07:00
retry_utils.py fix: handle named custom providers and Z.AI overload retries 2026-06-25 00:17:17 -07:00
runtime_cwd.py fix(desktop): stabilize project folder sessions (#37586) 2026-06-02 20:23:09 +00:00
secret_scope.py feat(gateway): multiplex phase 2 — fail-closed profile credential isolation (Workstream A) 2026-06-19 07:34:15 -07:00
shell_hooks.py feat(agent): add pre_verify hook and verify-on-stop coding guidance 2026-06-30 00:59:29 -05:00
skill_bundles.py feat(skills): add skill bundles — alias /<name> loads multiple skills (#28373) 2026-05-18 21:38:05 -07:00
skill_commands.py fix(memory): strip skill scaffolding for all providers, not just openviking 2026-06-16 10:37:37 -07:00
skill_preprocessing.py fix(windows): hide console-window flash on backend git/gh/wmic/bash subprocess spawns 2026-06-28 05:28:45 -07:00
skill_utils.py fix(curator): protect external skills from background curation 2026-06-25 22:03:02 -07:00
ssl_guard.py fix(ssl): align guard docs and escape hatch 2026-06-13 21:14:32 -07:00
stream_diag.py feat(agent): buffer retry/fallback status, surface only on terminal failure (#33816) 2026-05-28 04:53:27 -07:00
subdirectory_hints.py fix(subdirectory_hints): prevent loading AGENTS.md outside workspace 2026-05-25 23:17:33 -07:00
system_prompt.py feat(computer_use): cross-platform cua-driver (macOS/Windows/Linux) 2026-06-22 06:42:30 -07:00
think_scrubber.py fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184) 2026-05-05 04:33:38 -07:00
thinking_timeout_guidance.py fix(agent): detect thinking-timeout for reasoning models and surface actionable guidance instead of misleading file-write advice 2026-06-25 19:00:48 -07:00
thread_scoped_output.py fix(bg-review): scope stdout/stderr silencing to the worker thread (#55966) 2026-06-30 17:28:33 -07:00
title_generator.py feat(titles): support language-aware title generation (#45296) 2026-06-19 17:15:52 -07:00
tool_dispatch_helpers.py fix(agent): defang untrusted-tool-result delimiter against tag injection 2026-07-01 01:54:45 -07:00
tool_executor.py feat(display): friendly human-phrased tool labels for built-in tools (#55166) 2026-06-29 20:31:17 -07:00
tool_guardrails.py fix: add recovery hints to loop guard warnings 2026-05-19 00:12:12 -07:00
tool_result_classification.py fix: classify landed file mutations with diagnostics 2026-05-13 06:46:23 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
transcription_provider.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
transcription_registry.py feat(stt): add register_transcription_provider() plugin hook 2026-05-25 01:41:19 -07:00
tts_provider.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
tts_registry.py feat(tts): add register_tts_provider() plugin hook (closes #30398) 2026-05-24 18:04:54 -07:00
turn_context.py fix(memory): degrade gracefully after repeated at-capacity consolidation failures (#42405) 2026-06-30 20:01:16 +05:30
turn_finalizer.py fix(agent,gateway): surface partial-stream recovery and bound detached restart 2026-06-27 22:03:14 -07:00
turn_retry_state.py fix(agent): route content-filter stream stalls to fallback chain (#32421) 2026-06-28 01:15:21 -07:00
usage_pricing.py fix(moa): count reference (advisor) fan-out token usage + cost (#56087) 2026-06-30 23:08:37 -07:00
verification_evidence.py feat(agent): recognize focused ad-hoc verification scripts 2026-06-24 23:03:45 -05:00
verification_stop.py feat(agent): restore surface-aware "auto" default for verify_on_stop 2026-06-30 01:43:08 -05:00
verify_hooks.py feat(agent): add pre_verify hook and verify-on-stop coding guidance 2026-06-30 00:59:29 -05:00
video_gen_provider.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
video_gen_registry.py feat(video_gen): unified video_generate tool with pluggable provider backends (#25126) 2026-05-13 16:39:41 -07:00
web_search_provider.py chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00
web_search_registry.py chore(web): remove web_crawl tool + provider crawl plumbing (#33824) 2026-05-28 04:52:42 -07:00