hermes-agent

History

sasquatch9818 020d263ef6 fix(agent): defang untrusted-tool-result delimiter against tag injection `_maybe_wrap_untrusted` is the architectural defense against indirect prompt injection. It wraps attacker-controllable tool output (web_extract, web_search, browser_, mcp_) in `<untrusted_tool_result>...</untrusted_tool_result>` so the model treats it as data. The content was interpolated verbatim, so the boundary was forgeable. Two holes. A poisoned page that embeds `</untrusted_tool_result>` closes the block early — everything after it reads as trusted instructions. And the `startswith("<untrusted_tool_result")` re-entrancy guard returned content that merely started with the opening tag completely unwrapped, so an attacker just prefixed the tag to drop all data framing. Fix neutralizes any embedded delimiter token (case-insensitive) before interpolation and drops the forgeable fast-path, so content is always sealed in exactly one well-formed block. Re-wrapping an already-wrapped forward is harmless — it stays framed as data. ## What does this PR do? Closes an indirect prompt-injection bypass in the untrusted-tool-result wrapper. Attacker content can no longer break out of, or forge, the trust boundary. ## Related Issue N/A ## Type of Change - [x] 🔒 Security fix ## Changes Made - `agent/tool_dispatch_helpers.py`: add `_neutralize_delimiters` (case-insensitive defang of the `untrusted_tool_result` token); `_maybe_wrap_untrusted` now always neutralizes then wraps, and the forgeable `startswith` re-entrancy guard is removed. - `tests/agent/test_tool_dispatch_helpers.py`: replace the double-wrap test (it encoded the bypass) with regression tests for embedded closing tag, leading opening tag, and a cased closing tag. ## How to Test 1. `scripts/run_tests.sh tests/agent/test_tool_dispatch_helpers.py` — 29 pass. 2. Embedded `</untrusted_tool_result>` mid-content: real closing delimiter appears once, at the end; payload trapped inside. 3. Content starting with the opening tag: data framing is applied, not skipped. ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix - [x] I've run the affected tests and they pass - [x] I've added tests for my changes - [x] I've tested on my platform: macOS 15 (Darwin 25.5) ### Documentation & Housekeeping - [x] I've updated relevant documentation (docstrings) — or N/A - [x] cli-config.yaml.example — N/A - [x] CONTRIBUTING.md / AGENTS.md — N/A - [x] Cross-platform impact — N/A (pure-Python, stdlib `re`) - [x] Tool descriptions/schemas — N/A		2026-07-01 01:54:45 -07:00
..
acp	fix: bound threat-pattern/FTS5 regex input and cover V4A Move-File edits	2026-07-01 01:05:28 -07:00
acp_adapter
agent	fix(agent): defang untrusted-tool-result delimiter against tag injection	2026-07-01 01:54:45 -07:00
ci	fix(ci): classify should default to no MCP	2026-06-23 10:32:27 -07:00
cli	fix(cli): route /sessions and /history through prompt_toolkit-safe printing	2026-07-01 01:25:43 -07:00
computer_use	feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842 )	2026-06-22 09:57:16 -07:00
cron	security(cron): fail closed in scheduler backstop when validator errors	2026-07-01 14:23:01 +05:30
docker	fix(s6): dot-prefix gateway staging dir so svscan ignores it mid-build (#54834 )	2026-06-29 21:33:00 +10:00
e2e	fix(gateway): route SessionDB calls through AsyncSessionDB	2026-06-29 15:51:57 -07:00
fakes
fixtures/plugins/example-dashboard/dashboard
gateway	fix(gateway): persist compressed transcript before repointing /compress session	2026-07-01 01:39:23 -07:00
hermes_cli	fix(runtime): honor NOUS_INFERENCE_BASE_URL across pool/explicit/aux paths	2026-07-01 01:52:06 -07:00
hermes_state	fix(state): exclude delegate/branch/tool children from resume walk + reconcile salvaged fixes	2026-06-25 16:29:09 -07:00
honcho_plugin	feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335 )	2026-06-22 19:16:47 -05:00
integration	feat(web_extract): truncate-and-store instead of LLM summarization (#54843 )	2026-06-29 10:00:49 -07:00
openviking_plugin	feat(openviking): add full recall prefetch policy	2026-06-24 18:53:49 +05:30
plugins	fix(memory/holographic): sanitize FTS5 queries for natural-language recall	2026-06-30 15:55:11 -07:00
providers	fix(models): pass model.base_url to fetch_models in /model picker	2026-06-16 13:09:40 -07:00
run_agent	fix(agent): never persist empty-response recovery scaffolding	2026-07-01 01:08:27 -07:00
scripts	revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853 )	2026-06-27 15:59:00 -07:00
skills	feat(skills): add cloudflare-temporary-deploy optional skill (#50849 )	2026-06-22 12:14:30 -07:00
stress
tools	fix(security): anchor rm hardline rules to command position (#56193 )	2026-07-01 01:54:43 -07:00
tui_gateway	fix(tui_gateway): drop emit-only session.info from _LONG_HANDLERS	2026-06-30 03:11:13 -07:00
website
__init__.py
conftest.py	feat(managed-scope): add managed_scope module (resolver, loaders, key helpers)	2026-06-19 07:46:33 -07:00
run_interrupt_test.py
test_account_usage.py
test_assistant_ui_tap_compat.py	test(deps): guard @assistant-ui cluster on one tap version	2026-06-15 11:55:02 -04:00
test_atomic_replace_symlinks.py	fix(utils): copy fallback for atomic replace across devices (#43852 )	2026-06-13 14:50:05 -07:00
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_bitwarden_secrets.py
test_cli_file_drop.py
test_cli_manual_compress.py
test_cli_skin_integration.py
test_code_skew.py	fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError	2026-06-24 04:16:54 +05:30
test_ctx_halving_fix.py
test_dashboard_sidecar_close_on_disconnect.py	fix(dashboard): hide sidecar sessions from history (#49269 )	2026-06-19 18:06:38 -04:00
test_delegate_cascade_49148.py	fix(agent): stop delegate cascade from deleting the parent session	2026-06-21 12:09:16 -07:00
test_desktop_electron_pin.py	fix(desktop): resolve electronDist dynamically + self-heal blocked installs (supersedes #48081/#48082) (#48091 )	2026-06-17 18:48:35 -05:00
test_desktop_mac_entitlements.py
test_dispatch_session_id.py	fix(dispatch): forward session_id into registry.dispatch (#28479 )	2026-06-14 00:27:59 -04:00
test_empty_model_fallback.py
test_empty_session_hygiene.py	fix: in-memory transcript blocks empty-session prune	2026-06-10 17:37:34 -07:00
test_env_loader_secret_sources.py
test_evidence_store.py
test_fast_safe_load.py	perf(startup): parse config + plugin manifests with libyaml CSafeLoader (#54486 )	2026-06-28 15:38:39 -07:00
test_gateway_streaming_nested_config.py
test_get_tool_definitions_cache_isolation.py
test_hermes_bootstrap.py	revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853 )	2026-06-27 15:59:00 -07:00
test_hermes_constants.py	fix(windows): cover remaining console-flash spawn legs (#54417 )	2026-06-28 13:49:08 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py	fix(logging): suppress Windows lock timeout tracebacks	2026-06-28 22:35:56 -07:00
test_hermes_state.py	fix(state): periodically merge FTS5 segments to curb write-lock contention	2026-07-01 14:09:15 +05:30
test_hermes_state_compression_locks.py
test_hermes_state_wal_fallback.py
test_honcho_client_concurrency.py
test_honcho_client_config.py
test_honcho_session_context.py
test_honcho_startup_fail_open.py
test_install_diverged_update.py	test(installer): cover diverged managed-clone recovery in install scripts	2026-06-30 20:11:01 +07:00
test_install_lockfile_churn.py	fix(install): discard managed lockfile churn before stashing	2026-06-25 23:49:11 -07:00
test_install_no_initial_commit.py
test_install_ps1_native_stderr_eap.py	fix(install): fail fast when uv venv genuinely fails under relaxed EAP	2026-06-18 22:11:35 +05:30
test_install_ps1_python_fallback_venv.py	test(installer): lock Python-fallback propagation into the venv stage (#50769 )	2026-06-23 21:33:08 -07:00
test_install_ps1_uv_powershell_host.py	test(install): lock uv installer to a resolved PowerShell host	2026-06-18 16:26:34 +07:00
test_install_sh_browser_install.py	test(install): track run_with_timeout extraction after #39219 refactor (#54185 )	2026-06-28 03:58:01 -07:00
test_install_sh_install_method_stamp.py	fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188 )	2026-06-18 14:14:41 +10:00
test_install_sh_node_global_prefix.py	fix(hermes): heal broken managed Node tree instead of PATH fallback	2026-06-26 20:10:20 +05:30
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py
test_install_sh_termux_network_prereqs.py
test_install_unmerged_index.py	fix(install): discard managed lockfile churn before stashing	2026-06-25 23:49:11 -07:00
test_ipv4_preference.py
test_lazy_session_regressions.py	fix(gateway): surface retry hint instead of silently dropping turn after /stop (#31884 )	2026-06-24 23:51:31 +05:30
test_lint_config.py
test_live_system_guard_self_test.py
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py
test_minisweagent_path.py
test_model_forces_max_completion_tokens.py
test_model_picker_scroll.py
test_model_tools.py	feat(moa): expose MoA presets as selectable virtual models (#46081 )	2026-06-25 13:52:06 -07:00
test_model_tools_async_bridge.py
test_ollama_num_ctx.py	test(vision): cover Ollama /api/show vision capability routing (#54511 )	2026-06-28 22:52:59 -07:00
test_output_cap_parsing.py	fix(agent): stop over-cap max_tokens 400s from death-looping into compression (#55570 )	2026-06-30 03:26:41 -07:00
test_package_json_lazy_deps.py
test_packaging_metadata.py	feat(mcp-catalog): add official Unreal Engine 5.8 MCP server	2026-06-18 09:16:40 -07:00
test_plugin_skills.py
test_plugin_utils.py
test_process_loop_event_loop_warning.py
test_profile_isolation_runtime.py	test(profile): two-profile regression suite + preserve skills_hub monkeypatch seam	2026-06-30 15:30:06 -07:00
test_project_metadata.py	fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight	2026-06-29 00:25:36 -07:00
test_retry_utils.py	fix: handle named custom providers and Z.AI overload retries	2026-06-25 00:17:17 -07:00
test_run_tests_parallel.py	fix(tests): bare pytest flags pass through run_tests.sh without a '--' separator (#54008 )	2026-06-27 22:43:26 -07:00
test_sanitize_tool_error.py
test_setup_temporary_outputs.py	refactor(ci): rewrite docker tests to check built container	2026-06-26 19:15:18 -07:00
test_slash_worker_watchdog.py
test_sql_injection.py
test_stale_utils_module_import.py	fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError	2026-06-24 04:16:54 +05:30
test_state_db_malformed_repair.py	fix(state): detect and repair FTS write corruption that silently drops gateway history (#52798 )	2026-06-25 21:18:41 -07:00
test_subprocess_home_isolation.py	fix: make profile subprocess HOME policy explicit	2026-06-14 03:20:21 -07:00
test_termux_all_extra_compat.py
test_timezone.py
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py
test_tui_gateway_loop_noise.py	fix(tui_gateway): suppress WS peer-hangup teardown error flood (#50005 ) (#54126 )	2026-06-28 02:35:01 -07:00
test_tui_gateway_queue_on_busy.py	fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry	2026-06-25 12:29:49 -05:00
test_tui_gateway_server.py	fix(tui_gateway): reject negative truncate_before_user_ordinal to prevent silent history loss	2026-07-01 01:52:58 -07:00
test_tui_gateway_ws.py	fix(tui): start MCP discovery for websocket sessions	2026-06-28 04:14:12 -07:00
test_tui_mcp_late_refresh.py	fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403 )	2026-06-18 05:41:23 -07:00
test_utils_truthy_values.py
test_web_server.py	test(web_server): assert ws-ping invariant, not frozen 20.0 literal	2026-06-30 03:11:13 -07:00
test_wheel_locales_e2e.py
test_windows_subprocess_no_window_flags.py	test: make windows no-window-flag assertions immune to update-check daemon	2026-06-30 01:35:55 -05:00
test_yaml_indent_consistency_31999.py	fix(utils): unify YAML list indent across all config writers (#31999 )	2026-06-25 23:27:44 +05:30
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py	feat(Yuanbao): support wechat forward msg (#43508 )	2026-06-12 02:06:47 -07:00
test_yuanbao_proto.py
test_yuanbao_shutdown.py