hermes-agent/tests
Tranquil-Flow e7562c394f fix(gateway): skip cross-process guard on session_id switch under same session_key (#54947)
The cross-process coherence guard (#45966) compares the session's
on-disk message_count against the snapshot stored next to the cached
agent, and rebuilds the agent on a mismatch.  The guard is correct
when the cache snapshot and the live count both refer to the same
DB row.  But the agent cache is keyed by session_key, which can
group multiple conversation threads (different session_ids) under
the same key, so the snapshot and the live count can belong to
DIFFERENT DB rows.

When the user switches from session A to session B under the same
session_key, the cache hit returns A's cached agent.  The guard then
compares A's snapshot count (A.message_count) against B's live count
(B.message_count).  The two counts track different conversations, so
they match only by coincidence, and the guard invalidates the cache.
In practice every session switch busts the prompt cache and forces a
fresh agent build.  The post-turn re-baseline (#46237) made it worse:
it reads the live count from the CURRENT session_entry.session_id, so
each switch overwrites the original snapshot with the new session's
count, and the very next switch BACK to the original session fires
the guard again.
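
The failure mode described above can be reproduced with a small
simulation.  This is a sketch only: simulate_switches, the fixed
two-messages-per-turn increment, and the starting counts are all
hypothetical stand-ins, not the gateway's actual code.  It models one
cache slot (one session_key) whose snapshot is compared against the
active session's live count and then re-baselined after every turn.

```python
from typing import Dict, List, Optional


def simulate_switches(turn_order: List[str], counts: Dict[str, int]) -> List[bool]:
    """Return, per turn, whether the pre-fix guard would fire.

    Hypothetical model: one cache slot per session_key, a snapshot that
    stores only a message count, and two new messages per turn.
    """
    live = dict(counts)              # per-session_id message_count (DB rows)
    snapshot: Optional[int] = None   # count stored next to the cached agent
    fired: List[bool] = []
    for session_id in turn_order:
        # Cache-hit guard: compare the snapshot with the ACTIVE session's count.
        fired.append(snapshot is not None and snapshot != live[session_id])
        live[session_id] += 2        # the turn appends a user + assistant message
        snapshot = live[session_id]  # post-turn re-baseline from the active session
    return fired


print(simulate_switches(["A", "A", "B", "A"], {"A": 5, "B": 12}))
# -> [False, False, True, True]
```

The second turn on A passes the guard because the re-baseline kept the
snapshot in step, while the switch to B and the switch back to A both
fire it even though no other process wrote to either session.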

This is the bug from #54947 (P0, sweeper:risk-session-state,
sweeper:risk-caching).

Fix:
  * Record the snapshot's session_id alongside the message_count in
    the cache tuple: (agent, sig, mc, session_id) — a 4-tuple.  The
    cache build at the AIAgent construction site stores the active
    session_id.
  * The cache-hit guard skips the cross-process count comparison
    when the active session_id differs from the snapshot's
    session_id — the comparison is meaningless across different DB
    rows, so the agent is REUSED without invalidation.  The cross-
    process guard still fires when the session_id matches and the
    live count differs (genuine cross-process write on the SAME
    session).
  * _refresh_agent_cache_message_count checks the snapshot's
    session_id: when it differs from the current session_id, the
    snapshot is intentionally left untouched (overwriting it would
    corrupt the original conversation's baseline and cause the
    switch-back to fire the guard).  The legacy 3-tuple shape (no
    session_id) is still re-baselined as before.
  * Backward-compat:
      - 2-tuple (agent, sig) — unchanged, opts out of the guard.
      - 3-tuple (agent, sig, mc) — unchanged behavior, standard
        cross-process check.
      - pending sentinel — unchanged, untouched by re-baseline.
      - new 4-tuple (agent, sig, mc, session_id) — full session_id-
        aware guard with skip on mismatch.
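
A minimal sketch of the tuple-shape dispatch listed above, assuming
the cache entry is a plain tuple and that any non-tuple value stands
in for the pending sentinel.  The names should_rebuild and rebaseline
are hypothetical, chosen for illustration; the real guard and
_refresh_agent_cache_message_count may be structured differently.

```python
from typing import Any


def should_rebuild(entry: Any, active_session_id: str, live_count: int) -> bool:
    """Decide whether the cached agent must be rebuilt on a cache hit."""
    if not isinstance(entry, tuple) or len(entry) == 2:
        return False                      # pending sentinel or (agent, sig): no guard
    if len(entry) == 3:
        return entry[2] != live_count     # legacy (agent, sig, mc): standard check
    _agent, _sig, mc, snapshot_sid = entry
    if snapshot_sid != active_session_id:
        return False                      # different DB rows: skip, reuse the agent
    return mc != live_count               # same session: genuine cross-process write


def rebaseline(entry: Any, active_session_id: str, live_count: int) -> Any:
    """Return the cache entry after the post-turn re-baseline."""
    if not isinstance(entry, tuple) or len(entry) == 2:
        return entry                      # untouched
    if len(entry) == 3:
        return (entry[0], entry[1], live_count)   # stays a 3-tuple
    agent, sig, _mc, snapshot_sid = entry
    if snapshot_sid != active_session_id:
        return entry                      # keep the original session's baseline
    return (agent, sig, live_count, snapshot_sid)
```

For example, with entry = (agent, sig, 5, "A"), a turn on session "B"
with a live count of 12 neither rebuilds the agent nor changes the
entry, while a turn on "A" with a live count of 7 triggers a rebuild.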

Tests:
  * tests/gateway/test_session_id_cache_coherence.py — 7 tests
    covering L1-L5 from LAYERS.md:
      - L1 session_id switch must REUSE
      - L2 cache tuple records snapshot's session_id
      - L3 re-baseline skips when session_id differs
      - L4 same-session_id turns still re-baseline (#46237 holds)
      - L5 legacy 2-tuples and pending sentinels untouched
      - legacy 3-tuple (no session_id) still guarded (#45966 holds)
      - 3-tuple transitions to 3-tuple (not 4-tuple) on re-baseline

No regressions in 70 existing tests in test_agent_cache.py or 137
related session tests.  Co-authored with #52197 (deferred cleanup
of evicted agents); both fixes compose cleanly.
2026-07-01 02:29:24 -07:00
acp fix: bound threat-pattern/FTS5 regex input and cover V4A Move-File edits 2026-07-01 01:05:28 -07:00
acp_adapter
agent fix(cache): stop verification-loop synthetic nudges from persisting (#56194) 2026-07-01 02:26:06 -07:00
ci fix(ci): classify should default to no MCP 2026-06-23 10:32:27 -07:00
cli fix(cli): route /sessions and /history through prompt_toolkit-safe printing 2026-07-01 01:25:43 -07:00
computer_use feat(computer_use): disable cua-driver telemetry by default, add opt-in (#50842) 2026-06-22 09:57:16 -07:00
cron security(cron): fail closed in scheduler backstop when validator errors 2026-07-01 14:23:01 +05:30
docker fix(s6): dot-prefix gateway staging dir so svscan ignores it mid-build (#54834) 2026-06-29 21:33:00 +10:00
e2e fix(gateway): route SessionDB calls through AsyncSessionDB 2026-06-29 15:51:57 -07:00
fakes
fixtures/plugins/example-dashboard/dashboard
gateway fix(gateway): skip cross-process guard on session_id switch under same session_key (#54947) 2026-07-01 02:29:24 -07:00
hermes_cli test(runtime): pin Anthropic OAuth → /v1/messages routing across runtime branches 2026-07-01 02:18:56 -07:00
hermes_state fix(state): exclude delegate/branch/tool children from resume walk + reconcile salvaged fixes 2026-06-25 16:29:09 -07:00
honcho_plugin feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335) 2026-06-22 19:16:47 -05:00
integration feat(web_extract): truncate-and-store instead of LLM summarization (#54843) 2026-06-29 10:00:49 -07:00
openviking_plugin feat(openviking): add full recall prefetch policy 2026-06-24 18:53:49 +05:30
plugins fix(teams-pipeline): reject dot-only recording display_name 2026-07-01 02:03:48 -07:00
providers fix(models): pass model.base_url to fetch_models in /model picker 2026-06-16 13:09:40 -07:00
run_agent fix(agent): prefer late-completing real result over timeout message (review) 2026-07-01 14:56:52 +05:30
scripts revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
skills feat(skills): add cloudflare-temporary-deploy optional skill (#50849) 2026-06-22 12:14:30 -07:00
stress
tools fix(tools): stop _strategy_exact emitting overlapping matches (#56211) 2026-07-01 02:13:13 -07:00
tui_gateway fix(tui_gateway): drop emit-only session.info from _LONG_HANDLERS 2026-06-30 03:11:13 -07:00
website
__init__.py
conftest.py feat(managed-scope): add managed_scope module (resolver, loaders, key helpers) 2026-06-19 07:46:33 -07:00
run_interrupt_test.py
test_account_usage.py
test_assistant_ui_tap_compat.py test(deps): guard @assistant-ui cluster on one tap version 2026-06-15 11:55:02 -04:00
test_atomic_replace_symlinks.py fix(utils): copy fallback for atomic replace across devices (#43852) 2026-06-13 14:50:05 -07:00
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_bitwarden_secrets.py
test_cli_file_drop.py
test_cli_manual_compress.py
test_cli_skin_integration.py
test_code_skew.py fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError 2026-06-24 04:16:54 +05:30
test_ctx_halving_fix.py
test_dashboard_sidecar_close_on_disconnect.py fix(dashboard): hide sidecar sessions from history (#49269) 2026-06-19 18:06:38 -04:00
test_delegate_cascade_49148.py fix(agent): stop delegate cascade from deleting the parent session 2026-06-21 12:09:16 -07:00
test_desktop_electron_pin.py fix(desktop): resolve electronDist dynamically + self-heal blocked installs (supersedes #48081/#48082) (#48091) 2026-06-17 18:48:35 -05:00
test_desktop_mac_entitlements.py
test_dispatch_session_id.py fix(dispatch): forward session_id into registry.dispatch (#28479) 2026-06-14 00:27:59 -04:00
test_empty_model_fallback.py
test_empty_session_hygiene.py
test_env_loader_secret_sources.py
test_evidence_store.py
test_fast_safe_load.py perf(startup): parse config + plugin manifests with libyaml CSafeLoader (#54486) 2026-06-28 15:38:39 -07:00
test_gateway_streaming_nested_config.py
test_get_tool_definitions_cache_isolation.py
test_hermes_bootstrap.py revert(windows): roll back terminal-popup PRs #53791 #53810 #53829 (#53853) 2026-06-27 15:59:00 -07:00
test_hermes_constants.py fix(windows): cover remaining console-flash spawn legs (#54417) 2026-06-28 13:49:08 -07:00
test_hermes_home_profile_warning.py
test_hermes_logging.py fix(logging): suppress Windows lock timeout tracebacks 2026-06-28 22:35:56 -07:00
test_hermes_state.py fix(state): periodically merge FTS5 segments to curb write-lock contention 2026-07-01 14:09:15 +05:30
test_hermes_state_compression_locks.py
test_hermes_state_wal_fallback.py
test_honcho_client_concurrency.py
test_honcho_client_config.py
test_honcho_session_context.py
test_honcho_startup_fail_open.py
test_install_diverged_update.py test(installer): cover diverged managed-clone recovery in install scripts 2026-06-30 20:11:01 +07:00
test_install_lockfile_churn.py fix(install): discard managed lockfile churn before stashing 2026-06-25 23:49:11 -07:00
test_install_no_initial_commit.py
test_install_ps1_native_stderr_eap.py fix(install): fail fast when uv venv genuinely fails under relaxed EAP 2026-06-18 22:11:35 +05:30
test_install_ps1_python_fallback_venv.py test(installer): lock Python-fallback propagation into the venv stage (#50769) 2026-06-23 21:33:08 -07:00
test_install_ps1_uv_powershell_host.py test(install): lock uv installer to a resolved PowerShell host 2026-06-18 16:26:34 +07:00
test_install_sh_browser_install.py test(install): track run_with_timeout extraction after #39219 refactor (#54185) 2026-06-28 03:58:01 -07:00
test_install_sh_install_method_stamp.py fix(update): scope install-method stamp to the code tree, not $HERMES_HOME (#48188) 2026-06-18 14:14:41 +10:00
test_install_sh_node_global_prefix.py fix(hermes): heal broken managed Node tree instead of PATH fallback 2026-06-26 20:10:20 +05:30
test_install_sh_pythonpath_sanitization.py
test_install_sh_root_fhs_uv_python_path.py
test_install_sh_setup_wizard_tty_probe.py
test_install_sh_symlink_stomp.py
test_install_sh_termux_network_prereqs.py
test_install_unmerged_index.py fix(install): discard managed lockfile churn before stashing 2026-06-25 23:49:11 -07:00
test_ipv4_preference.py
test_lazy_session_regressions.py fix(gateway): surface retry hint instead of silently dropping turn after /stop (#31884) 2026-06-24 23:51:31 +05:30
test_lint_config.py
test_live_system_guard_self_test.py
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minimax_oauth.py
test_minisweagent_path.py
test_model_forces_max_completion_tokens.py
test_model_picker_scroll.py
test_model_tools.py feat(moa): expose MoA presets as selectable virtual models (#46081) 2026-06-25 13:52:06 -07:00
test_model_tools_async_bridge.py
test_ollama_num_ctx.py test(vision): cover Ollama /api/show vision capability routing (#54511) 2026-06-28 22:52:59 -07:00
test_output_cap_parsing.py fix(agent): stop over-cap max_tokens 400s from death-looping into compression (#55570) 2026-06-30 03:26:41 -07:00
test_package_json_lazy_deps.py
test_packaging_metadata.py feat(mcp-catalog): add official Unreal Engine 5.8 MCP server 2026-06-18 09:16:40 -07:00
test_plugin_skills.py
test_plugin_utils.py
test_process_loop_event_loop_warning.py
test_profile_isolation_runtime.py test(profile): two-profile regression suite + preserve skills_hub monkeypatch seam 2026-06-30 15:30:06 -07:00
test_project_metadata.py fix(memory): lazy-install supermemory + mem0 SDKs like honcho/hindsight 2026-06-29 00:25:36 -07:00
test_retry_utils.py fix: handle named custom providers and Z.AI overload retries 2026-06-25 00:17:17 -07:00
test_run_tests_parallel.py fix(tests): bare pytest flags pass through run_tests.sh without a '--' separator (#54008) 2026-06-27 22:43:26 -07:00
test_sanitize_tool_error.py
test_setup_temporary_outputs.py refactor(ci): rewrite docker tests to check built container 2026-06-26 19:15:18 -07:00
test_slash_worker_watchdog.py
test_sql_injection.py
test_stale_utils_module_import.py fix(gateway): refuse model switch on stale checkout to avoid env_float ImportError 2026-06-24 04:16:54 +05:30
test_state_db_malformed_repair.py fix(state): detect and repair FTS write corruption that silently drops gateway history (#52798) 2026-06-25 21:18:41 -07:00
test_subprocess_home_isolation.py fix: make profile subprocess HOME policy explicit 2026-06-14 03:20:21 -07:00
test_termux_all_extra_compat.py
test_timezone.py
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor.py
test_trajectory_compressor_async.py
test_transform_llm_output_hook.py
test_transform_tool_result_hook.py
test_tui_gateway_loop_noise.py fix(tui_gateway): suppress WS peer-hangup teardown error flood (#50005) (#54126) 2026-06-28 02:35:01 -07:00
test_tui_gateway_queue_on_busy.py fix(tui_gateway): queue mid-turn prompts instead of dropping them on a busy retry 2026-06-25 12:29:49 -05:00
test_tui_gateway_server.py fix(tui_gateway): reject negative truncate_before_user_ordinal to prevent silent history loss 2026-07-01 01:52:58 -07:00
test_tui_gateway_ws.py fix(tui): start MCP discovery for websocket sessions 2026-06-28 04:14:12 -07:00
test_tui_mcp_late_refresh.py fix(tui): refresh tool snapshot when MCP discovery lands after agent build (#48403) 2026-06-18 05:41:23 -07:00
test_utils_truthy_values.py
test_web_server.py test(web_server): assert ws-ping invariant, not frozen 20.0 literal 2026-06-30 03:11:13 -07:00
test_wheel_locales_e2e.py
test_windows_subprocess_no_window_flags.py test: make windows no-window-flag assertions immune to update-check daemon 2026-06-30 01:35:55 -05:00
test_yaml_indent_consistency_31999.py fix(utils): unify YAML list indent across all config writers (#31999) 2026-06-25 23:27:44 +05:30
test_yuanbao_integration.py
test_yuanbao_markdown.py
test_yuanbao_pipeline.py feat(Yuanbao): support wechat forward msg (#43508) 2026-06-12 02:06:47 -07:00
test_yuanbao_proto.py
test_yuanbao_shutdown.py