feat(tools): progressive tool disclosure for MCP and plugin tools

Adds Tool Search, a structured-tools progressive-disclosure layer that
replaces MCP and non-core plugin tools in the model-visible tools array
with three bridge tools (tool_search / tool_describe / tool_call) when
the deferrable surface would consume more than a configurable percentage
of the active model's context window. Core Hermes tools are never deferred.

Default mode is 'auto' with a 10% context threshold, so small toolsets
pay no overhead. Set tools.tool_search.enabled to 'on' to force
activation or 'off' to disable.
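
A minimal sketch of how the activation decision described above could look, assuming a char/4 token estimate over the serialized schemas. The function names `estimate_tokens` and `should_activate` are hypothetical and not taken from tools/tool_search.py:

```python
import json


def estimate_tokens(tool_defs):
    """Rough token estimate: serialized schema characters divided by 4."""
    return sum(len(json.dumps(d)) for d in tool_defs) // 4


def should_activate(mode, deferrable_defs, context_length, threshold_pct=10):
    """Decide whether deferrable tools are swapped for the bridge tools.

    Hypothetical helper; mode is one of "auto", "on", "off".
    """
    if mode == "off" or not deferrable_defs:
        return False  # pass-through: nothing to defer
    if mode == "on":
        return True  # at least one deferrable tool exists
    # "auto": activate only when schemas exceed the share of the context window
    budget = context_length * threshold_pct / 100
    return estimate_tokens(deferrable_defs) > budget
```

With this shape, a handful of small schemas against a large context window stays below the budget, so the tools array is left untouched.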
The design addresses the OpenClaw production failure modes documented
in the openclaw-tool-search-report:
- Core tools never defer (toolsets._HERMES_CORE_TOOLS). Addresses the
  'tools silently missing from isolated cron turns' regression class
  (openclaw#84141) by construction: there is no code path that can
  drop a core tool.
- Catalog is stateless across turns — rebuilt from the live tool-defs
  list on every assembly. No session-keyed Map that can drift out of
  sync with the registry.
- tool_call unwraps the bridge call before any hook fires, so plugin
  pre/post hooks, guardrails, approval flows, and the activity feed
  all see the underlying tool name, not the bridge (addresses
  openclaw#85588 and the verbose-mode complaint on openclaw#79823).
- The unwrap happens in both the parallel and sequential paths of
  agent/tool_executor.py and also in handle_function_call, so direct
  callers (sandboxed code, eval harnesses) are covered too.
- Bridge tools cannot invoke each other (recursion guard) and cannot
  invoke core tools (those must be called directly).
- Tools mode only — no JS-sandbox code-mode. Keeps the surface small.
- Token estimation via cheap char/4 heuristic; precision isn't needed
  for the threshold decision.
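
The unwrap and guard behaviour in the bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this commit: the argument keys `name` and `arguments`, the `BridgeCallError` type, and the small `CORE_TOOLS` set (a stand-in for toolsets._HERMES_CORE_TOOLS) are all hypothetical.

```python
BRIDGE_TOOLS = frozenset({"tool_search", "tool_describe", "tool_call"})
# Stand-in for toolsets._HERMES_CORE_TOOLS; the real set is larger.
CORE_TOOLS = frozenset({"terminal", "read_file", "write_file"})


class BridgeCallError(ValueError):
    """Raised when a tool_call payload targets a tool it may not invoke."""


def unwrap_tool_call(name, args):
    """Return the (tool_name, tool_args) that hooks and guardrails should see.

    Calls other than tool_call pass through unchanged. The payload keys
    "name" and "arguments" are assumed for illustration.
    """
    if name != "tool_call":
        return name, args
    target = args.get("name")
    if not isinstance(target, str) or not target:
        raise BridgeCallError("tool_call requires a target tool name")
    if target in BRIDGE_TOOLS:
        # Recursion guard: bridge tools cannot invoke each other.
        raise BridgeCallError(f"{target} is a bridge tool")
    if target in CORE_TOOLS:
        # Core tools are always visible and must be called directly.
        raise BridgeCallError(f"{target} is a core tool; call it directly")
    return target, args.get("arguments") or {}
```

Running the unwrap once, before any hook fires, means every downstream observer receives the same underlying name regardless of which execution path handled the call.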
Files:
- tools/tool_search.py — new module (BM25 retrieval, classification,
  threshold gate, bridge dispatch, unwrap helper).
- tests/tools/test_tool_search.py — 35 tests including the OpenClaw
  #84141 regression guard.
- model_tools.py — wires assembly into _compute_tool_definitions as the
  final step, adds skip_tool_search_assembly kwarg so the bridge can
  see the real catalog, dispatches the three bridge tools.
- agent/tool_executor.py — unwraps tool_call in both parallel and
  sequential parsing loops so checkpointing, guardrails, plugin hooks,
  and tool-progress callbacks all observe the underlying tool name.
- hermes_cli/config.py — DEFAULT_CONFIG['tools']['tool_search'] block.
- website/docs/user-guide/features/tool-search.md — user docs.
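
For readers unfamiliar with the BM25 retrieval mentioned for tools/tool_search.py, here is a compact, generic sketch of BM25 ranking over tool descriptions. It is not the module's implementation: the tokenizer, the `k1` and `b` values, and the `bm25_rank` name are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75, limit=5):
    """Rank tool names by BM25 score; docs maps tool name -> searchable text."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokenized)
    if n == 0:
        return []
    avg_len = sum(len(t) for t in tokenized.values()) / n or 1.0
    # Document frequency: number of tools whose text contains each term.
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score  # tools with no matching term are dropped
    return sorted(scores, key=scores.get, reverse=True)[:limit]
```

Because the catalog is rebuilt on every assembly, a sketch like this recomputes the statistics per query, which is cheap at the scale of a few hundred tool descriptions.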
Validation:
- 35/35 new tests pass.
- Existing tool/registry/model_tools/config/coercion/executor tests
  (82 + 74 + small adjacents) green.
- Live E2E: 20 fake MCP tools registered, get_tool_definitions returns
  3 bridges, tool_search returns top 3 hits, tool_describe returns
  full schema, tool_call dispatches to the real underlying handler
  and the underlying result is what the model sees.
- Reserved-name recursion guard verified live.
- Core-tool refusal via tool_call verified live.
parent 73d73f1f0d
commit 369075dc95
6 changed files with 1453 additions and 1 deletions
@@ -1785,6 +1785,38 @@ DEFAULT_CONFIG = {
         "mode": "project",
     },
 
+    # Tool Search (progressive disclosure for large tool surfaces).
+    # When the model is connected to many MCP servers or non-core plugin
+    # tools, their JSON schemas can consume a substantial fraction of the
+    # context window on every turn. When enabled, those tools are replaced
+    # in the model-facing tools array with three bridge tools —
+    # tool_search / tool_describe / tool_call — and surfaced on demand.
+    #
+    # Core Hermes tools (terminal, read_file, write_file, patch,
+    # search_files, todo, memory, browser_*, etc.) are NEVER deferred.
+    # See tools/tool_search.py for full design notes and the
+    # openclaw-tool-search-report PDF in this PR for the rationale.
+    "tools": {
+        "tool_search": {
+            # "auto" (default) — activate only when deferrable tool schemas
+            # exceed ``threshold_pct`` of the active model's context length,
+            # so small toolsets pay no overhead.
+            # "on" — always activate when there is at least one deferrable
+            # tool. Use when you have many MCP servers and want maximum
+            # token reduction unconditionally.
+            # "off" — disable entirely. Tools-array assembly is a pass-through.
+            "enabled": "auto",
+            # Percentage of context length at which "auto" mode kicks in.
+            # 10 matches the Claude Code default. Range 0..100.
+            "threshold_pct": 10,
+            # When the model calls tool_search without a ``limit`` argument,
+            # how many hits to return. Range 1..max_search_limit.
+            "search_default_limit": 5,
+            # Hard upper bound the model can request via ``limit``. Range 1..50.
+            "max_search_limit": 20,
+        },
+    },
+
     # Logging — controls file logging to ~/.hermes/logs/.
     # agent.log captures INFO+ (all agent activity); errors.log captures WARNING+.
     "logging": {
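
The ranges stated in the config comments above could be enforced with a small normalization step such as the following. This is a sketch under assumptions: `normalize_tool_search_config` and `_clamp` are hypothetical names, and falling back to the default on an unparseable value is one possible policy, not necessarily the one the commit uses.

```python
DEFAULTS = {
    "enabled": "auto",
    "threshold_pct": 10,
    "search_default_limit": 5,
    "max_search_limit": 20,
}


def _clamp(value, low, high, default, cast=int):
    """Coerce value with cast and clamp into [low, high]; use default on bad input."""
    try:
        number = cast(value)
    except (TypeError, ValueError):
        number = default
    return max(low, min(high, number))


def normalize_tool_search_config(raw):
    """Hypothetical helper: merge user settings over DEFAULTS and apply ranges."""
    cfg = {**DEFAULTS, **(raw or {})}
    mode = str(cfg["enabled"]).lower()
    if mode not in {"auto", "on", "off"}:
        mode = DEFAULTS["enabled"]
    max_limit = _clamp(cfg["max_search_limit"], 1, 50, DEFAULTS["max_search_limit"])
    return {
        "enabled": mode,
        # Range 0..100
        "threshold_pct": _clamp(cfg["threshold_pct"], 0, 100, DEFAULTS["threshold_pct"], float),
        # Range 1..max_search_limit
        "search_default_limit": _clamp(
            cfg["search_default_limit"], 1, max_limit, DEFAULTS["search_default_limit"]
        ),
        # Range 1..50
        "max_search_limit": max_limit,
    }
```

Clamping `max_search_limit` first keeps the dependent bound on `search_default_limit` consistent even when both values are out of range.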