hermes-agent/website/docs
Teknium 543d305bbb
feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756)
MoA per-turn latency is dominated by advisor GENERATION: across 52 measured
turns, turn wall time correlates ~0.88 with output tokens and ~-0.03 with input
tokens. Each turn waits for the slowest advisor to finish writing, and advisors
were uncapped, so they wrote multi-thousand-token essays when the aggregator
only needs the gist.

Add an opt-in per-preset reference_max_tokens knob (mirrors reference_temperature)
that caps ADVISOR output only; the acting aggregator is never capped. Default
None = uncapped, so existing presets are byte-for-byte unchanged (no regression).
Wired through both MoA execution paths (MoAChatCompletions.create and
aggregate_moa_context).
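
A minimal sketch of the capping rule described above, not the actual code in agent/moa_loop.py. The helper names `build_advisor_requests` and `build_aggregator_request`, and the use of a `max_tokens` request field, are assumptions made for illustration only:

```python
from typing import Any, Dict, List, Optional

Message = Dict[str, str]


def build_advisor_requests(
    advisor_models: List[str],
    messages: List[Message],
    reference_max_tokens: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Build one request payload per advisor (hypothetical helper).

    The cap is applied only when reference_max_tokens is set, so the
    default of None leaves the payload exactly as it was before.
    """
    requests: List[Dict[str, Any]] = []
    for model in advisor_models:
        payload: Dict[str, Any] = {"model": model, "messages": messages}
        if reference_max_tokens is not None:
            # Assumed field name; the real request key may differ.
            payload["max_tokens"] = reference_max_tokens
        requests.append(payload)
    return requests


def build_aggregator_request(model: str, messages: List[Message]) -> Dict[str, Any]:
    """Build the aggregator payload (hypothetical helper); never capped."""
    return {"model": model, "messages": messages}
```

Keeping the cap out of the aggregator payload altogether, rather than passing a large value, is what would make the uncapped default a no-op for existing presets.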

E2E: same task, closed preset uncapped vs reference_max_tokens=600 -> 59s to 33s
(~44% less wall time), final answer identical/correct.
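
For reference, the ~44% figure is the relative drop in wall time between the two E2E runs. The function names below are illustrative, and the only inputs are the 59s and 33s timings quoted above:

```python
def latency_reduction(before_s: float, after_s: float) -> float:
    """Fraction of wall time removed: (before - after) / before."""
    return (before_s - after_s) / before_s


def speedup(before_s: float, after_s: float) -> float:
    """Throughput-style speedup factor: before / after."""
    return before_s / after_s


reduction = latency_reduction(59, 33)  # 26 / 59
factor = speedup(59, 33)               # 59 / 33
print(f"{reduction:.1%} less wall time, {factor:.2f}x speedup")
```

The two measures differ: a 44% reduction in wall time corresponds to roughly a 1.79x speedup, not a 1.44x one.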

- hermes_cli/moa_config.py: _coerce_int_or_none helper + reference_max_tokens
  in _normalize_preset/_default_preset/flattened view
- agent/moa_loop.py: read preset.reference_max_tokens, pass to reference fan-out
- agent/conversation_loop.py: pass reference_max_tokens on the per-turn path
- tests + docs
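
The config-side change can be sketched as follows. This is an assumption about how a helper like `_coerce_int_or_none` might behave, not the code in hermes_cli/moa_config.py; in particular, treating booleans and non-positive values as "uncapped" is a guess, and `normalize_preset` here is a simplified stand-in for `_normalize_preset`:

```python
from typing import Any, Dict, Optional


def coerce_int_or_none(value: Any) -> Optional[int]:
    """Return a positive int, or None when the value is unset or unusable.

    Assumed behavior: None, booleans, non-numeric input, and values <= 0
    all map to None, which means "no cap".
    """
    if value is None or isinstance(value, bool):
        return None
    try:
        number = int(value)
    except (TypeError, ValueError, OverflowError):
        return None
    return number if number > 0 else None


def normalize_preset(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a preset and normalize reference_max_tokens (simplified stand-in)."""
    preset = dict(raw)
    preset["reference_max_tokens"] = coerce_int_or_none(
        raw.get("reference_max_tokens")
    )
    return preset
```

With this shape, a preset that omits the key normalizes to `reference_max_tokens = None`, matching the uncapped default stated in the commit message.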
2026-07-02 00:16:35 -07:00
Name | Last commit | Date
developer-guide | revert: back out prompt_caching.enabled toggle (#56105) for re-evaluation (#56126) | 2026-07-01 00:20:32 -07:00
getting-started | docs: create dev venv outside the source tree (root-cause fix for #7779) (#54862) | 2026-06-29 10:00:37 -07:00
guides | feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) | 2026-07-01 05:25:33 -07:00
integrations | feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) | 2026-07-01 05:25:33 -07:00
reference | fix(telegram): cap initialize() with per-attempt timeout so unreachable fallback IPs can't hang startup | 2026-07-01 05:07:10 -07:00
user-guide | feat(moa): add reference_max_tokens to cap advisor output and cut turn latency (#56756) | 2026-07-02 00:16:35 -07:00
index.mdx | feat(docs): clarify platform support | 2026-06-26 11:37:56 -07:00
user-stories.mdx | docs(website): add User Stories and Use Cases collage page (#18282) | 2026-04-30 23:56:59 -07:00