hermes-agent

Author	SHA1	Message	Date
teknium1	eb99f82ce4	fix(browser): surface launch diagnostics when debug browser never opens the CDP port Follow-up to the salvaged early-exit retry fix (#35617): the debug-browser launch path was fire-and-forget (stderr to DEVNULL, no logging), so every platform failure — Windows singleton forward to an existing instance, bad profile dir, missing shared libraries, policy blocks — collapsed into the same unactionable 'port 9222 isn't responding yet' message and debug reports contained nothing. - launch_chrome_debug() returns a structured ChromeDebugLaunch with per-candidate attempts (state, exit code, stderr tail) - browser stderr is captured to <hermes_home>/chrome-debug/launch-stderr.log - clean exit (code 0) without the port opening is detected as Chromium's single-instance forward and produces a targeted user hint to close all running instances of that browser - crash exits surface the stderr tail (e.g. missing libnspr4.so) - every spawn/exit is logged to agent.log so hermes debug share captures it - CLI (/browser connect) and TUI/desktop (browser.manage) both print the hint	2026-07-03 01:05:22 -07:00
LeonSGP43	c74f093523	fix(browser): retry next candidate when debug launch exits early	2026-07-03 01:05:22 -07:00
Mibayy	ce9aa869fc	feat(commands): /compact alias + --preview/--dry-run flags for /compress (#3243 salvage) Salvaged from PR #3243 by @Mibayy, reimplemented against current main (the original diff targeted a removed gateway/run.py handler). - /compact is now a first-class alias of /compress (CLI, gateway, Telegram/Slack/Discord command lists, autocomplete) — also fixes the dangling '/compact' references in gateway error messages (gateway/run.py context-exhausted banners). - --preview / --dry-run: report what WOULD be compressed (message counts, token estimate, 'here [N]' boundary) without touching the transcript. Flags coexist with the existing 'here [N]' / focus-topic args on both the CLI and gateway surfaces via shared pure helpers in hermes_cli/partial_compress.py. - --aggressive (LLM-free hard truncation) is intentionally NOT implemented: it would need its own transcript-persistence branch outside the guarded _compress_context rotation machinery (#44794 data-loss class). The flag is recognized and returns an explanatory message pointing at '/compress here [N]' and /undo instead of being mis-parsed as a focus topic. - locales: gateway.compress.aggressive_unsupported added to all 16 catalogs (parity test enforced). - release.py: AUTHOR_MAP entry for contributor credit.	2026-07-02 05:10:31 -07:00
Teknium	3f2a56d1a4	fix(cli): reliable interrupts, bounded exit, and exit feedback (#57000 ) Three CLI reliability fixes: 1. Interrupt reliability: chat() only re-queued the user's interrupt message when the turn result carried interrupted=True. When the agent thread raced past its last interrupt check (or finished) before the interrupt landed, the message was silently dropped — and the stale _interrupt_requested flag left on the agent instantly aborted the NEXT turn. Un-acknowledged interrupt messages are now re-queued as the next turn and the stale flag is cleared (only when the agent thread actually exited). The clarify-race path also parks the message in _pending_input instead of dropping it. 2. Slow exit (5+ min): stdlib ThreadPoolExecutor workers are non-daemon and joined unconditionally by concurrent.futures' atexit hook — even after shutdown(wait=False). One wedged tool worker (abandoned after interrupt/timeout) held the process open forever. Promoted async_delegation's daemon executor to a shared tools/daemon_pool module and adopted it in tool_executor (concurrent tool batches), memory_manager (background sync), delegate_tool (child timeout wrapper + batch fan-out), and skills_hub (source fan-out). Added a 30s exit watchdog (HERMES_EXIT_WATCHDOG_S) armed at _run_cleanup start as a backstop for wedged cleanup steps. 3. Exit jank: after prompt_toolkit tears down the input/status bars the terminal sat silent for the whole cleanup window, looking hung. Print 'Shutting down… (finalizing session)' immediately at exit start. E2E: live PTY interrupt of a foreground 'sleep 120' terminal tool now aborts in ~1s and the typed message runs as the next turn; wedged-worker + wedged-cleanup subprocess exits in 5.8s (watchdog) instead of hanging.	2026-07-02 04:20:43 -07:00
srojk34	a76aa6198c	fix(cli): flush un-persisted messages before /resume and /branch end the old session compress_context() and /new already flush un-persisted messages before calling end_session() (fixed in #47202), but /resume and /branch still call end_session() directly. When a turn is interrupted mid-flight and the user immediately runs /resume or /branch, messages generated during that turn have not yet been written to state.db and are silently lost on session rotation. Add the same best-effort _flush_messages_to_session_db() call before end_session() in both _handle_resume_command and _handle_branch_command, mirroring the pattern established in cli.py:new_session(). Regression tests verify the flush is called when an agent is present.	2026-07-01 17:08:55 +05:30
Teknium	74809b4e94	fix(cli): reap dead-locked worktrees so .worktrees/ can't grow unbounded (#56288 ) hermes -w locks each worktree (reason 'hermes pid=<pid>'). git worktree remove --force (single -f) refuses a locked tree, so a crashed session's lock was never released and its worktree accumulated forever — a real contributor to .worktrees/ bloat. _prune_stale_worktrees now classifies each lock via _worktree_lock_is_live: a live-owner pid is skipped at any age; a dead-owner (or foreign) lock is unlocked first so the aggressive age-based cleanup can actually reap it. The >72h reap tier is kept (that cleanup is intentional) but now guarded so dirty/unpushed work is preserved, and branch deletion is gated on git worktree remove succeeding. New fail-safe helpers _worktree_is_dirty and _worktree_lock_is_live (pid liveness via gateway.status._pid_exists, Windows-safe).	2026-07-01 03:43:20 -07:00
YLChen-007	e23f723389	fix: make streaming reasoning-tag filter case-insensitive The streaming think-tag suppressors in cli.py (_stream_delta) and gateway/stream_consumer.py (_filter_and_accumulate) matched tag names with case-sensitive str.find(), so only the exact-case literals in the tag tuples were caught. Mixed-case variants a model may emit — <Think>, <ThInK>, <REASONING>, <Thought> — slipped through and leaked raw reasoning into the user-visible stream. Match against a lowercased view of the buffer with lowercased tag names at all three sites (open-tag boundary search, partial-tag hold-back, close-tag search) in both paths. Only KNOWN tag names are matched — no substring matching — and the block-boundary gating that protects prose mentions of <think> is preserved. - 6 parametrized case-insensitive regression tests in each of tests/gateway/test_stream_consumer.py and tests/cli/test_stream_delta_think_tag.py. Salvaged from PR #27289 by @YLChen-007.	2026-07-01 03:25:02 -07:00
redactdeveloper	b94397fe76	fix(cli): route /sessions and /history through prompt_toolkit-safe printing Bare print() output is swallowed by patch_stdout while an interactive prompt_toolkit Application owns the terminal, so /sessions and /history rendered nothing. Route those emissions through _cprint (prompt_toolkit's native renderer) when an app is running, and fall back to print otherwise. Fixes #36815	2026-07-01 01:25:43 -07:00
Tranquil-Flow	c1a0c0ada7	fix(cli): re-land interrupt_queue drain so finished turns flush stray input The CLI routes user input typed while the agent is running into ``_interrupt_queue`` (separate from ``_pending_input``) so the explicit interrupt path can opt to deliver them as a single combined message. That path only drains the queue when ``busy_input_mode == "interrupt"`` AND a ``pending_message`` was acknowledged. If the agent's turn finishes naturally (no interrupt fires), any messages typed during the turn stay stuck in ``_interrupt_queue`` forever. Subsequent ``Enter`` presses route input to the same blocked queue and the CLI appears to hang. Original report: lunarnexus in The fix restores the post-turn drain that was originally part of drain off as "worth its own review" and never re-landed it; the user-visible regression is that any non-interrupt-mode user typing during a turn is silently dropped. Implementation: extract the drain to a small helper ``_drain_interrupt_queue_to_pending_input`` matching the existing ``_maybe_continue_goal_after_turn`` style. ``process_loop``'s ``finally`` block calls it once per turn after the status-line refresh and before goal continuation (so re-queued user input preempts an auto-continuation prompt). The helper swallows ``Exception`` so it can never break the main loop. Addresses #20271.	2026-07-01 00:12:32 -07:00
Neo	c969090878	fix(cli): clear input-blocking overlays when interrupting a running agent Interrupting the agent while an approval/clarify/sudo/secret prompt is up left the overlay state dict set with no thread servicing it. The prompt's worker thread is torn down on interrupt, but read_only (gated on _command_running) plus the keypress filter kept the CLI input locked until the prompt's own timeout expired — the terminal appeared frozen. Drain and clear all four input-blocking overlays on interrupt via a single helper (_clear_active_overlays_for_interrupt): approval -> deny, clarify/sudo/secret -> cancel, each guarded so a dead queue can't block the others; sudo restores the pre-modal draft. Wired into all three interrupt paths — new-message interrupt, Ctrl+C, and Ctrl+Q. Blocking overlays now clear AND fall through so one keypress both clears a stale overlay and interrupts a still-running agent; the /model picker and slash-confirm foreground prompts keep their cancel-and-return behavior. Closes #13618.	2026-06-30 04:49:29 -07:00
teknium1	1cae1bd0de	test(cli): deterministically join bg worker thread instead of polling deadline test_background_task_registers_thread_local_approval_callbacks polled a 2s wall-clock deadline waiting for the background daemon thread to pop its entry from _background_tasks. Under loaded CI the thread's finally-block cleanup could lag the deadline, flaking the final 'assert not cli._background_tasks'. Join the actual worker thread (timeout=10) so the wait ends exactly when the thread finishes.	2026-06-30 04:23:03 -07:00
Teknium	52a853f5c3	fix(test): pin monotonic clock in spinner-elapsed test to fix CI flake (#54203 ) test_spinner_elapsed_format_is_fixed_width_to_reduce_wrap_jitter derived _tool_start_time from the live time.monotonic() clock (now - 65.2 / now - 9.2). monotonic()'s epoch is arbitrary — on a host where monotonic() < 65.2 (fresh subprocess on a freshly-booted CI runner) the start time went negative, the (t0 > 0) guard in _render_spinner_text() dropped the '(elapsed)' suffix, and short.split('(',1)[1] raised IndexError: list index out of range. Deterministic given a small clock, so it would keep flaking, not clear on rerun. Pin time.monotonic to a fixed 1000.0 and offset _tool_start_time from it so both the <60s and >=60s paths always render the elapsed suffix regardless of the runner's monotonic epoch. Pre-existing main flake (surfaced in CI test slice 1/8).	2026-06-28 04:16:25 -07:00
teknium1	64972b6403	fix(config): canonicalize model.name/model.model to model.default (#34500 ) A custom_providers config that names the model under model.name (or model.model) resolved to an empty model, so the API request went out with model= — HTTP 400 from OpenAI-compatible backends. Display paths (hermes status/dump) already read model.name and showed the model, making the failure silent. The model id was read via 'default or model' at ~14 independent sites (cli, gateway, cron, curator, oneshot, fallback, profiles, ...), none of which honored 'name'. Rather than patch every site, canonicalize at the single load/save chokepoint: _normalize_root_model_keys() now promotes model.model/model.name -> model.default (precedence default > model > name) and drops the stale alias, so every reader — present and future — sees a populated default and config.yaml is migrated canonical on next save. The gateway, which bypasses load_config(), replays the same normalization in _load_gateway_config(). Co-authored-by: Bartok9 <danielrpike9@gmail.com> Credit: root-cause analysis and fix direction from @Bartok9 (#34502, first) and @v86861062 (#34527).	2026-06-28 02:05:13 -07:00
teknium1	1f72ad9be9	refactor(cli): extract interrupt recovery to a testable helper Pull the #33271 post-interrupt recovery (flush_stdin + _force_full_redraw) out of process_loop's finally block into _recover_terminal_after_interrupt(), and replace the inline-logic-copy tests with ones that exercise the real helper plus a source guard that process_loop still invokes it behind the _last_turn_interrupted gate.	2026-06-28 01:08:09 -07:00
zccyman	f3aaba7f85	fix(cli): recover terminal state after interrupt to prevent raw control sequence freeze When the agent is interrupted during processing, prompt_toolkit's renderer and VT100 input parser can be left in an inconsistent state. CSI 6n cursor position report responses leak as literal text (^[[19;1R) and the terminal stops accepting keyboard input. Fix: in process_loop's finally block, after an interrupted turn: - flush_stdin() to drain stray escape bytes from the OS input buffer - _force_full_redraw() to reset prompt_toolkit's renderer cache Closes #33271	2026-06-28 01:08:09 -07:00
Teknium	3b44a3c8bb	feat(moa): show each reference model's output as a labelled block before the aggregator (#53793) When a MoA preset is selected, each reference model's answer now renders in the CLI as a thinking-style block labelled with its source model, BEFORE the aggregator responds — so the mixture-of-agents process is visible instead of a silent pause. The aggregator's response (and its tool actions) follow as normal. Mechanism (shared seam, all surfaces): - MoAChatCompletions/MoAClient take an optional reference_callback and emit 'moa.reference' (index/count/label/text) per reference, then 'moa.aggregating' (aggregator label) once. agent_init wires this to the agent's tool_progress_callback, which every surface already consumes — so the events reach CLI/TUI/desktop/gateway with no new plumbing. - CLI _on_tool_progress renders 'moa.reference' as a labelled '┊ ◇ Reference i/n — <model>' header + a thinking-style preview (reusing _emit_reasoning_preview), and 'moa.aggregating' as a spinner transition. Display-only; never touches message history (cache-safe). Turn-scoped reference cache: the agent loop calls the facade once per tool-loop iteration, but the advisory message view is identical across iterations within a turn, so references are now run AND displayed once per user turn (keyed by the advisory view's signature) instead of re-running/re-spamming on every iteration. This also cuts reference API cost from O(iterations) back to O(turns). Verified live via interactive PTY on the opus-gpt preset (gpt-5.5 + opus refs): reference blocks render once per turn, labelled by model, before the aggregator; fresh blocks on each new turn; aggregator tool actions still execute. Follow-up: TUI/desktop rich rendering + gateway batched-summary already receive the events via tool_progress_callback; their surface-specific renderers are a separate change.	2026-06-27 12:45:23 -07:00
Teknium	d470ed0c4c	fix(cli): commit tool scrollback lines in verbose mode (non-streaming/MoA) (#53785 ) In the interactive CLI, the aggregator's tool calls under a MoA preset (or any non-streaming model call, e.g. copilot-acp) appeared to overwrite each other instead of building scrollable history. Each tool only updated the transient spinner line; no committed scrollback line was printed. Root cause: persistent tool lines in _on_tool_progress's tool.completed branch were gated on tool_progress_mode in {all, new}, omitting 'verbose'. Streaming models hid the bug because _on_tool_gen_start commits a 'preparing' line per tool during streaming; non-streaming calls (MoA forces _use_streaming=False) never emit that, so under 'verbose' there was no committed line at all — only the self-overwriting spinner. 'verbose' is strictly more than 'all', so it now commits the same scrollback line. Verified live via interactive PTY on the MoA opus-gpt preset: three terminal calls in turn 1 and two in turn 2 each render as separate persistent lines.	2026-06-27 12:29:55 -07:00
HiddenPuppy	b34771fc06	fix(cli): disable prompt_toolkit CPR queries to stop escape-sequence leak (#13870 ) prompt_toolkit's renderer sends ESC[6n cursor-position queries before painting in non-fullscreen mode; the terminal replies ESC[<row>;<col>R. Over SSH/cloudflared tunnels and slow PTYs these replies race past the input parser and land in the display as raw '20;1R21;1R' text, and the pending-CPR future can stall the renderer so the prompt freezes after the agent's final answer. Build the prompt_toolkit output with enable_cpr=False so CPR is marked NOT_SUPPORTED up front and ESC[6n is never sent. This is the root-cause counterpart to the existing input-side _strip_leaked_terminal_responses scrubbing. Vt100_Output.from_pty() does not expose enable_cpr in prompt_toolkit 3.x, so _build_cpr_disabled_output() reproduces its get_size setup and calls the constructor directly; it returns None on any failure so startup falls back to the default output. Verified in a real PTY: baseline emits 1 ESC[6n query, the fix emits 0, banner/UI render identically. Layout is unaffected — with CPR off the renderer sizes the prompt to its preferred height (the same fallback prompt_toolkit uses on any terminal that doesn't answer CPR). Co-authored-by: Hermes Agent <noreply@nousresearch.com>	2026-06-27 04:15:20 -07:00
teknium1	50f6855217	feat(moa): make /moa one-shot only; route preset switching through the model picker /moa no longer does a sticky model switch. It now always runs a single prompt through the default MoA preset and restores the prior model afterward; the whole argument is the prompt (no preset-name matching). To switch to a MoA preset for the session, select it from the model picker, where presets already surface under a virtual Mixture of Agents provider on every model-selection surface. Also fixes #53444: the TUI one-shot only set session[model_override], which the already-built cached agent ignored, so MoA silently never ran and the turn used the original model. The TUI now does a real in-place agent.switch_model() via _apply_model_switch() when a live agent exists (with a proper restore after the turn), and falls back to a model_override for lazy/unbuilt sessions. Removes the redundant sticky-switch branch from the CLI, gateway, and TUI /moa handlers; updates the command description, usage string, and docs.	2026-06-27 03:09:09 -07:00
ethernet	bcc3eb3419	fix(ci): rip out some xdist legacy stuff... how did these ever work??	2026-06-26 19:15:18 -07:00
Brooklyn Nicholson	985350dd85	feat(cli): note background delegate_task dispatch in _on_tool_complete A top-level delegate_task dispatches in the background and re-enters as a fresh turn when done. Print a one-line dispatch-time note — no spinner, nothing to poll — so the idle prompt doesn't read as "nothing happened."	2026-06-25 19:57:58 -05:00
Teknium	fd2a35b169	fix: stop reporting cache-hit rate and cost across all UI surfaces (#52717) * fix: stop reporting cache-hit rate and cost across all UI surfaces Cost estimates and cache read/write token reporting are unreliable on providers that don't surface cached_tokens (e.g. ollama-cloud, which doesn't implement prompt_tokens_details.cached_tokens), producing misleading near-zero 'cache hit' readouts and cost figures. Remove cost + cache-hit reporting from every user-facing surface; keep input/output/total token counts (provider-agnostic and accurate) and the Nous account billing UI (real account money, separate from per-conversation estimates). Surfaces: - CLI /usage + model-info: drop cost lines + cache read/write token lines - Gateway /usage + /model: drop cost + cache lines - tui_gateway/server.py: stop emitting cost_usd / cache_read in usage and subagent.complete payloads - TUI (Ink): drop cost from status bar (+ showCost plumbing), /usage panel, thinking rollup, agents overlay (incl. compare view); keep token counts - Desktop Command Center: drop cost stat, per-model cost, actual-cost hint Underlying estimate_usage_cost / format_cost / insights cost columns are left intact but no longer surfaced (display-only change, reversible). * test: update TUI + gateway + CLI tests for removed cost/cache-hit reporting - CLI /usage test asserts cost/cache lines are absent, tokens present - gateway /usage test drops cost + cache asserts; removes cost-included test - TUI subagentTree summary expectation drops the cost segment - useConfigSync + appChrome status-rule tests drop showCost prop/state	2026-06-25 15:21:22 -07:00
Teknium	c6575df927	feat(moa): expose MoA presets as selectable virtual models (#46081 ) * feat(moa): expose MoA presets as selectable virtual models Reconstructed onto current main (PR #46081's base had diverged with no common ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual provider: each named preset is a selectable model under provider 'moa', and the preset's aggregator is the acting model that answers and calls tools. Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same batch pattern delegate_task uses) — all references dispatched at once, collected when every one finishes, then handed to the aggregator. Output order is preserved, failures and the MoA-recursion guard stay isolated per reference. - Removed the old mixture_of_agents model tool and moa toolset. - Added moa as a virtual provider in the provider/model inventory. - /moa is shortcut behavior over model selection (default preset / named preset / one-shot prompt). - Dashboard + Desktop manage named presets; presets appear in model pickers. - Parallel reference fan-out in agent/moa_loop.py with regression test. * fix(moa): thread moa_config through _run_agent to _run_agent_inner The reconstructed gateway MoA wiring declared moa_config on _run_agent (the profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper never forwarded it — _run_agent_inner had no such parameter, so the runtime hit NameError: name 'moa_config' is not defined on the compression-failure session sync path. Add moa_config to _run_agent_inner's signature and forward it from both wrapper call sites (multiplex and non-multiplex). Caught by tests/gateway/test_compression_failure_session_sync.py on CI shard test(4). 
* fix(moa): classify moa as a virtual provider in the catalog The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so provider_catalog() fell through to the default auth_type="api_key" with no env vars — tripping two catalog invariants: - test_provider_catalog: api_key providers must expose a credential env var - test_provider_parity: every hermes-model provider must be desktop-configurable moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that overlay as an auth_type fallback so the catalog reports moa as virtual (no real credential, no network endpoint). Exempt virtual providers from the desktop parity union check the same way 'custom' is exempt — derived from the catalog, not a hardcoded slug, so future virtual providers are covered too.	2026-06-25 13:52:06 -07:00
Brooklyn Nicholson	e495b33bf1	Merge remote-tracking branch 'origin/main' into bb/pets-merge # Conflicts: # hermes_cli/commands.py # tui_gateway/server.py	2026-06-23 19:05:22 -05:00
Teknium	70d28b62fb	feat(cli): track background subagents in the status bar (#51441 ) The classic prompt_toolkit status bar already shows two background indicators: ▶ N (/background agent threads) and ⚙ N (shell processes spawned by terminal(background=true)). Background/async subagents (delegate_task batches and background single delegations) had no indicator despite being long-running work the user should be able to see at a glance. Add a third indicator ⛓ N sourced from tools.async_delegation.active_count() — the count of delegations still in the 'running' state. Renders in the plain-text builder and the styled-fragment builder across the same width tiers as the other two (omitted on the narrow <52 tier), guarded so a raising active_count() leaves the snapshot at 0.	2026-06-23 11:09:08 -07:00
Teknium	ff85af3fc7	feat(goals): /goal wait <pid> — park the loop on a background process (#50503 ) * feat(goals): add /goal wait <pid> barrier to park the loop on a background process The /goal loop re-pokes the agent every turn via the post-turn judge. When a goal is gated on a long-running background process (CI poller, build, test matrix, deploy) that produces nothing to judge yet, this spins the agent into 'is it done?' busy-work and burns the turn budget. /goal wait <pid> [reason] parks the loop: while the PID is alive, the judge is skipped, no turn is consumed, no continuation fires, and /goal status shows a parked indicator. The barrier auto-clears the moment the process exits (the agent's notify_on_complete watcher is the natural wake signal), then the next turn resumes normal judging. /goal unwait clears it manually; pause/resume/clear drop it; a dead/stale PID can never wedge the loop. Wired across CLI, gateway, and the mid-run command guard for parity. Barrier persists in SessionDB.state_meta (survives /resume); GoalState gains backward-compatible waiting_on_pid/waiting_reason/waiting_since fields. 12 new tests; docs updated. * fix(goals): use gateway.status._pid_exists for liveness, not os.kill(pid,0) The Windows-footguns CI guard flagged os.kill(pid, 0) in _pid_alive — on Windows that's not a no-op, it routes to CTRL_C_EVENT and hard-kills the target's console process group (bpo-14484). Delegate to the canonical footgun-safe gateway.status._pid_exists (psutil + ctypes/POSIX fallback) instead, with a direct-psutil last resort. * feat(goals): judge-driven auto-wait — the loop parks itself, no manual /goal wait Makes the wait barrier automatic. 
Every turn the judge is shown the agent's live background processes (pid, command, uptime, output tail from the process_registry) alongside the goal + response, and can return a new 'wait' verdict instead of continue: {"verdict":"wait","wait_on_pid":N} → park until that process exits {"verdict":"wait","wait_for_seconds":N} → park until the deadline passes evaluate_after_turn acts on the directive (sets the barrier, parks the loop) so the agent isn't re-poked into busy-work while CI/builds/deploys run. Adds a time-based waiting_until barrier alongside the pid barrier; both auto-clear and can never wedge the loop. Drivers (CLI, gateway, tui_gateway) feed the live registry in via gather_background_processes(). Manual /goal wait stays as an override. Judge verdict contract widened to (verdict, reason, parse_failed, wait_directive); legacy {"done":bool} shape still accepted. * test(goals): update kanban _fake_judge to the 4-tuple judge contract CI test(3) caught it: test_kanban_goal_mode's _fake_judge still returned the 3-tuple (verdict, reason, parse_failed), but the kanban loop now unpacks the 4-tuple (+ wait_directive). Update the fake to return None for the directive and accept the background_processes kwarg. * feat(goals): trigger-based wait — park on a process's own signal, not just exit Addresses two gaps in the judge-driven wait: (1) the judge could only express 'wait until PID exits' or 'wait N seconds', so a long-lived watcher/server that fires a trigger MID-RUN (and may never exit) couldn't be waited on; (2) the process's own watch_patterns/notify_on_complete trigger was invisible to the judge. Adds a session-based barrier (waiting_on_session) that releases on the process's OWN trigger via process_registry.is_session_waiting(): the session exits, OR (if started with watch_patterns) its pattern matches — even while the process keeps running. 
list_sessions() now surfaces session_id + watch_patterns/watch_hit/notify_on_complete so the judge sees the trigger and is told to prefer wait_on_session for trigger processes. Judge verdict gains a {wait_on_session} directive (preferred over pid). Backward-compatible GoalState field; pid + time barriers unchanged. Tests: TestSessionTriggerBarrier (release on mid-run pattern match while alive, release on exit, unknown-session, full park→trigger→resume, parse, validation, backcompat load). 105 goal-surface + 85 process_registry tests green.	2026-06-22 06:27:29 -07:00
Brooklyn Nicholson	5342eccf12	Merge remote-tracking branch 'origin/main' into bb/pets	2026-06-22 05:25:49 -05:00
Teknium	7130d60861	feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492 ) * feat(providers): remove google-gemini-cli + google-antigravity OAuth providers Google now actively bans accounts for third-party tools that piggyback on Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention sits at a backend layer the ban can extend to the entire Google account (Gmail/Drive), with a second violation being permanent. Ref: https://github.com/google-gemini/gemini-cli/discussions/20632 Removes both OAuth inference providers entirely (modules, provider profiles, auth/runtime/config/models wiring, the /gquota Code Assist quota command, the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans). The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against generativelanguage.googleapis.com) is unaffected and stays fully supported. * fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed The antigravity-cli optional skill orchestrates the external `agy` binary as a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference through the banned google-antigravity OAuth provider, so it carries none of the account-ban risk that motivated removing that provider. Restore the skill, its docs page, the sidebar entry, and the optional-skills catalog row. The google-antigravity / google-gemini-cli inference providers stay fully removed.	2026-06-21 19:53:27 -07:00
Teknium	824c9d3812	fix(config): alias model.api_base -> model.base_url for custom providers (#50385) A bare custom provider configured via `model.api_base` (the intuitive name OpenAI-SDK / LiteLLM users reach for) was silently ignored: `hermes config set` accepts any dotted key, so `model.api_base` got written and confirmed, but the runtime resolver reads only `model.base_url`. Requests fell back to OpenRouter with an empty key -> 401, zero hits to the custom endpoint (issue #8919). Now api_base is migrated to base_url at load time (fixes existing broken configs) and at set time (with a notice), never overriding an explicit base_url. Closes #8919.	2026-06-21 13:33:41 -07:00
Teknium	b6d1072408	fix(cli): branch new worktrees from the fresh remote tip, not stale local HEAD (#50355 ) hermes -w created the worktree branch from the standalone clone's HEAD, which lags origin when the clone isn't freshly updated (it's only refreshed by hermes update, not per session). Every worktree branch then rooted on a stale base, so the PR diff GitHub computes against current main ballooned with unrelated changes and the agent had to discover the staleness at push time and rebase. _resolve_worktree_base() now fetches and branches from the freshest available ref: the current branch's upstream if it tracks one (so a deliberate feature-branch worktree tracks its own remote), else the remote's default branch (origin/HEAD), else local HEAD as a fail-soft fallback (offline / no remote / detached). A bogus 'origin/(unknown)' default is guarded, and worktree creation retries from HEAD if branching off the remote ref fails — so this is never worse than the old behavior. Gated by worktree_sync (default true); set worktree_sync: false to keep the old branch-from-local-HEAD behavior. The resolved base is printed in the session banner. This is the follow-up to the #50319 session, where the standalone clone was 213 commits behind origin and the worktree inherited that stale base.	2026-06-21 12:42:11 -07:00
Hariharan Ayappane	99233faf78	fix(cli): persist sessions before shutdown	2026-06-21 07:25:56 -07:00
Brooklyn Nicholson	83aa84ae3b	feat(pets): CLI pet pane + /pet command Render the reactive pet pane in the classic CLI (steady redraw, right-aligned) and wire the /pet command to list and switch pets, plus an enable/disable toggle. Backed by hermes_cli/pets.py and the CLI commands mixin, registered in the central command registry. Covered by the CLI pet pane and toggle tests.	2026-06-20 14:18:33 -05:00
helix4u	c253b07380	fix(model): clear stale endpoint credentials across switches	2026-06-19 19:58:26 -07:00
helix4u	95a3affc2e	fix(model): keep Nous picker from restoring stale custom keys	2026-06-19 19:58:26 -07:00
teknium1	64b21e50fb	fix(cli): publish agent ref to cli module so memory on_session_end fires on exit The god-file Phase 4 refactor (`094aa85c37`) moved agent construction into CLIAgentSetupMixin, which set the atexit shutdown reference with a bare `global _active_agent_ref`. After extraction that global binds the mixin module's namespace, not cli.py's. cli._run_cleanup reads cli._active_agent_ref to decide whether to fire the memory provider's on_session_end hook — and it stayed None for the whole session, so the `if _active_agent_ref:` branch was dead and on_session_end never ran on /exit. Custom memory providers silently lost end-of-session extraction. Fix: publish the reference onto the cli module explicitly (`import cli as _cli; _cli._active_agent_ref = self.agent`), using the deferred-import pattern already established in the mixin. Regression test asserts cli._active_agent_ref is populated by the mixin's publish line and guards against a relapse to the bare `global` form. The existing shutdown tests passed only because they hand-assigned the ref, which is exactly what masked this.	2026-06-19 16:59:43 -07:00
Teknium	c06898098b	fix(cli): clear viewport on width-change resize so the status bar can't duplicate (#49120) The classic CLI status bar could appear twice after a horizontal terminal resize — two bars at two widths with two different elapsed readings. Root cause: prompt_toolkit's Application._on_resize() calls renderer.erase(), which does cursor_up(_cursor_pos.y) + erase_down() using the _cursor_pos.y cached from the LAST render at the OLD width (renderer.py:745). On a column shrink the terminal reflows the already-painted full-width chrome into extra physical rows, so the cached y undershoots: cursor_up doesn't climb past the reflowed rows and erase_down leaves the old bar stranded ABOVE the live origin. The next paint stacks a fresh bar below it. The existing post-resize suppression hides the NEW bar for ~0.35s but never erases the already-reflowed OLD one, so the ghost survives the whole window. Ctrl+L / /redraw clears it, confirming a viewport wipe is the fix. Fix: on a WIDTH change, _recover_after_resize now routes through the same recovery as Ctrl+L — _clear_prompt_toolkit_screen(rebuild_scrollback=False) (CSI 2J, visible viewport only) + _replay_output_history() — BEFORE delegating to prompt_toolkit's resize. Banner-safe: 2J never touches scrollback history (that's CSI 3J, which we don't send here), so the startup banner is preserved. Rows-only resizes skip the clear (no reflow → no ghost) to avoid an extra repaint. Tracks _last_resize_width to distinguish the two. Tests: replace the now-obsolete 'never clears on resize' assertion with two tests — rows-only resize delegates without clearing; width change clears the viewport + replays and never wipes scrollback.	2026-06-19 08:43:42 -07:00
Teknium	1b04e4ede5	fix(cli): status bar no longer stays hidden after resize during idle (#49105) The classic CLI status bar could vanish for the rest of a session: any terminal reflow (SIGWINCH from a tmux pane change, SSH window restore, font zoom) set _status_bar_suppressed_after_resize=True, but the flag was ONLY cleared on the next submitted user input. Resize then sit idle and the bottom chrome rendered at height 0 on every repaint — even with the refresh clock ticking — so the bar was gone until you typed and hit enter. Fix: _recover_after_resize now schedules a debounced unsuppress timer that clears the flag and repaints once the reflow settles (~0.35s), so the bar returns on its own during idle. The next-submit clear stays as a fast path. Fails open: any error in scheduling clears the flag immediately rather than leaving the bar stuck hidden.	2026-06-19 07:53:58 -07:00
Siddharth Balyan	73cd8622f9	feat(billing): /billing terminal billing — interactive TUI + CLI client (#45449 ) * feat(billing): nous_billing http client + BillingState core (phase 2b) Phase 2b terminal-billing client foundation: - hermes_cli/nous_billing.py: typed client for the 4 /api/billing/* endpoints (state/charge/poll/auto-top-up). Raises typed errors (BillingScopeRequired, BillingRateLimited, BillingAuthError) mapped from the live-verified contract; fail-open is the caller's job. Idempotency-Key enforced client-side. - agent/billing_view.py: surface-agnostic BillingState core + Decimal money parsing (server emits decimal strings, not 2dp), fail-open builder, idempotency-key gen, custom-amount validation. - 51 unit tests (decimal parse/format, payload tiering, error->exception matrix, fail-open, amount validation). Plan: docs/plans/2026-06-13-001-phase-2b-terminal-billing-tui-plan.md * feat(billing): billing:manage scope + lazy step-up re-auth (phase 2b) - NOUS_BILLING_MANAGE_SCOPE constant. - nous_token_has_billing_scope(): split-based scope check (no false-positive substring match). - step_up_nous_billing_scope(): re-runs the device flow requesting billing:manage, reusing the held credential's portal/inference URLs + client_id (so a preview stays a preview), persists like _login_nous but WITHOUT the model picker. Returns True iff the minted token carries the scope (False when NAS silently downscopes a non-admin / unticked grant). Lazy step-up (plan D-A): normal login path unchanged; 403 insufficient_scope from a billing call triggers this. 7 unit tests. * feat(billing): billing JSON-RPC methods for the TUI (phase 2b) billing.state / charge / charge_status / auto_reload / step_up in tui_gateway/server.py. 
Return STRUCTURED success envelopes (result.ok + result.error=<code>) rather than JSON-RPC-level errors, so the Ink rpc() promise always resolves and the TUI branches on the typed billing error code (insufficient_scope, rate_limited, no_payment_method, …) to render the right affordance. Money serialized as decimal STRINGS + display strings. charge mints + echoes an idempotency_key for retry reuse. 16 unit tests. * feat(billing): /billing CLI handler + command registry (phase 2b) - CommandDef("billing", subcommands=buy\|auto-reload\|limit), added to _SLACK_VIA_HERMES_ONLY so it routes via /hermes on Slack (keeps the 50-cap parity test green, same as /credits). - cli.py::_show_billing + screen helpers: all 5 screens (overview, buy→confirm→ poll, auto-reload, monthly-limit read-only). Reuses _prompt_text_input_modal / _prompt_text_input (D-C). Non-interactive (_app is None) renders text + portal deep-link, never prompts (R7). Decimal money end-to-end. 2s/5-min cancellable poll loop; 429/503 = retry not failure; settled = ledger truth. Lazy step-up on 403 insufficient_scope. no_payment_method treated as mainline funnel-to-portal. - 6 CLI tests; 156 command tests (incl. Slack/Telegram parity) green. * feat(billing): /billing Ink TUI screens + tests (phase 2b) - ui-tui/src/app/slash/commands/billing.ts: /billing TUI command covering all 5 screens — overview (text), buy <amt> → ConfirmReq → charge → non-blocking 2s/ 5-min poll loop → settled/failed/timeout branches, auto-reload <below> <to> → ConfirmReq → PATCH, limit (read-only). Reuses the existing ConfirmReq overlay (D-C) — no bespoke component. Typed-error envelope branching: insufficient_scope arms the lazy step-up confirm; no_payment_method/rate_limited/cap funnel to portal. Client-side amount validation mirrors the server (bounds + 2dp). - gatewayTypes.ts: Billing* response interfaces. - registry.ts: register billingCommands. 
- billingCommand.test.ts: 12 vitest cases (overview/gating/buy-confirm-poll- settled/no_payment_method/step-up/limit/auto-reload/validation). TUI build green; 12/12 vitest pass; slash tests pass once @hermes/ink is built. * docs(billing): scrub private cross-repo references NAS is a private repo — remove all references to it from the public PR: - drop the cross-repo planning doc (planning scaffolding, not a deliverable; the PR description documents the design) - replace 'NAS' / 'PR #412 preview' mentions in code + test comments with generic 'the server' / 'a preview deployment' * docs(billing): scrub final NAS reference in step-up docstring * docs(billing): drop dangling plan-doc refs The phase-2b plan doc was removed in the cross-repo scrub (300afcc0b) but two module docstrings still pointed at it. Drop the dead refs. * feat(billing): interactive /billing overlay + step-up UX, portal-URL & token fixes Adds the interactive /billing TUI overlay and hardens the terminal-billing client across CLI and TUI. - TUI: full /billing overlay state machine (overview to buy to confirm, auto-reload, read-only monthly limit) reusing the existing confirm overlay. - Step-up: surface the verification link in-transcript and open the browser via the TUI's own opener (the device flow runs in the headless gateway, so a printed URL was being dropped); run the step-up handler off the main loop and emit the link as an out-of-band event so the gateway stays responsive. - Step-up copy is scope-accurate ("Billing permission granted") and re-checks /state so it never claims "enabled" when the org kill-switch is still off. - Portal deep-links resolve to absolute URLs against the active portal base (the server emits them relative) - fixes a bare "/billing?topup=open" link. - Billing calls refresh an expired access token via the stored refresh token instead of reporting a false "not logged in". 
- Optimistic funnel: advise "set up a saved card on the portal" up front when no card is on file (advisory, not a hard gate). - Token resolution is cached briefly so the 2s charge poll loop stops re-locking + re-reading the auth store on every tick; 401 re-resolves fresh. - Remove the temporary demo-mode shims. Validation: 87 Python billing tests, 88 TS tests (billing command + gateway event handler), tsc clean, ink + ui-tui builds green. * docs(billing): add /billing TUI screenshots for PR * fix(cli): guard _last_invalidate on bare instances; update stale prompt-fallback test The UI-invalidate throttle read self._last_invalidate unconditionally, which raised AttributeError on HermesCLI instances built without __init__ (the thread-safety test's object.__new__ shell). Guard the read with getattr. The off-main-thread branch of _prompt_text_input was changed (#23185) to cancel cleanly to None instead of falling back to a bare input() that would hang on the slash-worker thread; the test still asserted the old direct-input fallback. Update it to assert the current intended behavior: returns None, calls neither run_in_terminal nor input(), and does not hang.	2026-06-19 01:53:32 +05:30
H-Ali13381	2abcae9678	fix(cli): preserve renderer state on resize	2026-06-13 05:40:18 -07:00
Teknium	4474873d2c	feat(cli): persist resolved approval/clarify prompts in scrollback (#44702) Modal prompt panels (dangerous-command approval, clarify questions) live in the prompt_toolkit layout and vanish on the next repaint, leaving no trace of the question or the decision in chat history. Emit a dim one-line summary after each prompt resolves: ⚠ Approval: <command> → allowed for session ? Clarify: <question> → <answer> Gated on display.persist_prompts (default true). Detail and outcome are whitespace-collapsed and capped at 120 chars.	2026-06-12 01:14:35 -07:00
墨綠BG	81cdbbddc8	🐛 fix(cli): wrap approval preview hints	2026-06-11 23:05:08 -07:00
墨綠BG	d6df38bb6b	🐛 fix(cli): wrap long approval commands in prompt	2026-06-11 23:05:08 -07:00
Teknium	8972a151a4	feat(cli,tui): show time since last final agent response on the status bar (#44265) Adds an idle clock to the context/status bar in both the prompt_toolkit CLI and the Ink TUI: once a turn completes, a dim '✓ <elapsed>' segment shows how long the session has been idle since the last final agent response. Hidden while a turn is live (the per-prompt elapsed timer covers that) and before the first turn completes. - cli.py: track _last_turn_finished_at when the agent thread exits, surface it via _format_idle_since() in the snapshot, render in both the wide fragments path and the plain-text fallback. - ui-tui: stamp lastTurnEndedAt when busy flips false after a live turn, thread it through appStatus -> StatusRule, render via a ticking IdleSince segment sharing the duration breakpoint/width budget.	2026-06-11 06:06:19 -07:00
mnajafian-nv	f8fd30942c	fix(cli): prevent duplicate one-shot finalize on interrupted cleanup (#43320) Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 22:41:04 -07:00
mnajafian-nv	d03cdd63eb	fix(cli): run one-shot query cleanup before lease release (#43036) * fix(cli): run one-shot query cleanup before lease release Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> * test(cli): cover quiet one-shot cleanup finalization Signed-off-by: mnajafian-nv <mnajafian@nvidia.com> --------- Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>	2026-06-09 21:52:13 -07:00
teknium1	b5f8996ccc	test(cli): exercise real _prompt_text_input for native-Windows confirm deadlock The existing #33961 tests mock _prompt_text_input away, so they only assert modal-vs-stdin routing — they cannot observe the actual hang. Add a guard class that drives the real helper chain with a blocking input() on a win32 daemon thread and asserts the worker never hangs. Fails on the pre-#33961 code (win32 -> _prompt_text_input -> off-main input() -> deadlock), passes on the modal path. Also covers the scheduling-failure degraded branch (must clean-cancel to None, never call input()).	2026-06-08 15:53:28 -07:00
firefly	714183530b	test(cli): convert stale win32 stdin-fallback tests to the modal contract The four win32 tests asserted the old deadlocking behavior (win32 -> raw input()). Rewrite them to the corrected contract: native Windows uses the modal via the app loop, and stdin is kept only for the safe no-app / scheduling-failure cases. Consolidate three near-identical daemon-thread tests into one parametrized (linux/win32) test behind a shared _run_on_daemon harness, and drop dead code from the old main-thread test. Refs #33961	2026-06-08 15:53:28 -07:00
firefly	d66bac5a1a	test(cli): failing regression test for native-Windows confirm deadlock (#33961)	2026-06-08 15:53:28 -07:00
teknium1	0904bc7ea2	refactor(cli): extract 32 slash-command handlers into CLICommandsMixin (god-file Phase 4) Lift the `_handle_*_command` cluster (2,077 LOC) out of HermesCLI into hermes_cli/cli_commands_mixin.py; HermesCLI now inherits CLICommandsMixin so every self.<handler> call resolves unchanged via the MRO. Behavior-neutral. Import discipline mirrors gateway/slash_commands.py (PR #41886): neutral deps imported at the mixin module top level; cli.py-internal helpers/constants (_cprint, _ACCENT, save_config_value, ...) imported lazily inside each handler via 'from cli import ...' so the mixin never imports cli at module scope. cli.py 16215 -> 14139 LOC. One test mock repointed (cli.is_browser_debug_ready -> hermes_cli.cli_commands_mixin.is_browser_debug_ready).	2026-06-08 02:13:07 -07:00
kshitijk4poor	8e71b5136b	fix(cli): paint approval/clarify/sudo/secret modal prompts directly, not via the throttle (#41098 ) In classic CLI mode the dangerous-command approval prompt (and the clarify, sudo, and secret-capture prompts) could fail to render: the user saw '⏱ Timeout — denying command' after 60s without ever seeing the panel, making approvals.mode: manual unusable. Root cause. These prompts run their wait loop on the agent/background thread: they set modal state that a ConditionalContainer's filter reads, then call self._invalidate() to repaint so the panel appears. _invalidate() is a THROTTLED wrapper built for high-frequency background repaints (spinner frames, streaming) — it (a) returns early while a SIGWINCH resize-recovery is pending, and (b) otherwise only repaints if 250ms elapsed since the last paint. Under either condition the modal's entry paint is silently dropped, the ConditionalContainer never re-evaluates, and the prompt times out unseen. The throttle never belonged on these paths. Originally the callbacks painted with a direct self._app.invalidate() and worked; a throttle PR blanket-replaced every invalidate (including these rare, one-shot, user-blocking modal paints) with the throttled _invalidate(); a later commit removed an idle 1Hz repaint that had been masking dropped modal paints, surfacing the bug. Notably the modal KEY-BINDING handlers (↑/↓/Enter) already paint with a direct event.app.invalidate(), never the throttle — the background-thread callbacks were the inconsistent ones. Fix. Add a small _paint_now() helper that paints directly (guarded for a missing _app, exception-safe) and route the four modal paths' entry, response, countdown, and teardown paints through it — matching the key-handler idiom. This covers approval, clarify, sudo, and the secret-capture teardown (_submit_secret_response, which previously used the throttled _invalidate() so its panel could linger after submit). 
_invalidate() is left untouched and its docstring now states it is for high-frequency background repaints only; modal/interactive paints must use _paint_now()/_app.invalidate() directly. This also fixes the resize-recovery edge case for free (a direct paint never consults the resize guard) without a throttle-bypass flag that could be cargo-culted onto hot paths. Countdown refresh cadence tightened 5s->1s so the timer stays visible while waiting, and a copy-pasted duplicate countdown block in _clarify_callback is removed. Tests: TestModalPaintNow drives all three wait-loop callbacks on a background thread with BOTH gates active (_resize_recovery_pending=True + a recent _last_invalidate in the throttle window) and asserts the panel paints on entry AND repaints on teardown; plus a secret-teardown test, a direct _paint_now-vs-_invalidate gate test, and a no-_app safety test. Each modal test fails if its paint is reverted to _invalidate(). 17 in-file tests pass; full tests/cli suite green (900). Diagnosis credit: the throttle-drop root cause was identified by @sanidhyasin in #41116; @islam666 independently reached the same direct-invalidate approach in #41166; original report #41098 by @jodonnel.	2026-06-08 00:46:43 +05:30
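
The throttled-versus-direct paint split described in 8e71b5136b can be sketched roughly as follows. This is a minimal illustration, not the code in cli.py: RepaintGate, its attribute names, and the injected clock are hypothetical, and only the two gates (a pending resize recovery and the 250ms window) are taken from the commit message.

```python
import time
from typing import Callable, Optional


class RepaintGate:
    """Hypothetical stand-in for the CLI's repaint plumbing (names are assumptions)."""

    def __init__(
        self,
        repaint: Optional[Callable[[], None]],
        min_interval: float = 0.25,  # the 250ms throttle window from the commit message
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.repaint = repaint  # None models a missing app
        self.min_interval = min_interval
        self.clock = clock
        self.resize_recovery_pending = False
        self.last_invalidate: Optional[float] = None

    def invalidate(self) -> bool:
        """Throttled repaint for high-frequency callers; may silently drop the paint."""
        if self.repaint is None or self.resize_recovery_pending:
            return False
        now = self.clock()
        if self.last_invalidate is not None and now - self.last_invalidate < self.min_interval:
            return False
        self.last_invalidate = now
        self.repaint()
        return True

    def paint_now(self) -> bool:
        """Direct repaint for rare, user-blocking modal paints; consults neither gate."""
        if self.repaint is None:
            return False
        try:
            self.repaint()
        except Exception:
            return False
        return True


painted = []
gate = RepaintGate(lambda: painted.append("frame"), clock=lambda: 100.0)
gate.resize_recovery_pending = True
dropped = gate.invalidate()  # the throttled path refuses while recovery is pending
shown = gate.paint_now()     # the direct path paints anyway
```

With both gates active the throttled call returns False and paints nothing, which is the dropped modal entry paint the commit describes; the direct call still paints.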

233 commits