hermes-agent

Author	SHA1	Message	Date
Teknium	6eb39c2bbe	fix(opencode-go): heal stripped /v1 base_url so non-minimax models stop 404ing (#57585 ) OpenCode Go serves minimax/qwen via Anthropic Messages (base URL without /v1 — the SDK appends /v1/messages) and glm/kimi/deepseek/mimo via OpenAI chat completions (base URL WITH /v1). The runtime stripped /v1 for anthropic-routed models, and the TUI/desktop + gateway persisted that stripped URL to model.base_url. Every later chat_completions model then POSTed to https://opencode.ai/zen/go/chat/completions — a 404 (the marketing site). Result: only minimax worked; glm/deepseek/kimi all 404ed. - New normalize_opencode_base_url(): symmetric /v1 normalization — strip for anthropic_messages, re-append for chat_completions / codex_responses on opencode.ai hosts (heals persisted stripped URLs; custom proxy overrides untouched) - Applied at all three former one-way strip sites (resolve_runtime_provider x2, switch_model) - opencode_model_api_mode: all Qwen models on Go AND Zen now route via /v1/messages per current published endpoint tables (previously only qwen3.7-max on Go — qwen3.6-plus etc. would 404 the same way) - Catalog refresh: Go gains deepseek-v4-pro/flash, glm-5.2, kimi-k2.7-code, minimax-m3, qwen3.7-plus; Zen gains glm-5.2, kimi-k2.7-code, minimax-m3, qwen3.7-plus Reported by IndieSuperhuman on X: opencode-go 404s for any model other than minimax.	2026-07-03 00:46:45 -07:00
kshitijk4poor	ed4123792c	refactor(providers): dedupe extra_headers normalizer + key picker groups by headers Follow-up to @helix4u's #57336 salvage. Two review findings: - W1: model-picker grouped custom-provider rows by (api_url, credential, api_mode) but NOT extra_headers. Entries sharing a URL+credential+api_mode yet declaring different headers (e.g. per-tenant routing behind one proxy) collapsed into one row and probed /models with whichever header set was seen first (order-dependent). Fold a canonical header identity into group_key so distinct header-authed endpoints stay separate; drops the now-dead first-non-empty merge branch. - W2: the extra_headers stringify+None-filter comprehension existed in 5 copies (config.py x2, runtime_provider.py, model_switch.py, models.py). Extract one shared hermes_cli.config.normalize_extra_headers primitive; all sites now call it. Tests: +normalize_extra_headers unit tests, +regression test proving two same-endpoint entries with different headers stay distinct and each probes with its own headers. 223 targeted tests pass; ruff clean.	2026-07-03 04:23:15 +05:30
helix4u	ab40e952f3	fix(providers): pass extra headers to model discovery	2026-07-03 04:23:15 +05:30
kchantharuan	048270fa06	fix: refresh NVIDIA featured models	2026-07-03 03:00:30 +05:30
Teknium	76a468e513	feat(models): add claude-fable-5, claude-sonnet-5, fugu-ultra to curated OpenRouter + Nous lists (#56617 ) - claude-fable-5 placed above claude-opus-4.8 in both curated lists - claude-sonnet-5 replaces claude-sonnet-4.6 - sakana/fugu-ultra added near the bottom (before routers/free tier) - regenerated website/static/api/model-catalog.json via scripts/build_model_catalog.py (live-pulled by CLI, published on merge — no release needed)	2026-07-01 13:21:42 -07:00
Steve Lawton	c73e74386b	feat(vertex): add Google Vertex AI provider for Gemini (OAuth2) Adds Vertex AI as a first-class provider for Gemini models via Vertex's OpenAI-compatible endpoint. Vertex authenticates with short-lived OAuth2 access tokens (service-account JSON or ADC), not a static API key — the missing piece behind the recurring requests (#13484, #12639, #56259). - agent/vertex_adapter.py: OAuth2 token minting + refresh-on-expiry (5-min margin), ADC->service-account fallback, global vs regional endpoint URLs. Config precedence: env var > config.yaml > default. - plugins/model-providers/vertex/: provider profile (auth_type=vertex), reuses Gemini's extra_body.google.thinking_config translation. - runtime_provider: vertex short-circuit BEFORE the credential pool so a credentials-file path is never mistaken for a static API key; mints a fresh token + computes base_url per resolve. - run_agent + conversation_loop: _try_refresh_vertex_client_credentials() re-mints the token and rebuilds the client on a mid-session 401, so a long-lived gateway agent survives token expiry (~1h). - auxiliary_client: vertex auth_type branch for side-LLM tasks. - config.yaml: vertex.project_id / vertex.region (non-secret, bridged to env); credential path stays in .env (VERTEX_CREDENTIALS_PATH). - setup wizard + model picker: dedicated _model_flow_vertex; curated google/gemini-* model list; --provider choices. - pricing/metadata: Vertex prices off the gemini docs snapshot; endpoint host auto-maps to the vertex provider (no probe spam). - lazy_deps + pyproject [vertex] extra: google-auth, opt-in only. - docs: guides/google-vertex.md + providers page; tests for adapter + runtime resolution. Salvages and modernizes #8427 by @slawt onto current main: rewired from the legacy PROVIDER_REGISTRY path to the provider-profile architecture, moved non-secret config out of .env into config.yaml, and added the per-turn 401 token-refresh the original lacked.	2026-07-01 05:25:33 -07:00
lkevincc	163562bf88	fix: normalize lmstudio base urls	2026-06-28 20:46:44 -07:00
teknium1	f54c52800a	fix(models): scope live-first picker merge to opencode aggregators only Follow-up to the salvaged #49129 commit. The original change flipped the shared generic-provider merge in provider_model_ids() to live-first unconditionally, which regressed curated-first for single providers (kimi/zai, #46309) — and the PR encoded that regression by flipping the kimi-coding and zai test assertions to expect live-first. Gate live-first on an explicit _LIVE_FIRST_PICKER_PROVIDERS set ({opencode-zen, opencode-go}); every other provider keeps curated-first. Also widen the uncapped picker + live-first sets to opencode-go, which has the same 70+ model catalog problem as opencode-zen. Restore the kimi-coding curated-first test and rewrite the merge-order test to assert the per-provider contract.	2026-06-27 21:23:25 -07:00
Afnath Ahamed	f98ffbc246	fix(models): live-first merge + update opencode-zen catalog + uncap aggregator picker	2026-06-27 21:23:25 -07:00
Teknium	c6575df927	feat(moa): expose MoA presets as selectable virtual models (#46081 ) * feat(moa): expose MoA presets as selectable virtual models Reconstructed onto current main (PR #46081's base had diverged with no common ancestor, marking the PR dirty so CI never dispatched). MoA is now a virtual provider: each named preset is a selectable model under provider 'moa', and the preset's aggregator is the acting model that answers and calls tools. Reference models fan out in parallel via a bounded ThreadPoolExecutor (the same batch pattern delegate_task uses) — all references dispatched at once, collected when every one finishes, then handed to the aggregator. Output order is preserved, failures and the MoA-recursion guard stay isolated per reference. - Removed the old mixture_of_agents model tool and moa toolset. - Added moa as a virtual provider in the provider/model inventory. - /moa is shortcut behavior over model selection (default preset / named preset / one-shot prompt). - Dashboard + Desktop manage named presets; presets appear in model pickers. - Parallel reference fan-out in agent/moa_loop.py with regression test. * fix(moa): thread moa_config through _run_agent to _run_agent_inner The reconstructed gateway MoA wiring declared moa_config on _run_agent (the profile-scoping wrapper) and used it inside _run_agent_inner, but the wrapper never forwarded it — _run_agent_inner had no such parameter, so the runtime hit NameError: name 'moa_config' is not defined on the compression-failure session sync path. Add moa_config to _run_agent_inner's signature and forward it from both wrapper call sites (multiplex and non-multiplex). Caught by tests/gateway/test_compression_failure_session_sync.py on CI shard test(4). * fix(moa): classify moa as a virtual provider in the catalog The moa virtual provider has no PROVIDER_REGISTRY/ProviderProfile entry, so provider_catalog() fell through to the default auth_type="api_key" with no env vars — tripping two catalog invariants: - test_provider_catalog: api_key providers must expose a credential env var - test_provider_parity: every hermes-model provider must be desktop-configurable moa already declares auth_type="virtual" in HERMES_OVERLAYS; consult that overlay as an auth_type fallback so the catalog reports moa as virtual (no real credential, no network endpoint). Exempt virtual providers from the desktop parity union check the same way 'custom' is exempt — derived from the catalog, not a hardcoded slug, so future virtual providers are covered too.	2026-06-25 13:52:06 -07:00
Elshayib	1a435a6d5d	fix(model-switch): prevent custom-provider misattribution in model picker (#48305 ) When the current provider is a custom endpoint (custom or custom:), the model switch pipeline must NOT auto-switch to a native provider/OpenRouter based on a static-catalog match. The user explicitly configured their own endpoint and the same model name may be served there; silently rewriting model.provider destroys their config. - detect_static_provider_for_model(): skip the static-catalog scan when the current provider is custom/custom: - switch_model() Step e: extend is_custom to cover custom:* so the detect_provider_for_model() last-resort fallback cannot fire Salvaged from #48351 by Elshayib (authorship preserved). Fixes #48305	2026-06-24 19:34:33 +05:30
kshitijk4poor	1a174dfb50	fix(models): gate openai-codex/xai-oauth soft-accept to family-shaped slugs (#45006 ) Completes the #45006 fix. PR-base commit (configured-provider routing) handles the case where a typed model IS declared in user/custom provider config. This commit closes the other root: when a typed model is NOT in any config and the current provider is a soft-accepting one (openai-codex / xai-oauth), the hidden-model soft-accept (#16172 / #19729) would accept ANY unknown name as a hidden model — so `qwen3.5-4b` typed on a Codex-default session "succeeded" and mislabeled the provider as "OpenAI Codex" (the exact reported symptom), then 400'd on the next turn. Gate the soft-accept to slugs that plausibly belong to the provider's family (openai-codex -> gpt-/codex-/o1/o3/o4; xai-oauth -> grok-). Family-shaped unknown slugs are still soft-accepted (preserving the #16172 entitlement-gated hidden-model intent); unrelated names are rejected with actionable guidance to pin the right provider via `--provider <slug>` or the picker. Adds TestCodexSoftAcceptPlausibilityGate (5 tests): unrelated names rejected on codex/xai, family-shaped hidden slugs still accepted, real catalog models unaffected. Verified load-bearing.	2026-06-24 19:23:53 +05:30
Teknium	7130d60861	feat(providers): remove google-gemini-cli + google-antigravity OAuth providers (#50492 ) * feat(providers): remove google-gemini-cli + google-antigravity OAuth providers Google now actively bans accounts for third-party tools that piggyback on Gemini CLI / Antigravity / Code Assist OAuth, and because abuse prevention sits at a backend layer the ban can extend to the entire Google account (Gmail/Drive), with a second violation being permanent. Ref: https://github.com/google-gemini/gemini-cli/discussions/20632 Removes both OAuth inference providers entirely (modules, provider profiles, auth/runtime/config/models wiring, the /gquota Code Assist quota command, the antigravity-cli optional skill, desktop + docs surface in en + zh-Hans). The API-key 'gemini' provider (GOOGLE_API_KEY/GEMINI_API_KEY against generativelanguage.googleapis.com) is unaffected and stays fully supported. * fix(skills): keep the antigravity-cli skill — only the OAuth provider is removed The antigravity-cli optional skill orchestrates the external `agy` binary as a coding-agent tool via the terminal tool — it does NOT wrap Hermes inference through the banned google-antigravity OAuth provider, so it carries none of the account-ban risk that motivated removing that provider. Restore the skill, its docs page, the sidebar entry, and the optional-skills catalog row. The google-antigravity / google-gemini-cli inference providers stay fully removed.	2026-06-21 19:53:27 -07:00
Teknium	c768c4b71c	fix(antigravity): move model flow to model_setup_flows + stop bare-alias hijack CI on the salvage caught two issues the stale PR base masked: 1. The model-setup flows were extracted from main.py into hermes_cli/model_setup_flows.py after @pmos69 forked. The cherry-pick re-introduced a stale _model_flow_custom into main.py (duplicating the one main.py now imports) and put _model_flow_google_antigravity there too. Move the antigravity flow into model_setup_flows.py alongside its siblings and drop the stale _model_flow_custom dup. Fixes the getpass/stdin OSError in tests/cli/test_cli_provider_resolution.py. 2. google-antigravity re-exposes Claude/Gemini/GPT-OSS models, so its catalog was hijacking bare short aliases (`sonnet` -> google-antigravity instead of anthropic) in detect_static_provider_for_model via dict insertion order. Add _BORROWED_MODEL_PROVIDERS and defer those providers to a last-resort pass so a model's native vendor always wins alias/direct-catalog detection. Fixes tests/hermes_cli/test_models.py::test_short_alias_resolves_to_static_model.	2026-06-21 16:41:30 -07:00
pmos69	8baa4e9976	feat(cli): add native Antigravity OAuth provider	2026-06-21 16:41:30 -07:00
definitelynotguru	eaddeaf2e6	feat(xai): add grok-composer-2.5-fast to xAI OAuth model picker The model is callable via xAI OAuth but omitted from models.dev and /v1/models listings. Merge it into the curated xAI catalog so it appears in `hermes model` without requiring a custom model name.	2026-06-17 09:49:46 -07:00
liuhao1024	1b962f001e	fix(models): pass model.base_url to fetch_models in /model picker The /model interactive picker resolved a base_url from user credentials but never passed it to ProviderProfile.fetch_models(), causing the picker to always query the provider's hardcoded default endpoint instead of the user's custom URL (e.g. a company litellm proxy). - providers/base.py: add optional base_url parameter to fetch_models() - hermes_cli/models.py: pass resolved base_url to fetch_models() - Update all subclass overrides for signature compatibility - Add 6 regression tests covering override, fallback, and integration	2026-06-16 13:09:40 -07:00
Jaaneek	bbc842d31e	feat(xai): default to grok-build-0.1 Switch the default model for the xAI/Grok provider and the xAI web search backend from grok-4.3 to grok-build-0.1. grok-build-0.1 is already recognized by the model metadata, so no new model definition is required; grok-4.3 remains selectable.	2026-06-16 11:50:17 -07:00
kshitijk4poor	b2da39a0f3	feat: add z-ai/glm-5.2 to OpenRouter and Nous model lists Z.ai released GLM 5.2 on 2026-06-15, available on OpenRouter: - https://openrouter.ai/z-ai/glm-5.2 GLM-5.2 is Z.ai's flagship for long-horizon tasks, shipping a 1M-token context window (up from 200K on GLM 5.1) and tool calling. Per the OpenRouter API: text-only, context_length 1048576, tools supported. No separate -fast variant exists. The 1M context length, native zai picker entry, setup wizard, and Z.ai coding-plan auth entries for glm-5.2 already landed on main. This fills the remaining gap: the two aggregator surfaces where glm-5.1 appears but glm-5.2 did not. Changes: hermes_cli/models.py - Add z-ai/glm-5.2 to the OpenRouter fallback snapshot (OPENROUTER_MODELS) and the Nous Portal curated list (_PROVIDER_MODELS["nous"]), newest flagship first. Live catalogs surface it automatically when reachable; the fallback lists matter when the manifest fetch fails. website/static/api/model-catalog.json - Regenerated via scripts/build_model_catalog.py (not hand-edited) so the manifest stays in sync with the source lists; guarded by tests/hermes_cli/test_model_catalog.py.	2026-06-16 23:35:45 +05:30
kshitijk4poor	658ac1d866	fix(models): keep curated-first ordering in live+curated merge; use pure-catalog helper in validation The generic live+curated merge (commit `630b438`) seeded the merged list from live results, demoting curated-only models below live ones. That regressed #46309, which deliberately surfaces the newest curated model (kimi-k2.7-code) FIRST in the native picker even when the live /models listing lags. Restore curated-first ordering: curated entries lead (in catalog order), live-only entries are appended for discovery. This keeps the #46850 fix (zai glm-5.2 now appears) without the kimi regression. Also switch the validate_requested_model curated fallback (commit `ee7b8a4`) from provider_model_ids() — which triggers a second, uncached live /models fetch with its own 8s timeout and may resolve different credentials than the api_key/base_url just probed — to the pure-catalog helper _model_in_provider_catalog(). Membership is checked against the shipped catalog only, with no extra network call. Tests: restore the curated-first assertion in test_kimi_coding_live_catalog_does_not_hide_curated_k2_7_code; update the new merge tests to curated-first semantics; de-circularize the validation fallback tests to patch _PROVIDER_MODELS (the real source) instead of mocking the function under test.	2026-06-16 23:25:07 +05:30
liuhao1024	ee7b8a4672	fix(models): validate_requested_model falls back to curated catalog when live API omits model When live /v1/models responds but omits a model that exists in the curated static catalog, validate_requested_model now accepts it with a note instead of rejecting. This covers the /model slash-command path (the picker path was already fixed in the parent commit). Addresses review feedback from potatogim on #46857.	2026-06-16 16:24:11 +08:00
liuhao1024	630b43892d	fix(models): merge live API results with curated static catalog in generic provider path When a provider's live /v1/models endpoint returns a stale or incomplete list (e.g. Z.AI missing glm-5.2), the generic profile-based code path returned only the live results, silently dropping curated models. Generalize the kimi-coding merge pattern to all providers: live entries come first (provider's preferred order), then curated-only entries are appended with case-insensitive dedup. This ensures models that the live endpoint omits still appear in /model picker. Fixes #46850	2026-06-16 16:21:01 +08:00
Teknium	2a14e8957d	fix(kimi): surface K2.7 Code in native picker (#46309 )	2026-06-14 14:01:03 -07:00
mr-r0b0t	bff78a34dc	feat(zai): add GLM-5.2 with verified 1M context window GLM-5.2 ships with a 1M (1,048,576) token context window. Without this entry, Hermes falls through to the generic 'glm' key (202,752 tokens), under-reporting the context bar and prematurely compressing conversations. The 1M limit was verified empirically via needle-in-a-haystack retrieval at 789,240 prompt tokens on api.z.ai/api/coding/paas/v4 — zero errors, zero truncation, correct retrieval at every tested size (25K through 789K). Changes: - agent/model_metadata.py: add 'glm-5.2': 1_048_576 before 'glm' fallback - hermes_cli/models.py: add glm-5.2 to zai curated models - hermes_cli/setup.py: add glm-5.2 to setup wizard zai list - hermes_cli/auth.py: add glm-5.2 to coding plan endpoint probes - plugins/model-providers/zai/__init__.py: add glm-5.2 to fallback_models - tests/agent/test_model_metadata.py: context resolution + vendor-prefix tests	2026-06-14 13:50:36 -07:00
helix4u	85e6232a07	fix(providers): support anthropic proxy v1 endpoints	2026-06-14 02:09:16 -07:00
Teknium	bc060c7c1c	fix(models): remove unavailable claude-fable-5 (#45492 )	2026-06-13 02:03:50 -07:00
Teknium	9688c1a94f	chore: add Kimi K2.7 code catalog slug (#45283 )	2026-06-12 16:55:40 -07:00
Teknium	57c6714995	fix(models): keep curated Anthropic aliases in /model picker (#43103 ) The Anthropic picker returned the live /v1/models dump verbatim whenever credentials were configured. Anthropic's API lags newly-routed curated aliases (e.g. claude-fable-5, reachable on Anthropic before the models endpoint enumerates it), so the curated entry vanished from the picker. Merge curated _PROVIDER_MODELS["anthropic"] with the live catalog — curated first, live-only appended, deduped — mirroring the OpenAI curated-merge path. Live failure / no creds falls back to curated verbatim.	2026-06-09 14:45:19 -07:00
emozilla	d7886da08c	add Fable 5 to model list for Anthropic provider	2026-06-09 15:33:42 -04:00
Teknium	ff9c110d5a	feat(models): add anthropic/claude-fable-5 to openrouter + nous curated lists (#42979 ) Adds the model above claude-opus-4.8 in both the OpenROUTER_MODELS and _PROVIDER_MODELS['nous'] curated picker lists used by /model and `hermes model`. Regenerated website/static/api/model-catalog.json to match.	2026-06-09 10:20:37 -07:00
Teknium	e687292eb4	feat(models): persist Nous recommended-models to disk; fall back on Portal failure (#42628 ) The Portal's /api/nous/recommended-models endpoint is the source of truth for which models are free/paid right now, but its result was cached in-process only. When the live fetch failed (network, parse, non-2xx), the function returned {} and the model picker silently dropped the free/paid recommendations — free models would vanish with no indication anything went wrong. Add a per-base disk cache at $HERMES_HOME/cache/nous_recommended_cache.json: a successful live fetch is persisted as last-known-good, and a failed fetch with an empty in-process cache falls back to the disk copy instead of {}. Self-heals on the next successful fetch. With no disk copy, still degrades to {} (callers already handle that). Keyed by portal base URL so staging/prod don't collide. E2E: live fetch writes disk; simulated Portal failure returns the cached free models from disk; no-disk + failure returns {}.	2026-06-09 00:03:43 -07:00
Teknium	c4066091ca	feat(models): add laguna-m.1 + nemotron-3-ultra to curated OpenRouter list (#42629 ) Two new free-tier slugs surfaced in /model and `hermes model`. owl-alpha was already present. Regenerated website/static/api/model-catalog.json to keep the manifest sync test green.	2026-06-08 23:05:35 -07:00
teknium1	3da44dbda7	fix(models): use deepseek-v4-flash as Nous silent default Follow-up on the salvaged fix: point the Nous silent-default override at deepseek/deepseek-v4-flash (a cheap chat model) instead of the nvidia nemotron entry. Keeps the no-model-configured fallback off the priciest flagship while landing on a low-cost, broadly-capable default.	2026-06-05 02:54:34 -07:00
xxxigm	2a82519b0d	fix(models): don't silently default Nous to the most expensive flagship When a provider is configured but no model is selected (e.g. a profile sets provider: nous with no model), the gateway/CLI fall back to get_default_model_for_provider(), which returned the first curated catalog entry. The Nous Portal list is ordered most-capable-first, so entry [0] is anthropic/claude-opus-4.8 — the single most expensive model ($5/$25 per Mtok). A misconfigured profile therefore silently routed every call to the flagship and billed it for traffic the user never opted into. Pin the silent (non-interactive) default for metered aggregators to the cheapest curated tier via _PROVIDER_SILENT_DEFAULT_OVERRIDES so a missing model can never auto-escalate to the flagship. The interactive default (GUI onboarding / `hermes model`) keeps using the richer free/paid-tier-aware resolver. Fixes the unexpected anthropic/claude-opus-4.8 charges reported for a free-tier Nous account whose new profile had no default model.	2026-06-05 02:54:34 -07:00
Teknium	fd87c61078	feat(models): add qwen/qwen3.7-plus to nous + openrouter catalogs (#39409 ) Adds qwen/qwen3.7-plus directly under qwen/qwen3.7-max in both the OpenRouter curated catalog (OPENROUTER_MODELS) and the Nous portal catalog (_PROVIDER_MODELS['nous']), then regenerates the docs-hosted model-catalog.json manifest from those source lists.	2026-06-04 17:29:45 -07:00
rob-maron	54cae7d1cb	switch model order	2026-06-04 17:29:31 -07:00
Brooklyn Nicholson	ea4fe15631	feat(desktop): inline model picker in the status bar Replace the status-bar model chip's modal with a Cursor-style dropdown: - providers grouped by name in a stable order (no recency reshuffle on select) - per-model hover-Edit submenu for reasoning effort + fast, gated by per-model capabilities now surfaced in the model.options payload - unified Fast toggle: flips the speed=fast param where supported, else swaps to the model's `-fast` variant (base and variant collapse into one row) - localStorage-backed "Edit Models" dialog to choose which models appear Adds reusable dropdown primitives (DropdownMenuSearch, shared row/label tokens, portaled + collision-aware submenus) and reads session state from nanostores rather than prop-drilling, so editing options doesn't rebuild and close the menu.	2026-06-02 19:09:41 -05:00
Teknium	79bfddd37c	fix(models): restore gemini-3-flash-preview to Gemini OAuth picker (#37606 ) #37046 swapped gemini-3-flash-preview -> gemini-3.5-flash in the google-gemini-cli (OAuth/Code Assist) picker on the premise that the preview slug was renamed. It wasn't. Per gemini-cli's models.ts, Code Assist serves two distinct flash slugs with different access gates: gemini-3-flash-preview (PREVIEW_GEMINI_FLASH_MODEL — what subscription/ free-tier OAuth users reach) and gemini-3.5-flash (DEFAULT_GEMINI_3_5_FLASH_MODEL — GA-channel-gated). The model string is passed verbatim into the {project, model, ...} envelope sent to cloudcode-pa.googleapis.com, so non-GA users got a hard error on every prompt because gemini-3.5-flash 404s for them. Offer both slugs in the OAuth picker (matching gemini-cli's own /model list) so non-GA users can select the preview flash that works. The gemini (API-key), OpenRouter, and Nous lists are untouched — google/gemini-3.5-flash is a real live model on those surfaces.	2026-06-02 12:49:19 -07:00
Teknium	afea650e16	fix(model-picker): OpenAI shows curated models; OpenRouter no longer phantom-shows (#37404 ) The model picker now matches `hermes model` for OpenAI, and OpenRouter stops appearing as authenticated when only OPENAI_API_KEY is set. - models.py: provider_model_ids() for the default api.openai.com endpoint intersects the live /v1/models dump (120+ entries incl. embeddings, whisper, tts, dall-e, moderation, legacy chat) with the curated agentic list, preserving curated order. Custom OpenAI-compatible endpoints keep the live list verbatim so discovery still works. - providers.py: drop extra_env_vars=("OPENAI_API_KEY",) from the openrouter overlay. list_authenticated_providers reads extra_env_vars to decide whether a provider is authenticated, so any OpenAI user saw a phantom OpenRouter row. Runtime OpenRouter credential resolution still falls back to OPENAI_API_KEY (runtime_provider.py), independent of the overlay. - Regression tests for both paths.	2026-06-02 06:31:37 -07:00
Teknium	e946f49ab5	fix(models): add gemini-3.5-flash to Gemini OAuth + API-key pickers (#37046 ) * fix(file_tools): block agent writes to ~/.hermes/config.yaml to prevent silent approval bypass * fix(approval): pair terminal-side gate for ~/.hermes/config.yaml writes Subway2023's #14639 blocks write_file/patch to ~/.hermes/config.yaml, but the terminal side was only partially paired: echo>/tee/cp/mv to config.yaml already tripped the project-config pattern, while `sed -i` and direct edits slipped through with auto-approve. An unpaired write_file deny is theater per SECURITY.md — the agent could flip approvals.mode=off via `sed -i` and the mtime-keyed config cache reloads it mid-session. config.yaml IS the security policy (approvals.mode/yolo/permanent allowlist live there), so it warrants real pairing, not a half-door. Add a _HERMES_CONFIG_PATH fragment mirroring _HERMES_ENV_PATH, fold it into _SENSITIVE_WRITE_TARGET (covers tee/>/>>/cp/mv), and add sed -i coverage for both config.yaml and .env. Pins 9 regression tests including no-regression guards (reads pass, /tmp writes pass). Co-authored-by: sbw2025 <subw3@mail2.sysu.edu.cn> * chore(release): map Subway2023 for PR #14639 salvage * fix(models): add gemini-3.5-flash to Gemini OAuth + API-key pickers #34581 swapped gemini-3-flash-preview -> gemini-3.5-flash in the OpenRouter and Nous lists but missed the curated Gemini catalogs, so the Google OAuth (google-gemini-cli) picker still offered the retired gemini-3-flash-preview slug and gemini-3.5-flash was unselectable. Per Google's docs gemini-3-flash-preview was renamed to gemini-3.5-flash and is served via Cloud Code Assist, so this completes the rename for: - google-gemini-cli (OAuth/Code Assist) picker - gemini (API-key) picker - gemini provider default_aux_model copilot keeps gemini-3-flash-preview (separate backend, own slug). --------- Co-authored-by: sbw2025 <subw3@mail2.sysu.edu.cn>	2026-06-01 16:31:13 -07:00
Teknium	e3b3d4d75e	feat(models): add MiniMax-M3 to native minimax providers + 1M context (#36214 ) Add MiniMax-M3 to the minimax, minimax-oauth, and minimax-cn curated lists (these are hardcoded — the native Anthropic-format endpoint has no /v1/models listing and the providers aren't in _MODELS_DEV_PREFERRED, so new models don't auto-pull). Add a DEFAULT_CONTEXT_LENGTHS key 'minimax-m3' -> 1,000,000 so M3 resolves to its 1M context on every surface (native ID + OpenRouter/Nous slug) via longest-key-first substring match, while the M2.x series stays at 204,800.	2026-05-31 20:18:05 -07:00
Teknium	a8526a4159	chore(models): bump minimax to minimax-m3 in openrouter + nous lists (#36191 ) Replace minimax/minimax-m2.7 with minimax/minimax-m3 in the OpenRouter fallback snapshot and the Nous portal model list.	2026-05-31 19:24:17 -07:00
kshitijk4poor	84d82453ae	feat(model-picker): show short description on grouped provider rows The 7 consolidated provider families (OpenAI, xAI Grok, GitHub Copilot, Google Gemini, Kimi / Moonshot, MiniMax, OpenCode) collapse to one top-level picker row. Previously that row showed only the bare group label (e.g. `OpenAI ▸`); now it carries a short blurb describing the endpoints folded inside (e.g. `OpenAI ▸ (Codex CLI or direct OpenAI API)`). - models.py: extend PROVIDER_GROUPS tuples to (label, description, members); group_providers() emits the description on group rows. - main.py: CLI picker renders `<label> ▸ (<description>)` for group rows. - telegram.py: update the group tuple unpack (button text keeps the member count, which fits inline keyboards better than a long blurb). - tests: assert every group has a non-empty description and the fold emits it. Member-specific detail still lives in each member's tui_desc and shows in the drill-down sub-picker. Slug identity, --provider, /model paths unchanged.	2026-05-31 15:02:26 -07:00
kshitijk4poor	47d2d05892	chore(model-picker): refresh provider picker descriptions Update the tui_desc text shown for each provider in the interactive `hermes model` / setup wizard / `/model` pickers. Pure copy refresh — slugs, labels, PROVIDER_GROUPS folding, and all typed paths are unchanged, so the 7 grouped families (OpenAI, xAI Grok, GitHub Copilot, Google Gemini, Kimi / Moonshot, MiniMax, OpenCode) still fold identically. Also aligns the auto-injected alibaba-coding-plan provider description to the same parenthetical style.	2026-05-31 15:02:26 -07:00
Teknium	50db2d9c12	feat(models): add deepseek-v4-flash, trim variants, group curated lists by maker (#35659 ) * feat(models): add deepseek-v4-flash to OpenRouter + Nous curated lists deepseek/deepseek-v4-flash was already in the native deepseek provider catalog but missing from the curated OpenRouter and Nous Portal picker lists. Added it to both and regenerated the model-catalog.json manifest (drift guard requires same-PR regeneration). * refactor(models): trim redundant variants, group curated lists by maker Remove claude-opus-4.7/4.6, gpt-5.4-nano, gpt-5.3-codex, gemini-3-pro-image-preview, gemini-3.1-flash-lite-preview, grok-4.20, and the older gemini-3-pro-preview (Nous). Reorder both OPENROUTER_MODELS and _PROVIDER_MODELS[nous] into contiguous per-maker blocks with comment headers. Regenerated model-catalog.json (openrouter 27, nous 20). * feat(models): add gemini-3-pro-preview to OpenRouter + Nous curated lists Adds google/gemini-3-pro-preview to both curated pickers (new on OpenRouter, restored on Nous). Regenerated model-catalog.json (openrouter 28, nous 21). * test(models): use claude-opus-4.8 in OpenRouter fetch fixtures The two TestFetchOpenRouterModels tests mocked a live OpenRouter response with claude-opus-4.6 and relied on it surviving the curated-list filter. Since 4.6 was removed from OPENROUTER_MODELS, those models got filtered out and the recommended tag shifted. Swap the fixture to claude-opus-4.8 (still curated, still first in the Anthropic block).	2026-05-30 20:57:01 -07:00
Teknium	93e6a05efc	feat(model-picker): group multi-endpoint providers under one row (#35227 ) * Inspired by Claude Code: /compress here [N] — boundary-aware 'summarize up to here' Adds a user-chosen compression boundary to the existing /compress command. /compress here [N] summarizes everything except the most recent N exchanges (default 2), which are preserved verbatim — letting the user pick the compression boundary instead of relying on the automatic token-budget heuristic. Inspired by Claude Code's Rewind 'Summarize up to here' action (v2.1.139, Week 20, May 2026): https://code.claude.com/docs/en/whats-new/2026-w20 - hermes_cli/partial_compress.py: pure split/parse helpers + seam-alternation guard (shared by CLI and gateway). - cli.py / gateway/run.py: route 'here [N]' / '--keep N' to partial compression; compress only the head, re-append the verbatim tail through the seam guard. - Preserves message-flow role alternation (seam guard merges any illegal user->user / assistant->assistant adjacency). - Reuses the existing _compress_context session-rotation/lock machinery — no changes to the compression core. - Bare /compress (full) and /compress <focus> behavior unchanged. Tests: 12 helper unit tests + 5 CLI integration tests + E2E (interleaved tool-call transcript, degenerate/multimodal seams, real handler path). * feat(model-picker): group multi-endpoint providers under one row The interactive provider pickers (hermes model, setup wizard, Telegram /model) listed every provider slug flat, so vendors with several endpoints (Kimi/Moonshot, MiniMax, xAI Grok, Google Gemini, OpenAI, OpenCode, GitHub Copilot) each occupied multiple top-level rows. Now related slugs fold into one top-level row that drills down to the specific endpoint. - models.py: add PROVIDER_GROUPS table + group_providers() fold (display only — CANONICAL_PROVIDERS, slugs, --provider, /model <provider:model> all unchanged and individually addressable). - hermes model (main.py): group rows drill into a member sub-picker, then dispatch to the existing _model_flow_* unchanged. setup wizard inherits it. - Telegram /model: new mpg:<group> callback expands to member mp:<slug> buttons; single authenticated member degrades to a direct button. - Grouping is the single shared fold across all three surfaces. Validation: 163 targeted tests pass; E2E confirms group->member->model resolves to the correct concrete slug for all families.	2026-05-30 01:41:33 -07:00
Teknium	5e7c2ffa9f	chore(models): gemini-3.5-flash replaces gemini-3-flash-preview in OpenRouter + Nous lists (#34581 ) * chore(models): swap gemini-3-flash-preview for gemini-3.5-flash in OpenRouter + Nous lists * chore(models): regenerate model-catalog.json for gemini-3.5-flash swap	2026-05-29 04:27:58 -07:00
teknium1	ddaf2f6712	style: restore PEP8 blank-line separation after dead-code removal The deletions in the salvaged commit left some top-level defs/classes separated by a single blank line. Restore the 2-blank-line separation.	2026-05-29 04:22:27 -07:00
kshitijk4poor	dc235e93cb	chore: remove dead code — 28 unused functions/classes across 16 files Vulture + per-symbol verification (whole-repo grep incl. tests, string literals, getattr, decorator/registry/argparse dispatch) confirmed each of these has zero callers anywhere — not reachable via any dynamic-dispatch path, not referenced by tests, not re-exported. Removed: - acp_adapter/tools.py: _build_patch_mode_content - agent/anthropic_adapter.py: read_claude_managed_key (diagnostics-only, never called) - agent/bedrock_adapter.py: get_bedrock_model_ids - agent/browser_registry.py: get_active_browser_provider - agent/chat_completion_helpers.py: _take_request_client (x2 nested closures, never invoked) - gateway/platforms/weixin.py: _rewrite_headers_for_weixin, _rewrite_table_block_for_weixin - hermes_cli/banner.py: _skin_branding - hermes_cli/debug.py: _delete_hint - hermes_cli/gateway.py: _setup_email, _setup_sms, _setup_yuanbao (platform keys absent from the _builtin_setup_fn dispatch dict; handled by the _setup_standard_platform fallback) - hermes_cli/kanban_db.py: set_max_runtime, active_run - hermes_cli/kanban_diagnostics.py: severity_of_highest, _latest_clean_event_ts - hermes_cli/main.py: _build_provider_choices, cmd_portal (portal subcommand is wired via portal_cli.add_parser, not this wrapper) - hermes_cli/model_switch.py: CustomAutoResult (orphaned by the switch_model() extraction) - hermes_cli/models.py: format_model_pricing_table, fetch_nous_account_tier - hermes_cli/portal_cli.py: _nous_portal_base_url - hermes_cli/proxy/server.py: handle_models_fallback (defined but never registered on the router) - tools/computer_use/cua_backend.py: _parse_element, _is_arm_mac - tools/file_operations.py: _get_safe_write_root (prod uses the imported agent.file_safety.get_safe_write_root directly) - tools/skills_tool.py: _load_category_description Also dropped two imports left unused by the removals: - tools/file_operations.py: get_safe_write_root alias - tools/computer_use/cua_backend.py: import platform Pure deletion: -551 LOC. No behavior change. Test files covering the edited modules pass (640/640); the broader suite's pre-existing/env-dependent failures reproduce unchanged on origin/main.	2026-05-29 04:22:27 -07:00
teknium1	f2d88c820c	fix(model-catalog): fall through to raw.github when Vercel 403s; swap step-3.5-flash for step-3.7-flash on OpenRouter+Nous The docs site (Vercel) serves /docs/api/model-catalog.json behind a bot mitigation rule that returns HTTP 403 + x-vercel-mitigated: challenge for non-browser User-Agents — including urllib (what the CLI uses) and curl. When that happens, get_catalog() falls back to the stale disk cache and new model releases (Opus 4.8, etc.) never reach the /model picker even though they're already in OPENROUTER_MODELS and the live OpenRouter API. Adds a fallback URL chain: when the primary catalog URL fails, walk DEFAULT_CATALOG_FALLBACK_URLS — currently the raw.githubusercontent.com copy of the same file. GitHub raw doesn't bot-gate, so the manifest stays reachable through Vercel firewall hiccups. Per-provider override URLs keep their direct-fetch semantics (operators configure those specifically, no implicit fallback). Also swaps stepfun/step-3.5-flash for stepfun/step-3.7-flash in the OpenRouter + Nous Portal curated picker lists. Native stepfun provider configuration (api.stepfun.ai) is left alone — that depends on what stepfun.ai itself serves, not what OpenRouter routes. Test plan: 5 new TestFallbackChain tests cover primary-success, primary-failure-fallback-success, all-fail, primary==fallback-dedup, and end-to-end get_catalog routing through the new helper. Existing 23 tests in test_model_catalog.py still pass (28 total). Wider tests/hermes_cli/ sweep: 5701/5701 pass.	2026-05-29 00:25:36 -07:00

1 2 3 4 5 ...

274 commits