When a /model switch resolves a valid model but the in-place agent swap
fails mid-conversation (expired key, unreachable base_url), the agent
rolls itself back to the old working model+client and re-raises. The
callers caught that re-raise, logged a warning, then committed the broken
switch anyway: wrote the failed model to the session DB, set
_session_model_overrides to the broken model/provider/key, and (gateway
direct path) evicted the working cached agent. The next message then
rebuilt a dead agent from the broken override -> permanently unusable
conversation (#50163).
Fix the whole caller class so a failed swap aborts the commit entirely:
- gateway/slash_commands.py (picker + direct /model paths): on swap
failure, early-return an error message; skip DB persist, session
override, cache eviction, and config write.
- cli.py (both /model handlers): snapshot CLI-level credential/runtime
fields before mutating, restore them on swap failure, and abort the
note + success print.
- tui_gateway/server.py: wrap the previously-unguarded swap; on failure
raise a clean error and skip worker restart, runtime persist, switch
marker, session model_override, and config persist.
The no-cached-agent path (apply-on-next-session) is unaffected.
Adds a gateway regression test that fails on the pre-fix behavior.