fix(browser): narrow credential-query denylist to unambiguous names

Follow-up on the salvaged #49830 hardening. The contributor's sensitive
query-param set included bare English words (code, key, auth, session,
sig) that double as ordinary page facets — ?code= on promo/challenge
pages, ?key= as a search facet, ?session= on blogs — so web_extract and
cloud browser_navigate would refuse a large slice of normal browsing.

Narrow the set to unambiguously credential-named params (access_token,
authorization, client_secret, password, token, x-amz-signature, ...).
Prefix-based vendor-key redaction (is_safe_url) still catches recognizable
key shapes; this set is the belt-and-suspenders for opaque secrets carried
under an explicit credential-named parameter.
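
The two layers described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `url_carries_credential` and `VENDOR_KEY_PREFIXES` are hypothetical names, the prefixes are assumed examples of "recognizable key shapes" (the real `is_safe_url` rules may differ), case-insensitive name matching is an assumption, and the name set is only a subset of the narrowed denylist.

```python
from urllib.parse import parse_qsl, urlsplit

# Illustrative subset of the narrowed denylist (names taken from this commit).
SENSITIVE_NAMES = frozenset({
    "access_token", "authorization", "client_secret", "password", "token",
})

# Hypothetical vendor-key prefixes; the real is_safe_url() rules may differ.
VENDOR_KEY_PREFIXES = ("sk-", "ghp_", "AKIA")


def url_carries_credential(url: str) -> bool:
    """Return True if any query parameter looks credential-bearing."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    for name, value in pairs:
        # Layer 1: the value has a recognizable vendor key shape,
        # whatever the parameter happens to be called.
        if value.startswith(VENDOR_KEY_PREFIXES):
            return True
        # Layer 2: an opaque value under an explicit credential name.
        # Lowercasing the name is an assumption of this sketch.
        if name.lower() in SENSITIVE_NAMES:
            return True
    return False
```

With this arrangement, `?code=SPRING25` and `?key=color` pass, while `?access_token=<opaque>` is caught by name and `?q=sk-...` is caught by shape.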

Also fixes two test breakages caused by the PR going stale, surfaced by
salvaging it onto current main:
- web_extract_tool() no longer accepts use_llm_processing= (signature
  changed since the PR was authored) — dropped the invalid kwarg.
- agent.redact now fully masks keyed 'token=<secret>' to 'token=***'
  instead of partial 'sk-...'; the console-redaction test now asserts the
  real invariant (secret body gone) rather than the exact mask format.
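
The difference between asserting the invariant and asserting the mask format can be sketched like this. The `redact` function below is a hypothetical stand-in written for this example, not the real `agent.redact`, and the secret is made up.

```python
import re


def redact(text: str) -> str:
    """Hypothetical stand-in for agent.redact: fully mask token=<secret>."""
    return re.sub(r"\b(token=)[^\s&]+", r"\1***", text)


secret = "sk-abcdefghijklmnop1234"  # made-up value for illustration
masked = redact(f"GET https://example.com/cb?token={secret}")

# Robust: assert the invariant, i.e. the secret body is gone.
assert secret not in masked
assert secret[3:] not in masked  # the body without its prefix is gone too

# Brittle: pinning the exact mask (for example expecting a partial
# 'sk-...') breaks as soon as the redactor changes its output format.
```

A test written this way keeps passing whether the redactor emits `token=***`, a partial mask, or some other placeholder, as long as the secret itself does not leak.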

Added a regression test that generic English-word query params are NOT
blocked by the credential guard.
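
A regression test of this kind might look like the sketch below. `is_blocked` is a hypothetical name-only guard defined here so the example is self-contained; the actual test exercises the project's own guard, and the URLs are invented.

```python
from urllib.parse import parse_qsl, urlsplit

# Illustrative subset of the narrowed set; the real set has more entries.
NARROWED = frozenset({"access_token", "client_secret", "password", "token"})


def is_blocked(url: str) -> bool:
    """Hypothetical guard: block on credential-named query params only."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return any(name.lower() in NARROWED for name, _ in pairs)


# The bare English words this commit removes from the denylist.
GENERIC_URLS = [
    "https://shop.example.com/promo?code=SPRING25",
    "https://example.com/search?key=color",
    "https://example.com/docs?auth=overview",
    "https://blog.example.com/talks?session=3",
    "https://example.com/groups?sig=networking",
]


def test_generic_params_not_blocked():
    for url in GENERIC_URLS:
        assert not is_blocked(url), url


def test_credential_params_still_blocked():
    assert is_blocked("https://example.com/cb?access_token=opaque")
```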
teknium1 2026-07-01 04:47:58 -07:00 committed by Teknium
parent 937e56be92
commit cfbc7ed1f9
4 changed files with 39 additions and 10 deletions

@@ -77,26 +77,28 @@ def normalize_url_for_request(url: str) -> str:
     return urlunsplit((parsed.scheme, netloc, path, query, fragment))
 
 
+# Query parameter names that are unambiguously credential-bearing. Kept
+# deliberately narrow: bare English words that double as normal page facets
+# (``code`` on promo/challenge pages, ``key``/``auth``/``session``/``sig`` as
+# search or routing params) are intentionally EXCLUDED to avoid blocking
+# ordinary browsing. Prefix-based token redaction (``is_safe_url``) still
+# catches recognizable vendor key shapes; this set is the belt-and-suspenders
+# for opaque secrets that carry an explicit credential-named parameter.
 _SENSITIVE_QUERY_PARAM_NAMES = frozenset({
     "access_token",
     "api_key",
     "apikey",
-    "auth",
     "auth_token",
     "authorization",
     "awsaccesskeyid",
     "client_secret",
-    "code",
     "credential",
     "credentials",
-    "key",
     "jwt",
     "password",
     "passwd",
     "secret",
-    "session",
     "session_id",
-    "sig",
     "signature",
     "token",
     "x_amz_security_token",