Rework follow-up on the per-job TERMINAL_CWD readers-writer lock.

The lock was acquired BEFORE the try: whose finally: is the only release site, with the env-override statements (os.environ[TERMINAL_CWD] = workdir; logger.info) sitting in the unprotected window between acquire and try. Any exception there (a raising log handler, an os.environ error, a thread interrupt) propagated out of run_job WITHOUT running the finally, leaking the lock. A leaked writer permanently deadlocks the whole scheduler (every future cron job blocks on acquire_*); a leaked reader blocks all writers.

- Snapshot _prior_terminal_cwd before the acquire (so the finally can always restore env even if the body raises before the override).
- Open the try: immediately after acquire and move the env-override lines inside it, so the existing finally always releases the lock.
- Add a mutation-verified regression test: a workdir job whose in-window logger.info raises must still release the writer lock (a subsequent acquire_write must not block).

- scripts
- __init__.py
- blueprint_catalog.py
- jobs.py
- lifecycle_guard.py
- scheduler.py
- scheduler_provider.py
- suggestion_catalog.py
- suggestions.py
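
The acquire/try/finally shape described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: `_cwd_lock` is a plain `threading.Lock` standing in for the readers-writer lock, the `run_job` signature, `raising_log`, and `lock_is_free` are hypothetical, and the env key is assumed to be the string `"TERMINAL_CWD"`.

```python
import os
import threading

# Hypothetical stand-in for the per-job readers-writer lock; a plain Lock
# is enough to show the acquire/try/finally shape.
_cwd_lock = threading.Lock()


def run_job(workdir, log, body):
    # Snapshot BEFORE acquiring, so the finally can always restore the env
    # even if the body raises before the override happens.
    prior = os.environ.get("TERMINAL_CWD")
    _cwd_lock.acquire()
    try:
        # Everything after the acquire lives inside the try, including the
        # env override and the log call that used to sit in the open window.
        os.environ["TERMINAL_CWD"] = workdir
        log(f"TERMINAL_CWD overridden to {workdir}")
        return body()
    finally:
        # Restore the snapshot, then release; this runs on every exit path.
        if prior is None:
            os.environ.pop("TERMINAL_CWD", None)
        else:
            os.environ["TERMINAL_CWD"] = prior
        _cwd_lock.release()


def raising_log(message):
    # Simulates a log handler that raises inside the formerly unprotected window.
    raise RuntimeError("log handler failed")


def lock_is_free():
    # Non-blocking probe: True only if the lock is not currently held.
    if _cwd_lock.acquire(blocking=False):
        _cwd_lock.release()
        return True
    return False
```

Calling `run_job("/tmp/job", raising_log, lambda: None)` raises `RuntimeError`, after which `lock_is_free()` still reports the lock as released. That is the property the regression test in the commit message checks for the writer lock.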