---
model: claude-opus-4-7[1m]
service: claude
session: subints-phase-b-hardening-and-fork-block
timestamp: 2026-04-22T20:07:23Z
git_ref: 797f57c
scope: code
substantive: true
raw_file: 20260422T200723Z_797f57c_prompt_io.raw.md
---

## Prompt

Session-spanning work on the Phase B `subint` spawn-backend.
Three distinct sub-phases in one log:

1. **Py3.13 gate tightening**: diagnose a reproducible hang of
   subint spawn flow under py3.13 (works on py3.14), trace to a
   private `_interpreters` module vintage issue, tighten our
   feature gate from "`_interpreters` present" to "public
   `concurrent.interpreters` present" (i.e. py3.14+).

2. **Test-harness hardening**: add `pytest-timeout` dep, put
   `@pytest.mark.timeout(30, method='thread')` on the three
   known-hanging subint tests cataloged in
   `ai/conc-anal/subint_sigint_starvation_issue.md`. Separately,
   code-review the user's in-flight `skipon_spawn_backend` marker
   implementation; find four bugs; refactor to use
   `item.iter_markers()`.

3. **`subint_fork` prototype → CPython-block finding**: draft a
   WIP `subint_fork_proc` backend using a sub-interpreter as a
   launchpad for `os.fork()` (to sidestep trio#1614). User tests
   on py3.14, hits
   `Fatal Python error: _PyInterpreterState_DeleteExceptMain: not main interpreter`.
   Walk CPython sources (local clone at `~/repos/cpython/`) to
   pinpoint the refusal (`Modules/posixmodule.c:728` →
   `Python/pystate.c:1040`). Revert implementation to a
   `NotImplementedError` stub in a new `_subint_fork.py`
   submodule, document the finding in a third `conc-anal/` doc
   with an upstream-report draft for the CPython issue tracker.
   Finally, discuss user's proposed workaround architecture
   (main-interp worker-thread forkserver) and draft a standalone
   smoke-test script for feasibility validation.

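The gate tightening in item 1 amounts to probing for the public module rather than the private one. A minimal sketch of that idea, assuming hypothetical helper names (`has_public_subints`, `has_private_subints`); the actual gate code in `_subint.py` is not reproduced in this log and may look different:

```python
import importlib.util


def has_private_subints() -> bool:
    '''
    Old, looser gate: the private `_interpreters` module exists.

    NOTE: hypothetical helper name, for illustration only.

    '''
    return importlib.util.find_spec('_interpreters') is not None


def has_public_subints() -> bool:
    '''
    Tightened gate: the public `concurrent.interpreters` module
    (py3.14+) exists.

    NOTE: hypothetical helper name, for illustration only.

    '''
    # `find_spec()` imports the parent `concurrent` package (part
    # of the stdlib) and returns `None` when the submodule is
    # missing, so no `ImportError` handling is needed here.
    spec = importlib.util.find_spec('concurrent.interpreters')
    return spec is not None
```

Probing with `find_spec()` avoids importing the interpreters machinery just to answer a yes/no question; a plain `try: import ...` would work as well.
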
## Response summary

All three sub-phases landed concrete artifacts:

**Sub-phase 1**: `_subint.py` + `_spawn.py` gates + error
messages updated to require py3.14+ via the public
`concurrent.interpreters` module presence check. Module docstring
revised to explain the empirical reason (py3.13's private
`_interpreters` vintage wedges under multi-trio-task usage even
though minimal standalone reproducers work fine there).
Test-module `pytest.importorskip` likewise switched.

**Sub-phase 2**: `pytest-timeout>=2.3` added to `testing` dep
group. `@pytest.mark.timeout(30, method='thread')` applied on:

- `tests/discovery/test_registrar.py::test_stale_entry_is_deleted`
- `tests/test_cancellation.py::test_cancel_while_childs_child_in_sync_sleep`
- `tests/test_cancellation.py::test_multierror_fast_nursery`
- `tests/test_subint_cancellation.py::test_subint_non_checkpointing_child`

`method='thread'` documented inline as load-bearing: the
GIL-starvation path that drops `SIGINT` would equally drop
`SIGALRM`, so only a watchdog-thread timeout can reliably escape.

`skipon_spawn_backend` plugin refactored into a single
`iter_markers`-driven loop in `pytest_collection_modifyitems`
(~30 LOC replacing ~30 LOC of nested conditionals). Four bugs
dissolved: wrong `.get()` key, module-level `pytestmark`
suppressing per-test marks, unhandled `pytestmark = [list]` form,
`pytest.Makr` typo. Marker help text updated to document the
variadic backend-list + `reason=` kwarg surface.

**Sub-phase 3**: Prototype drafted (then reverted):

- `tractor/spawn/_subint_fork.py`: new dedicated submodule
  housing the `subint_fork_proc` stub. Module docstring + fn
  docstring explain the attempt, the CPython-level block, and the
  reason for keeping the stub in-tree (documentation of the
  attempt + starting point if CPython ever lifts the
  restriction).
- `tractor/spawn/_spawn.py`: `'subint_fork'` registered as a
  `SpawnMethodKey` literal + in `_methods`, so
  `--spawn-backend=subint_fork` routes to a clean
  `NotImplementedError` pointing at the analysis doc rather than
  an "invalid backend" error.
- `ai/conc-anal/subint_fork_blocked_by_cpython_post_fork_issue.md`:
  third sibling conc-anal doc. Full annotated CPython source
  walkthrough from user-visible `Fatal Python error` →
  `Modules/posixmodule.c:728 PyOS_AfterFork_Child()` →
  `Python/pystate.c:1040 _PyInterpreterState_DeleteExceptMain()`
  gate. Includes a copy-paste-ready upstream-report draft for the
  CPython issue tracker with a two-tier ask (ideally "make it
  work", minimally "cleaner error than `Fatal Python error`
  aborting the child").
- `ai/conc-anal/subint_fork_from_main_thread_smoketest.py`:
  standalone zero-tractor-import CPython-level smoke test for the
  user's proposed workaround architecture (forkserver on a
  main-interp worker thread). Four argparse-driven scenarios:
  - `control_subint_thread_fork` (reproduces the known-broken
    case as a test-harness sanity),
  - `main_thread_fork` (baseline),
  - `worker_thread_fork` (architectural assertion),
  - `full_architecture` (end-to-end trio-in-subint in forked
    child).

  User will run on py3.14 next.

## Files changed

See `git log 26fb820..HEAD --stat` for the canonical list.

New files this session:

- `tractor/spawn/_subint_fork.py`
- `ai/conc-anal/subint_fork_blocked_by_cpython_post_fork_issue.md`
- `ai/conc-anal/subint_fork_from_main_thread_smoketest.py`

Modified (diff pointers in raw log):

- `tractor/spawn/_subint.py` (py3.14 gate)
- `tractor/spawn/_spawn.py` (`subint_fork` registration)
- `tractor/_testing/pytest.py` (`skipon_spawn_backend` refactor)
- `pyproject.toml` (`pytest-timeout` dep)
- `tests/discovery/test_registrar.py`,
  `tests/test_cancellation.py`,
  `tests/test_subint_cancellation.py` (timeout marks, cross-refs
  to conc-anal docs)

## Human edits

Several back-and-forth iterations with user-driven adjustments
during the session:

- User corrected my initial mis-classification of
  `test_cancel_while_childs_child_in_sync_sleep[subint-False]` as
  Ctrl-C-able: second strace showed `EAGAIN`, putting it squarely
  in class A (GIL-starvation). Re-analysis preserved in the raw
  log.
- User independently fixed the `.get(reason)` →
  `.get('reason', reason)` bug in the marker plugin before my
  review; preserved their fix.
- User suggested moving the `subint_fork_proc` stub from the
  bottom of `_subint.py` into its own `_subint_fork.py`
  submodule; applied.
- User asked to keep the forkserver-architecture discussion as
  background for the smoke-test rather than committing to a
  tractor-side refactor until the smoke test validates the
  CPython-level assumptions.

Commit messages in this range (b025c982 … 797f57c) were drafted
via `/commit-msg` + `rewrap.py --width 67`; user landed them with
the usual review.
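
For reference, the `.get(reason)` → `.get('reason', reason)` fix listed under the human edits can be reproduced outside pytest. A minimal sketch, assuming the marker's keyword arguments behave like a plain `dict` (as pytest's `Mark.kwargs` does); the default string and sample values are hypothetical, not the plugin's actual text:

```python
# Hypothetical default; the real plugin's wording is not shown
# in this log.
reason: str = 'unsupported on this spawn backend'

# Stand-in for a marker's `.kwargs` mapping.
kwargs: dict[str, str] = {'reason': 'hangs under subint'}

# Buggy form: looks up a key named after the *value* of the
# `reason` variable, so the user-supplied text is never found.
buggy = kwargs.get(reason)

# Fixed form: looks up the literal 'reason' key and falls back
# to the default only when the mark omits it.
fixed = kwargs.get('reason', reason)
fallback = {}.get('reason', reason)
```

The buggy form fails silently (it yields `None` rather than raising), which is why it is easy to miss without a review.
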