Add two more tests to the catalog in
`conc-anal/subint_sigint_starvation_issue.md` — same
signal-wakeup-fd-saturation fingerprint (abandoned legacy-subint driver
threads → shared-GIL starvation → `write() = EAGAIN` on the wakeup pipe
→ silent SIGINT drop), different load patterns.
Deats,
- `test_cancel_while_childs_child_in_sync_sleep[subint-False]`: nested
actor-tree + sync-sleeping grandchild. Under `trio`/`mp_*` the "zombie
reaper" is a subproc `SIGKILL`; no equivalent exists under subint, so
the grandchild persists in its abandoned driver thread. Often only
manifests under full-suite runs (earlier tests seed the
abandoned-thread pool).
- `test_multierror_fast_nursery[subint-25-0.5]`: 25 concurrent subactors
all go through teardown on the multierror. Bounded hard-kills run in
parallel — so the total budget is ~3s, not 3s × 25. Leaves 25
abandoned driver threads simultaneously alive, an extreme pressure
multiplier. `strace` shows several successful `write(16, "\2", 1) = 1`
(GIL round-robin IS giving main brief slices) before finally
saturating with `EAGAIN`.
Also include a `pstree -snapt <pid>` capture showing
16+ live `{subint-driver[<interp_id>}` threads at the
moment of hang — the direct GIL-contender population.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Classify and write up the two distinct hang modes hit during Phase
B subint bringup (issue #379) so future triage doesn't re-derive them
from scratch.
Deats, two new `ai/conc-anal/` docs,
- `subint_sigint_starvation_issue.md`: abandoned legacy-subint thread
+ shared GIL → main trio loop starves → signal-wakeup-fd pipe fills
→ `SIGINT` silently dropped (`strace` shows `write() = EAGAIN` on the
wakeup-fd). Un- Ctrl-C-able. Structurally a CPython limit; blocked on
`msgspec` PEP 684 (jcrist/msgspec#563)
- `subint_cancel_delivery_hang_issue.md`: parent-side trio task parks on
an orphaned IPC channel after subint teardown — no clean EOF delivered
to the waiting receive. Ctrl-C-able (main loop iterates fine); OUR bug
to fix. Candidate fix: explicit parent-side channel abort in
`subint_proc`'s hard-kill teardown
Cross-link the docs from their test reproducers,
- `test_stale_entry_is_deleted` (→ starvation class): wrap
`trio.run(main)` in `dump_on_hang(seconds=20)` so a future regression
captures a stack dump. Kept un- skipped so the dump file is
inspectable
- `test_subint_non_checkpointing_child` (→ delivery class): extend
docstring with a "KNOWN ISSUE" block pointing at the analysis
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code