Doc ruled-out fix + capture-pipe aside

Two new sections in
`subint_forkserver_test_cancellation_leak_issue.md`
documenting continued investigation of the
`test_nested_multierrors[subint_forkserver]`
peer-channel-loop hang:

1. **"Attempted fix (DID NOT work) — hypothesis (3)"**:
   tried sync-closing peer channels' raw socket fds from
   `_serve_ipc_eps`'s finally block (iterate
   `server._peers`,
   `_chan._transport.stream.socket.close()`). Theory was
   that sync close would propagate as `EBADF` /
   `ClosedResourceError` into the stuck `recv_some()` and
   unblock it. Result: identical hang. Either trio holds
   an internal fd reference that survives external close,
   or the stuck recv isn't even the root blocker. Either
   way: ruled out, experiment reverted, skip-mark
   restored.

2. **"Aside: `-s` flag changes behavior for
   peer-intensive tests"**: noticed
   `test_context_stream_semantics.py` under
   `subint_forkserver` hangs with default `--capture=fd`
   but passes with `-s` (`--capture=no`). Working
   hypothesis: subactors inherit pytest's capture pipe
   (fds 1,2, which `_close_inherited_fds` deliberately
   preserves); verbose subactor logging fills the buffer,
   writes block, deadlock. Fix direction (if confirmed):
   redirect subactor stdout/stderr to `/dev/null` or a
   file in `_actor_child_main`. Not a blocker on the main
   investigation; deserves its own mini-tracker.

Both sections are diagnosis-only; no code changes in this
commit.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

subint_forkserver_backend
parent 76d12060aa
commit 7cd47ef7fb

@@ -395,6 +395,62 @@ Candidate follow-up experiments:
re-raise means it should still exit. Unless
something higher up swallows it.

### Attempted fix (DID NOT work) — hypothesis (3)

Tried: in `_serve_ipc_eps` finally, after closing
listeners, also iterate `server._peers` and
sync-close each peer channel's underlying stream
socket fd:

```python
for _uid, _chans in list(server._peers.items()):
    for _chan in _chans:
        try:
            _stream = _chan._transport.stream if _chan._transport else None
            if _stream is not None:
                _stream.socket.close()  # sync fd close
        except (AttributeError, OSError):
            pass
```

Theory: closing the socket fd from outside the stuck
recv task would make the recv see EBADF /
ClosedResourceError and unblock.

Result: `test_nested_multierrors[subint_forkserver]`
still hangs identically. Either:
- The sync `socket.close()` doesn't propagate into
  trio's in-flight `recv_some()` the way I expected
  (trio may hold an internal reference that keeps the
  fd open even after an external close), or
- The stuck recv isn't even the root blocker and the
  peer handlers never reach the finally for some
  reason I haven't understood yet.

Either way, the sync-close hypothesis is **ruled
out**. Reverted the experiment, restored the
skip-mark on the test.

### Aside: `-s` flag changes behavior for peer-intensive tests

While exploring, noticed
`tests/test_context_stream_semantics.py` under
`--spawn-backend=subint_forkserver` hangs with
pytest's default `--capture=fd` but passes with
`-s` (`--capture=no`). Hypothesis (unverified): fork
children inherit pytest's capture pipe for
stdout/stderr (fds 1,2; we preserve these in
`_close_inherited_fds`). When subactor logging is
verbose, the capture pipe buffer fills, writes block,
child can't progress, deadlock.
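
The blocking mechanism the hypothesis relies on can be reproduced in isolation. The following is a minimal sketch, not taken from the codebase: it fills an unread pipe until the kernel refuses more data. The helper name `fill_pipe_until_blocked` is hypothetical, and the sketch only demonstrates generic OS pipe behavior; it says nothing about how pytest implements `--capture=fd`, which is part of what remains unverified.

```python
import os


def fill_pipe_until_blocked(chunk_size: int = 4096) -> int:
    """Write into a pipe nobody reads until the kernel refuses more data.

    Returns the number of bytes accepted before a write would block.
    (Hypothetical helper for illustration only.)
    """
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode turns "would block" into an observable error;
        # a process doing a plain blocking write would stall at this point.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe buffer is full and nothing is draining it.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    accepted = fill_pipe_until_blocked()
    print(f"pipe accepted {accepted} bytes before a write would block")
```

The accepted byte count is platform-dependent. The point is only that it is finite, so a sufficiently chatty writer eventually stalls if the read end is never drained.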

If confirmed, fix direction: redirect subactor
stdout/stderr to `/dev/null` (or a file) in
`_actor_child_main` so subactors don't hold pytest's
capture pipe open. Not a blocker on the main
peer-chan-loop investigation; deserves its own
mini-tracker.
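
A rough sketch of that fix direction, assuming it would run early in the forked child (for example near the top of `_actor_child_main`). The helper name `redirect_fds_to_devnull` and its `fds` parameter are hypothetical, and nothing here has been tried against the hanging test.

```python
import os
import sys


def redirect_fds_to_devnull(fds=(1, 2)) -> None:
    """Point each fd in `fds` at os.devnull.

    After this call the process no longer holds whatever those fds
    referred to before (e.g. an inherited pipe write end).
    (Hypothetical helper for illustration only.)
    """
    # Flush Python-level buffers first so pending output is not written
    # to the new target after the switch.
    sys.stdout.flush()
    sys.stderr.flush()

    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        for fd in fds:
            # dup2 atomically closes `fd` (if open) and makes it a
            # duplicate of devnull_fd.
            os.dup2(devnull_fd, fd)
    finally:
        # The duplicates keep the null device open; drop the temporary
        # fd unless it happens to be one of the targets.
        if devnull_fd not in fds:
            os.close(devnull_fd)
```

Redirecting to a file instead only changes the `os.open()` target. Either way the trade-off is that subactor output no longer shows up in pytest's captured output.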

## Stopgap (landed)

`test_nested_multierrors` skip-marked under