Doc ruled-out fix + capture-pipe aside
Two new sections in
`subint_forkserver_test_cancellation_leak_issue.md`
documenting continued investigation of the
`test_nested_multierrors[subint_forkserver]`
peer-channel-loop hang:

1. **"Attempted fix (DID NOT work) — hypothesis (3)"**:
   tried sync-closing peer channels' raw socket fds from
   `_serve_ipc_eps`'s finally block (iterate
   `server._peers`, `_chan._transport.stream.socket.close()`).
   Theory was that sync close would propagate as `EBADF` /
   `ClosedResourceError` into the stuck `recv_some()` and
   unblock it. Result: identical hang. Either trio holds an
   internal fd reference that survives external close, or
   the stuck recv isn't even the root blocker. Either way:
   ruled out, experiment reverted, skip-mark restored.

2. **"Aside: `-s` flag changes behavior for peer-intensive
   tests"**: noticed `test_context_stream_semantics.py`
   under `subint_forkserver` hangs with default
   `--capture=fd` but passes with `-s` (`--capture=no`).
   Working hypothesis: subactors inherit pytest's capture
   pipe (fds 1 and 2, which `_close_inherited_fds`
   deliberately preserves); verbose subactor logging fills
   the buffer, writes block, deadlock. Fix direction (if
   confirmed): redirect subactor stdout/stderr to
   `/dev/null` or a file in `_actor_child_main`. Not a
   blocker on the main investigation; deserves its own
   mini-tracker.

Both sections are diagnosis-only; no code changes in this
commit.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

subint_forkserver_backend
parent
76d12060aa
commit
7cd47ef7fb
@@ -395,6 +395,62 @@ Candidate follow-up experiments:
re-raise means it should still exit. Unless
something higher up swallows it.

### Attempted fix (DID NOT work) — hypothesis (3)

Tried: in `_serve_ipc_eps` finally, after closing
listeners, also iterate `server._peers` and
sync-close each peer channel's underlying stream
socket fd:

```python
for _uid, _chans in list(server._peers.items()):
    for _chan in _chans:
        try:
            _stream = _chan._transport.stream if _chan._transport else None
            if _stream is not None:
                _stream.socket.close()  # sync fd close
        except (AttributeError, OSError):
            pass
```

Theory: closing the socket fd from outside the stuck
recv task would make the recv see `EBADF` /
`ClosedResourceError` and unblock.

Result: `test_nested_multierrors[subint_forkserver]`
still hangs identically. Either:
- The sync `socket.close()` doesn't propagate into
  trio's in-flight `recv_some()` the way I expected
  (trio may hold an internal reference that keeps the
  fd open even after an external close), or
- The stuck recv isn't even the root blocker and the
  peer handlers never reach the finally for some
  reason I haven't understood yet.

Either way, the sync-close hypothesis is **ruled
out**. Reverted the experiment and restored the
skip-mark on the test.

### Aside: `-s` flag changes behavior for peer-intensive tests

While exploring, noticed
`tests/test_context_stream_semantics.py` under
`--spawn-backend=subint_forkserver` hangs with
pytest's default `--capture=fd` but passes with
`-s` (`--capture=no`). Hypothesis (unverified): fork
children inherit pytest's capture pipe for
stdout/stderr (fds 1 and 2, which we preserve in
`_close_inherited_fds`). When subactor logging is
verbose, the capture pipe buffer fills, writes block,
the child can't progress, and the run deadlocks.
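
A minimal sketch for probing this hypothesis, assuming a POSIX
host. Both `fd_kind()` and `fill_pipe_until_full()` are
hypothetical helpers written for this note (not part of tractor
or pytest). The first reports what an fd actually points at, so
a subactor could log `fd_kind(1)` to check whether its stdout
really is a pipe under `--capture=fd`; that is still unchecked.
The second only illustrates the general mechanism: a pipe that
nobody reads accepts a bounded number of bytes, after which a
blocking writer would stall.

```python
import os
import stat


def fd_kind(fd: int) -> str:
    """Classify what an open fd points at (hypothetical helper)."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "file"
    if stat.S_ISCHR(mode):
        return "chardev"  # e.g. a tty or /dev/null
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


def fill_pipe_until_full(chunk: bytes = b"x" * 4096) -> int:
    """Write to an unread pipe until the kernel refuses more data.

    The write end is set non-blocking so a full buffer surfaces as
    BlockingIOError; a blocking writer would simply hang instead.
    Returns the number of bytes accepted before that point.
    """
    r, w = os.pipe()
    os.set_blocking(w, False)
    written = 0
    try:
        while True:
            try:
                written += os.write(w, chunk)
            except BlockingIOError:
                return written
    finally:
        os.close(r)
        os.close(w)
```

Calling `fd_kind(1)` early in a subactor and `fill_pipe_until_full()`
on the test host would give the two numbers needed to judge the
theory: whether stdout is a pipe at all, and how much log output it
can absorb before writes block.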

If confirmed, fix direction: redirect subactor
stdout/stderr to `/dev/null` (or a file) in
`_actor_child_main` so subactors don't hold pytest's
capture pipe open. Not a blocker on the main
peer-chan-loop investigation; deserves its own
mini-tracker.
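
A rough sketch of what that redirect could look like, assuming a
POSIX host. `redirect_fds_to_devnull()` is a hypothetical helper
name, and where exactly it would be called inside
`_actor_child_main` is an assumption, not something tried yet.
It relies only on `os.open()` and `os.dup2()`; the latter closes
whatever the target fd previously referred to, so the child drops
its reference to the inherited stdout/stderr target.

```python
import os


def redirect_fds_to_devnull(fds: tuple[int, ...] = (1, 2)) -> None:
    """Point each fd in `fds` at /dev/null (hypothetical helper).

    Callers should flush `sys.stdout` / `sys.stderr` first so that
    buffered text isn't silently routed to /dev/null later.
    """
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        for fd in fds:
            # dup2 closes the old target of `fd`, then makes `fd`
            # a duplicate of the /dev/null descriptor.
            os.dup2(devnull, fd)
    finally:
        # Keep the descriptor only if it happens to be one of the
        # targets (possible when that fd was closed beforehand).
        if devnull not in fds:
            os.close(devnull)
```

Passing a different fd tuple makes the helper easy to try on a
scratch pipe before touching the real stdout/stderr; swapping
`os.devnull` for a per-actor log file path would cover the "or a
file" variant.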

## Stopgap (landed)

`test_nested_multierrors` skip-marked under