Compare commits

...

5 Commits

Author SHA1 Message Date
Gud Boi 4425023500 Label forkserver child as `subint_forkserver`
Follow-up to 72d1b901 (was prev commit adding `debug_mode` for
`subint_forkserver`): that commit wired the runtime-side
`subint_forkserver` SpawnSpec-recv gate in `Actor._from_parent`, but the
`subint_forkserver_proc` child-target was still passing
`spawn_method='trio'` to `_trio_main` — so `Actor.pformat()` / log lines
would report the subactor as plain `'trio'` instead of the actual
parent-side spawn mechanism. Flip the label to `'subint_forkserver'`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 11:43:06 -04:00
Gud Boi 72d1b9016a Enable `debug_mode` for `subint_forkserver`
The `subint_forkserver` backend's child runtime is trio-native (uses
`_trio_main` + receives `SpawnSpec` over IPC just like `trio`/`subint`),
so `tractor.devx.debug._tty_lock` works in those subactors. Wire the
runtime gates that historically hard-coded `_spawn_method == 'trio'` to
recognize this third backend.

Deats,
- new `_DEBUG_COMPATIBLE_BACKENDS` module-const in `tractor._root`
  listing the spawn backends whose subactor runtime is trio-native
  (`'trio'`, `'subint_forkserver'`). Both the enable-site
  (`_runtime_vars['_debug_mode'] = True`) and the cleanup-site
  reset key off the same tuple — keep them in lockstep when
  adding backends.
- `open_root_actor`'s `RuntimeError` for unsupported backends now
  reports the full compatible-set + the rejected method instead of the
  stale "only `trio`" msg.
- `runtime._runtime.Actor._from_parent`'s SpawnSpec-recv gate adds
  `'subint_forkserver'` to the existing `('trio', 'subint')` tuple
  — fork child-side runtime receives the same SpawnSpec IPC handshake as
  the others.
- `subint_forkserver_proc` child-target now passes
  `spawn_method='subint_forkserver'` (was hard-coded `'trio'`) so
  `Actor.pformat()` / log lines reflect the actual parent-side spawn
  mechanism rather than masquerading as plain `trio`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 11:39:42 -04:00
Gud Boi 4227eabcba Drop unneeded f-str prefixes 2026-04-23 11:04:10 -04:00
Gud Boi af94080b21 Shorten some timeouts in `subint_forkserver` suites 2026-04-23 11:01:56 -04:00
Gud Boi 36cca96385 Refine `subint_forkserver` orphan-SIGINT diagnosis
Empirical follow-up to the xfail'd orphan-SIGINT test:
the hang is **not** "trio can't install a handler on a
non-main thread" (the original hypothesis from the
`child_sigint` scaffold commit). On py3.14:

- `threading.current_thread() is threading.main_thread()`
  IS True post-fork — CPython re-designates the
  fork-inheriting thread as "main" correctly
- trio's `KIManager` SIGINT handler IS installed in the
  subactor (`signal.getsignal(SIGINT)` confirms)
- the kernel DOES deliver SIGINT to the thread

But `faulthandler` dumps show the subactor wedged in
`trio/_core/_io_epoll.py::get_events` — trio's
wakeup-fd mechanism (which turns SIGINT into an epoll-wake)
isn't firing. So the `except KeyboardInterrupt` at
`tractor/spawn/_entry.py::_trio_main:164` — the runtime's
intentional "KBI-as-OS-cancel" path — never fires.

Deats,
- new `ai/conc-anal/subint_forkserver_orphan_sigint_hang_issue.md`
  (+385 LOC): full writeup — TL;DR, symptom reproducer,
  the "intentional cancel path" the bug defeats,
  diagnostic evidence (`faulthandler` output +
  `getsignal` probe), ruled-out hypotheses
  (non-main-thread issue, wakeup-fd inheritance,
  KBI-as-trio-check-exception), and fix directions
- `test_orphaned_subactor_sigint_cleanup_DRAFT` xfail
  `reason` + test docstring rewritten to match the
  refined understanding — old wording blamed the
  non-main-thread path, new wording points at the
  `epoll_wait` wedge + cross-refs the new conc-anal doc
- `_subint_forkserver` module docstring's
  `child_sigint='trio'` bullet updated: now notes trio's
  handler is already correctly installed, so the flag may
  end up a no-op / doc-only mode once the real root cause
  is fixed

Closing the gap aligns with existing design intent (making
the already-designed "KBI-as-OS-cancel" behavior actually
fire); it is not a new feature.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-23 09:31:32 -04:00
5 changed files with 478 additions and 41 deletions

View File

@ -0,0 +1,385 @@
# `subint_forkserver` backend: orphaned-subactor SIGINT wedged in `epoll_wait`
Follow-up to the Phase C `subint_forkserver` spawn-backend
PR (see `tractor.spawn._subint_forkserver`, issue #379).
Surfaced by the xfail'd
`tests/spawn/test_subint_forkserver.py::test_orphaned_subactor_sigint_cleanup_DRAFT`.
Related-but-distinct from
`subint_cancel_delivery_hang_issue.md` (orphaned-channel
park AFTER subint teardown) and
`subint_sigint_starvation_issue.md` (GIL-starvation,
SIGINT never delivered): here the SIGINT IS delivered,
trio's handler IS installed, but trio's event loop never
wakes — so the KBI-at-checkpoint → `_trio_main` catch path
(which is the runtime's *intentional* OS-cancel design)
never fires.
## TL;DR
When a `subint_forkserver`-spawned subactor is orphaned
(parent `SIGKILL`'d, no IPC cancel path available) and then
externally `SIGINT`'d, the subactor hangs in
`trio/_core/_io_epoll.py::get_events` (epoll_wait)
indefinitely — even though:
1. `threading.current_thread() is threading.main_thread()`
post-fork (CPython 3.14 re-designates correctly).
2. Trio's SIGINT handler IS installed in the subactor
(`signal.getsignal(SIGINT)` returns
`<function KIManager.install.<locals>.handler at 0x...>`).
3. The kernel does deliver SIGINT — the signal arrives at
the only thread in the process (the fork-inherited
worker which IS now "main" per Python).
Yet `epoll_wait` does not return. Trio's wakeup-fd mechanism
— the machinery that turns SIGINT into an epoll-wake — is
somehow not firing the wakeup. Until that's fixed, the
intentional "KBI-as-OS-cancel" path in
`tractor/spawn/_entry.py::_trio_main:164` is unreachable
for forkserver-spawned subactors whose parent dies.
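As background, the wakeup-fd pipeline this bug defeats can be exercised in isolation with just the stdlib. A minimal sketch, assuming Linux and a main-thread caller (trio's `KIManager` wires up the equivalent internally):

```python
import os, select, signal

# CPython's C-level signal trampoline writes the signal number
# (as one byte) into the wakeup fd; an epoll instance watching
# that fd turns the write into an event-loop wake.
r, w = os.pipe()
os.set_blocking(w, False)  # `set_wakeup_fd` requires a non-blocking fd
signal.set_wakeup_fd(w)
signal.signal(signal.SIGINT, lambda *a: None)  # keep Python-level KBI out of the way

ep = select.epoll()
ep.register(r, select.EPOLLIN)

os.kill(os.getpid(), signal.SIGINT)
events = ep.poll(timeout=2.0)  # returns promptly: the fd woke epoll
signum = os.read(r, 1)[0]
print('woken:', bool(events), 'signum:', signum)  # woken: True signum: 2
```

This is the kernel → handler → wakeup-fd → epoll-wake chain that the stuck subactor never completes.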
## Symptom
Test: `tests/spawn/test_subint_forkserver.py::test_orphaned_subactor_sigint_cleanup_DRAFT`
(currently marked `@pytest.mark.xfail(strict=True)`).
1. Harness subprocess brings up a tractor root actor +
one `run_in_actor(_sleep_forever)` subactor via
`try_set_start_method('subint_forkserver')`.
2. Harness prints `CHILD_PID` (subactor) and
`PARENT_READY` (root actor) markers to stdout.
3. Test `os.kill(parent_pid, SIGKILL)` + `proc.wait()`
to fully reap the root-actor harness.
4. Child (now reparented to pid 1) is still alive.
5. Test `os.kill(child_pid, SIGINT)` and polls
`os.kill(child_pid, 0)` for up to 10s.
6. **Observed**: the child is still alive at deadline —
SIGINT did not unwedge the trio loop.
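Step 5's existence-poll pattern can be sketched as a standalone helper (hypothetical `wait_for_exit`, not the test's actual code):

```python
import os, time

def wait_for_exit(pid: int, timeout: float = 10.0) -> bool:
    # `os.kill(pid, 0)` delivers no signal; it only probes
    # whether `pid` still names a live (unreaped) process.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)
        except ProcessLookupError:
            return True   # process is gone
        time.sleep(0.1)
    return False          # still alive at the deadline
```

The observed failure is this helper returning `False` for the SIGINT'd orphan.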
## What the "intentional" cancel path IS
`tractor/spawn/_entry.py::_trio_main:157-186`
```python
try:
    if infect_asyncio:
        actor._infected_aio = True
        run_as_asyncio_guest(trio_main)
    else:
        trio.run(trio_main)

except KeyboardInterrupt:
    logmeth = log.cancel
    exit_status: str = (
        'Actor received KBI (aka an OS-cancel)\n'
        ...
    )
```
The "KBI == OS-cancel" mapping IS the runtime's
deliberate, documented design. An OS-level SIGINT should
flow as: kernel → trio handler → KBI at trio checkpoint
→ unwinds `async_main` → surfaces at `_trio_main`'s
`except KeyboardInterrupt:` → `log.cancel` + clean `rc=0`.
**So fixing this hang is not "add a new SIGINT behavior" —
it's "make the existing designed behavior actually fire in
this backend config".** That's why option (B) ("fix root
cause") is aligned with existing design intent, not a
scope expansion.
## Evidence
### Positive control: standalone fork-from-worker + `trio.run(sleep_forever)` + SIGINT WORKS
```python
import os, signal, time, trio
from tractor.spawn._subint_forkserver import (
    fork_from_worker_thread, wait_child,
)

def child_target() -> int:
    async def _main():
        try:
            await trio.sleep_forever()
        except KeyboardInterrupt:
            print('CHILD: caught KBI — trio SIGINT works!')
            return

    trio.run(_main)
    return 0

pid = fork_from_worker_thread(child_target, thread_name='trio-sigint-test')
time.sleep(1.0)
os.kill(pid, signal.SIGINT)
wait_child(pid)
```
Result: `CHILD: caught KBI — trio SIGINT works!` + clean
exit. So the fork-child + trio signal plumbing IS healthy
in isolation. The hang appears only with the full tractor
subactor runtime on top.
### Negative test: full tractor subactor + orphan-SIGINT
Equivalent to the xfail test. Traceback dump via
`faulthandler.register(SIGUSR1, all_threads=True)` at the
stuck moment:
```
Current thread 0x00007... [subint-forkserv] (most recent call first):
File ".../trio/_core/_io_epoll.py", line 245 in get_events
File ".../trio/_core/_run.py", line 2415 in run
File "tractor/spawn/_entry.py", line 162 in _trio_main
File "tractor/_child.py", line 72 in _actor_child_main
File "tractor/spawn/_subint_forkserver.py", line 650 in _child_target
File "tractor/spawn/_subint_forkserver.py", line 308 in _worker
File ".../threading.py", line 1024 in run
```
### Thread + signal-mask inventory of the stuck subactor
Single thread (`tid == pid`, comm `'subint-forkserv'`,
which IS `threading.main_thread()` post-fork):
```
SigBlk: 0000000000000000 # nothing blocked
SigIgn: 0000000001001000 # SIGPIPE etc (Python defaults)
SigCgt: 0000000108000202 # bit 1 = SIGINT caught
```
Bit 1 set in `SigCgt` → SIGINT handler IS installed. So
trio's handler IS in place at the kernel level — not a
"handler missing" situation.
### Handler identity
Inside the subactor's RPC body, `signal.getsignal(SIGINT)`
returns `<function KIManager.install.<locals>.handler at
0x...>` — trio's own `KIManager` handler. tractor's only
SIGINT touches are `signal.getsignal()` *reads* (to stash
into `debug.DebugStatus._trio_handler`); nothing writes
over trio's handler outside the debug-REPL shielding path
(`devx/debug/_tty_lock.py::shield_sigint`) which isn't
engaged here (no debug_mode).
## Ruled out
- **GIL starvation / signal-pipe-full** (class A,
`subint_sigint_starvation_issue.md`): subactor runs on
its own GIL (separate OS process), not sharing with the
parent → no cross-process GIL contention. And `strace`-
equivalent in the signal mask shows SIGINT IS caught,
not queued.
- **Orphaned channel park** (`subint_cancel_delivery_hang_issue.md`):
different failure mode — that one has trio iterating
normally and getting wedged on an orphaned
`chan.recv()` AFTER teardown. Here trio's event loop
itself never wakes.
- **Tractor explicitly catching + swallowing KBI**:
greppable — the one `except KeyboardInterrupt:` in the
runtime is the INTENTIONAL cancel-path catch at
`_trio_main:164`. `async_main` uses `except Exception`
(not BaseException), so KBI should propagate through
cleanly if it ever fires.
- **Missing `signal.set_wakeup_fd` (main-thread
restriction)**: post-fork, the fork-worker thread IS
`threading.main_thread()`, so trio's main-thread check
passes and its wakeup-fd install should succeed.
## Root cause hypothesis (unverified)
The SIGINT handler fires but trio's wakeup-fd write does
not wake `epoll_wait`. Candidate causes, ranked by
plausibility:
1. **Wakeup-fd lifecycle race around tractor IPC setup.**
`async_main` spins up an IPC server + `process_messages`
loops early. Somewhere in that path the wakeup-fd that
trio registered with its epoll instance may be
closed/replaced/clobbered, so subsequent SIGINT writes
land on an fd that's no longer in the epoll set.
Evidence needed: compare
`signal.set_wakeup_fd(-1)` return value inside a
post-tractor-bringup RPC body vs. a pre-bringup
equivalent. If they differ, that's it.
2. **Shielded cancel scope around `process_messages`.**
The RPC message loop is likely wrapped in a trio cancel
scope; if that scope is `shield=True` at any outer
layer, KBI scheduled at a checkpoint could be absorbed
by the shield and never bubble out to `_trio_main`.
3. **Pre-fork wakeup-fd inheritance.** trio in the PARENT
process registered a wakeup-fd with its own epoll. The
child inherits the fd number but not the parent's
epoll instance — if tractor/trio re-uses the parent's
stale fd number anywhere, writes would go to a no-op
fd. (This is the least likely — `trio.run()` on the
child calls `KIManager.install` which should install a
fresh wakeup-fd from scratch.)
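Hypotheses (1) and (3) share a mechanism the stdlib can demonstrate: an epoll registration follows the open file description, not the fd number, so a wakeup fd that gets closed and its number reused silently stops waking the loop. A sketch, assuming Linux:

```python
import os, select

r1, w1 = os.pipe()
ep = select.epoll()
ep.register(r1, select.EPOLLIN)

os.close(r1)        # the registration dies with the file description
os.close(w1)
r2, w2 = os.pipe()  # the kernel hands back the lowest free fd numbers
print('fd numbers reused:', (r2, w2) == (r1, w1))

os.write(w2, b'x')
events = ep.poll(timeout=0.2)
print('events:', events)  # [] since the write never reaches the epoll set
```

If something in tractor's IPC bringup clobbers the fd trio registered, SIGINT writes would land exactly like this: on a number epoll no longer watches.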
## Cross-backend scope question
**Untested**: does the same orphan-SIGINT hang reproduce
against the `trio_proc` backend (stock subprocess + exec)?
If yes → pre-existing tractor bug, independent of
`subint_forkserver`. If no → something specific to the
fork-from-worker path (e.g. inherited fds, mid-epoll-setup
interference).
**Quick repro for trio_proc**:
```python
# save as /tmp/trio_proc_orphan_sigint_repro.py
import os, sys, signal, time, glob
import subprocess as sp

SCRIPT = '''
import os, sys, trio, tractor

async def _sleep_forever():
    print(f"CHILD_PID={os.getpid()}", flush=True)
    await trio.sleep_forever()

async def _main():
    async with (
        tractor.open_root_actor(registry_addrs=[("127.0.0.1", 12350)]),
        tractor.open_nursery() as an,
    ):
        await an.run_in_actor(_sleep_forever, name="sf-child")
        print(f"PARENT_READY={os.getpid()}", flush=True)
        await trio.sleep_forever()

trio.run(_main)
'''

proc = sp.Popen(
    [sys.executable, '-c', SCRIPT],
    stdout=sp.PIPE, stderr=sp.STDOUT,
)
# parse CHILD_PID + PARENT_READY off proc.stdout ...
# SIGKILL parent, SIGINT child, poll.
```
If that hangs too, open a broader issue; if not, this is
`subint_forkserver`-specific (likely fd-inheritance-related).
## Why this is ours to fix (not CPython's)
- Signal IS delivered (`SigCgt` bitmask confirms).
- Handler IS installed (trio's `KIManager`).
- Thread identity is correct post-fork.
- `_trio_main` already has the intentional KBI→clean-exit
path waiting to fire.
Every CPython-level precondition is met. Something in
tractor's runtime or trio's integration with it is
breaking the SIGINT→wakeup→event-loop-wake pipeline.
## Possible fix directions
1. **Audit the wakeup-fd across tractor's IPC bringup.**
Add a trio startup hook that captures
`signal.set_wakeup_fd(-1)` at `_trio_main` entry,
after `async_main` enters, and periodically — assert
it's unchanged. If it moves, track down the writer.
2. **Explicit `signal.set_wakeup_fd` reset after IPC
setup.** Brute force: re-install a fresh wakeup-fd
mid-bringup. Band-aid, but fast to try.
3. **Ensure no `shield=True` cancel scope envelopes the
RPC-message-loop / IPC-server task.** If one does,
KBI-at-checkpoint never escapes.
4. **Once fixed, the `child_sigint='trio'` mode on
`subint_forkserver_proc`** becomes effectively a no-op
or a doc-only mode — trio's natural handler already
does the right thing. Might end up removing the flag
entirely if there's no behavioral difference between
modes.
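Direction (1)'s snapshot could use a set-then-restore round trip, since `signal.set_wakeup_fd()` returns the previously-installed fd. A sketch (hypothetical `probe_wakeup_fd` helper; must run on the main thread):

```python
import signal

def probe_wakeup_fd() -> int:
    # `set_wakeup_fd` returns the previously-set fd, so a
    # set-then-restore round trip reads the current value
    # without disturbing it.
    prev = signal.set_wakeup_fd(-1)
    if prev != -1:
        signal.set_wakeup_fd(prev)
    return prev

print('wakeup fd:', probe_wakeup_fd())  # -1 when none is installed
```

Capturing this at `_trio_main` entry, after `async_main` starts, and inside an RPC body would confirm or rule out the clobbered-fd hypothesis.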
## Current workaround
None; `child_sigint` defaults to `'ipc'` (IPC cancel is
the only reliable cancel path today), and the xfail test
documents the gap. Operators hitting orphan-SIGINT get a
hung process that needs `SIGKILL`.
## Reproducer
Inline, standalone (no pytest):
```python
# save as /tmp/orphan_sigint_repro.py (py3.14+)
import os, sys, signal, time, glob, trio
import tractor
from tractor.spawn._subint_forkserver import (
    fork_from_worker_thread,
)

async def _sleep_forever():
    print(f'SUBACTOR[{os.getpid()}]', flush=True)
    await trio.sleep_forever()

async def _main():
    async with (
        tractor.open_root_actor(
            registry_addrs=[('127.0.0.1', 12349)],
        ),
        tractor.open_nursery() as an,
    ):
        await an.run_in_actor(_sleep_forever, name='sf-child')
        await trio.sleep_forever()

def child_target() -> int:
    from tractor.spawn._spawn import try_set_start_method
    try_set_start_method('subint_forkserver')
    trio.run(_main)
    return 0

pid = fork_from_worker_thread(child_target, thread_name='repro')
time.sleep(3.0)

# find the subactor pid via /proc
children = []
for path in glob.glob(f'/proc/{pid}/task/*/children'):
    with open(path) as f:
        children.extend(int(x) for x in f.read().split() if x)
subactor_pid = children[0]

# SIGKILL root → orphan the subactor
os.kill(pid, signal.SIGKILL)
os.waitpid(pid, 0)
time.sleep(0.3)

# SIGINT the orphan — should cause clean trio exit
os.kill(subactor_pid, signal.SIGINT)

# poll for exit
for _ in range(100):
    try:
        os.kill(subactor_pid, 0)
        time.sleep(0.1)
    except ProcessLookupError:
        print('HARNESS: subactor exited cleanly ✔')
        sys.exit(0)

os.kill(subactor_pid, signal.SIGKILL)
print('HARNESS: subactor hung — reproduced')
sys.exit(1)
```
Expected (current): `HARNESS: subactor hung — reproduced`.
After fix: `HARNESS: subactor exited cleanly ✔`.
## References
- `tractor/spawn/_entry.py::_trio_main:157-186` — the
intentional KBI→clean-exit path this bug makes
unreachable.
- `tractor/spawn/_subint_forkserver` — the backend whose
orphan cancel-robustness this blocks.
- `tests/spawn/test_subint_forkserver.py::test_orphaned_subactor_sigint_cleanup_DRAFT`
— the xfail'd reproducer in the test suite.
- `ai/conc-anal/subint_cancel_delivery_hang_issue.md` —
  sibling "orphaned channel park" hang (different class).
- `ai/conc-anal/subint_sigint_starvation_issue.md` —
  sibling "GIL starvation SIGINT drop" hang (different
  class).
- tractor issue #379 — subint backend tracking.

View File

@ -43,6 +43,7 @@ Gating
from __future__ import annotations
from functools import partial
import os
from pathlib import Path
import platform
import select
import signal
@ -179,7 +180,8 @@ async def run_fork_in_non_trio_thread(
# inner `trio.fail_after()` so assertion failures fire fast
# under normal conditions.
@pytest.mark.timeout(30, method='thread')
def test_fork_from_worker_thread_via_trio() -> None:
def test_fork_from_worker_thread_via_trio(
) -> None:
'''
Baseline: inside `trio.run()`, call
`fork_from_worker_thread()` via `trio.to_thread.run_sync()`,
@ -447,19 +449,26 @@ def _process_alive(pid: int) -> bool:
@pytest.mark.xfail(
strict=True,
reason=(
'subint_forkserver orphan-child SIGINT gap: trio on the '
'fork-inherited non-main thread has no SIGINT wakeup-fd '
'handler installed, and CPython\'s default '
'KeyboardInterrupt delivery does NOT reach the trio '
'loop on this thread post-fork. Fix path TBD — see '
'TODO in `tractor.spawn._subint_forkserver`. Flip this '
'mark (or drop it) once the gap is closed.'
'subint_forkserver orphan-child SIGINT hang: trio\'s '
'event loop stays wedged in `epoll_wait` despite the '
'SIGINT handler being correctly installed and the '
'signal being delivered at the kernel level. NOT a '
'"handler missing on non-main thread" issue — post-'
'fork the worker IS `threading.main_thread()` and '
'trio\'s `KIManager` handler is confirmed installed. '
'Full analysis + ruled-out hypotheses + fix directions '
'in `ai/conc-anal/'
'subint_forkserver_orphan_sigint_hang_issue.md`. '
'Flip this mark (or drop it) once the gap is closed.'
),
)
@pytest.mark.timeout(60, method='thread')
@pytest.mark.timeout(
30,
method='thread',
)
def test_orphaned_subactor_sigint_cleanup_DRAFT(
reg_addr: tuple[str, int | str],
tmp_path,
tmp_path: Path,
) -> None:
'''
DRAFT orphaned-subactor SIGINT survivability under the
@ -478,15 +487,23 @@ def test_orphaned_subactor_sigint_cleanup_DRAFT(
5. Poll `os.kill(child_pid, 0)` for up to 10s; assert
the child exits.
Empirical result (2026-04): currently **FAILS** — the
"post-fork single-thread → default KeyboardInterrupt lands
on trio's thread" hypothesis from the class-A/B
conc-anal discussion turned out to be wrong. SIGINT on the
orphan child doesn't unwind the trio loop. Most likely
CPython delivers `KeyboardInterrupt` specifically to
`threading.main_thread()`, whose tstate is dead in the
post-fork child (fork inherited the worker thread, not the
original main thread). Marked `xfail(strict=True)` so the
Empirical result (2026-04, py3.14): currently **FAILS** —
SIGINT on the orphan child doesn't unwind the trio loop,
despite trio's `KIManager` handler being correctly
installed in the subactor (the post-fork thread IS
`threading.main_thread()` on py3.14). `faulthandler` dump
shows the subactor wedged in `trio/_core/_io_epoll.py::
get_events` — the signal's supposed wakeup of the event
loop isn't firing. Full analysis + diagnostic evidence
in `ai/conc-anal/
subint_forkserver_orphan_sigint_hang_issue.md`.
The runtime's *intentional* "KBI-as-OS-cancel" path at
`tractor/spawn/_entry.py::_trio_main:164` is therefore
unreachable under this backend+config. Closing the gap is
aligned with existing design intent (make the already-
designed behavior actually fire), not a new feature.
Marked `xfail(strict=True)` so the
mark flips to an XPASS failure once the gap is closed and
we'll know to drop the mark.
@ -552,7 +569,8 @@ def test_orphaned_subactor_sigint_cleanup_DRAFT(
# step 4+5: SIGINT the orphan, poll for exit.
os.kill(child_pid, signal.SIGINT)
cleanup_deadline: float = time.monotonic() + 10.0
timeout: float = 6.0
cleanup_deadline: float = time.monotonic() + timeout
while time.monotonic() < cleanup_deadline:
if not _process_alive(child_pid):
return # <- success path

View File

@ -69,6 +69,20 @@ from ._exceptions import (
logger = log.get_logger('tractor')
# Spawn backends under which `debug_mode=True` is supported.
# Requirement: the spawned subactor's root runtime must be
# trio-native so `tractor.devx.debug._tty_lock` works. Matches
# both the enable-site in `open_root_actor` and the cleanup-
# site reset of `_runtime_vars['_debug_mode']` — keep them in
# lockstep when adding backends.
_DEBUG_COMPATIBLE_BACKENDS: tuple[str, ...] = (
'trio',
# forkserver children run `_trio_main` in their own OS
# process — same child-side runtime shape as `trio_proc`.
'subint_forkserver',
)
# TODO: stick this in a `@acm` defined in `devx.debug`?
# -[ ] also maybe consider making this a `wrapt`-deco to
# save an indent level?
@ -293,10 +307,14 @@ async def open_root_actor(
)
loglevel: str = loglevel.upper()
# Debug-mode is currently only supported for backends whose
# subactor root runtime is trio-native (so `tractor.devx.
# debug._tty_lock` works). See `_DEBUG_COMPATIBLE_BACKENDS`
# module-const for the list.
if (
debug_mode
and
_spawn._spawn_method == 'trio'
_spawn._spawn_method in _DEBUG_COMPATIBLE_BACKENDS
):
_state._runtime_vars['_debug_mode'] = True
@ -318,7 +336,9 @@ async def open_root_actor(
elif debug_mode:
raise RuntimeError(
"Debug mode is only supported for the `trio` backend!"
f'Debug mode currently supported only for '
f'{_DEBUG_COMPATIBLE_BACKENDS!r} spawn backends, not '
f'{_spawn._spawn_method!r}.'
)
assert loglevel
@ -619,7 +639,7 @@ async def open_root_actor(
if (
debug_mode
and
_spawn._spawn_method == 'trio'
_spawn._spawn_method in _DEBUG_COMPATIBLE_BACKENDS
):
_state._runtime_vars['_debug_mode'] = False

View File

@ -870,7 +870,14 @@ class Actor:
accept_addrs: list[UnwrappedAddress]|None = None
if self._spawn_method in ("trio", "subint"):
if self._spawn_method in (
'trio',
'subint',
# `subint_forkserver` parent-side sends a
# `SpawnSpec` over IPC just like the other two
# — fork child-side runtime is trio-native.
'subint_forkserver',
):
# Receive post-spawn runtime state from our parent.
spawnspec: msgtypes.SpawnSpec = await chan.recv()

View File

@ -67,12 +67,15 @@ Still-open work (tracked on tractor #379):
to `tests/test_subint_cancellation.py` for the plain
`subint` backend),
- `child_sigint='trio'` mode (flag scaffolded below; default
is `'ipc'`): install a manual SIGINT trio-cancel bridge
in the fork-child prelude so externally-delivered SIGINT
reaches the child's trio loop even when the parent is
dead (no IPC cancel path). See the xfail'd
`test_orphaned_subactor_sigint_cleanup_DRAFT` for the
target behavior.
is `'ipc'`). Originally intended as a manual SIGINT
trio-cancel bridge, but investigation showed trio's handler
IS already correctly installed in the fork-child subactor —
the orphan-SIGINT hang is actually a separate bug where
trio's event loop stays wedged in `epoll_wait` despite
delivery. See
`ai/conc-anal/subint_forkserver_orphan_sigint_hang_issue.md`
for the full trace + fix directions. Once that root cause
is fixed, this flag may end up a no-op / doc-only mode.
- child-side "subint-hosted root runtime" mode (the second
half of the envisioned arch; currently the forked child
runs plain `_trio_main` via `spawn_method='trio'`; the
@ -588,11 +591,11 @@ async def subint_forkserver_proc(
)
if child_sigint == 'trio':
raise NotImplementedError(
f"`child_sigint='trio'` mode — trio-native SIGINT "
f"plumbing in the fork-child — is scaffolded but "
f"not yet implemented. See the xfail'd "
f"`test_orphaned_subactor_sigint_cleanup_DRAFT` "
f"and the TODO in this module's docstring."
"`child_sigint='trio'` mode — trio-native SIGINT "
"plumbing in the fork-child — is scaffolded but "
"not yet implemented. See the xfail'd "
"`test_orphaned_subactor_sigint_cleanup_DRAFT` "
"and the TODO in this module's docstring."
)
uid: tuple[str, str] = subactor.aid.uid
@ -652,12 +655,16 @@ async def subint_forkserver_proc(
loglevel=loglevel,
parent_addr=parent_addr,
infect_asyncio=infect_asyncio,
# NOTE, from the child-side runtime's POV it's
# a regular trio actor — it uses `_trio_main`,
# receives `SpawnSpec` over IPC, etc. The
# `subint_forkserver` name is a property of HOW
# the parent spawned, not of what the child is.
spawn_method='trio',
# The child's runtime is trio-native (uses
# `_trio_main` + receives `SpawnSpec` over IPC),
# but label it with the actual parent-side spawn
# mechanism so `Actor.pformat()` / log lines
# reflect reality. Downstream runtime gates that
# key on `_spawn_method` group `subint_forkserver`
# alongside `trio`/`subint` where the SpawnSpec
# IPC handshake is concerned — see
# `runtime._runtime.Actor._from_parent()`.
spawn_method='subint_forkserver',
)
return 0