690 Commits (ab1985046938ba35f5ea8b1e6ca026257e9c99e6)

Author SHA1 Message Date
Gud Boi d327d2b35e Harden `test_infected_asyncio` for fork spawners
Deats,
- `test_echoserver_detailed_mechanics`: add `is_forking_spawner`
  param, wrap `main()` in `fa_main()` with per-backend
  `trio.fail_after` (4s fork / 1s trio) to cap cancel-cascade
  teardown that compounds under forkserver.
- `test_sigint_closes_lifetime_stack`: swap `start_method` param
  for `is_forking_spawner`, pre-init `tmp_file`/`ctx` to `None` so
  KBI firing before `open_context` body doesn't `UnboundLocalError`,
  add `pytest.fail` guard for the spawn-time IPC race case, arm
  `signal.alarm` AFK-safety cap (10s) under fork backends.
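
The pre-init point above can be shown in isolation. This is a minimal sketch, not the test's code: `run()` and its flags are hypothetical names, and the only claim is the standard Python behavior that reading a never-assigned local raises `UnboundLocalError`.

```python
def run(interrupt_early: bool, preinit: bool):
    '''
    Toy stand-in for a test body whose teardown reads `tmp_file`.

    `interrupt_early` simulates a KBI landing before the assignment.
    '''
    if preinit:
        tmp_file = None  # the guard: bind the name up front

    try:
        if interrupt_early:
            raise KeyboardInterrupt
        tmp_file = '/tmp/example'  # illustrative value only
    except KeyboardInterrupt:
        pass

    # without the pre-init this read fails when the KBI came early
    return tmp_file
```

With `preinit=True` an early interrupt yields `None`, which teardown code can check for; with `preinit=False` the same path raises `UnboundLocalError` and masks the original interrupt.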

Also,
- `pytestmark`: add `track_orphaned_uds_per_test` +
  `detect_runaway_subactors_per_test` fixtures.
- `delay()`: hardcode `return 1e3` at top (debug override still in
  place).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 7ee0dc2e8f)
2026-06-09 20:28:05 -04:00
Gud Boi 712c0ff794 Adjust `test_streaming_to_actor_cluster` timeout
For forking spawner backends that is.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b10011a36e)
2026-06-09 20:28:05 -04:00
Gud Boi 22ac8ca17f Enrich `pytestmark` in `test_inter_peer_cancellation`
- `skipon_spawn_backend('subint')`: expand reason with specific
  analysis doc refs + GH issue #379 umbrella link.
- add `track_orphaned_uds_per_test` fixture via `usefixtures` to
  blame-attribute UDS sock-file orphans left by SIGKILL cancel
  cascades.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 7d0a53d205)
2026-06-09 20:28:05 -04:00
Gud Boi 981c174b80 Adjust `test_simple_context` timeout for forking spawner
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 75d5b4cf7b)
2026-06-09 20:28:05 -04:00
Gud Boi cbf54db96a Add `set_fork_aware_capture`, timeout to msg tests
- `test_ext_types_over_ipc`: wrap `main()` in `fa_main()` with
  `trio.fail_after(2)` + commented `capfd.disabled()` investigation
  (pytest#14444).
- `test_basic_payload_spec`: add fixture param with note on fork-spawner
  hang prevention.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 8aa07a7932)
2026-06-09 20:28:05 -04:00
Gud Boi 17f725ba6f Add signal-alarm guard to `test_dynamic_pub_sub`
Outer `signal.alarm` cap that fires even when trio's
`fail_after` is blocked by a shielded-await deadlock
(the bug-class-3 hang under MTF backends). Only armed
for fork-based spawners where the bug lives.

Deats,
- `_DIAG_CAP_S = fail_after_s + 5` — slightly larger than the
  trio-native guard so it always loses when the in-band path works.
- `test_log.cancel()` breadcrumbs at each cancel-scope boundary so the
  last-fired breadcrumb names the swallow point on hang.
- try/finally wrapping around each scope level for deterministic
  breadcrumb emission.
- add `is_forking_spawner`, `set_fork_aware_capture` fixture params.
- rework `fail_after_s`: 4s for fork, 12s for trio (was 30/12).
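
A minimal sketch of the outer-cap idea, assuming a POSIX host and that the code runs in the main thread. `alarm_cap` and `HangDetected` are hypothetical names and not the helpers used in the test; only `signal.alarm()` / `signal.signal()` are real stdlib calls.

```python
import signal
from contextlib import contextmanager


class HangDetected(Exception):
    '''Raised from the SIGALRM handler when the outer cap expires.'''


@contextmanager
def alarm_cap(seconds: int):
    '''
    Arm a wall-clock SIGALRM cap around a block (hypothetical helper).

    `signal.alarm()` only accepts whole seconds and the handler only
    runs in the main thread.
    '''
    def _on_alarm(signum, frame):
        raise HangDetected(f'no progress after {seconds}s')

    prev_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)  # disarm any still-pending alarm
        signal.signal(signal.SIGALRM, prev_handler)


fail_after_s = 4               # in-band guard for fork backends
diag_cap_s = fail_after_s + 5  # outer cap, sized to lose the race
```

Because the outer cap is larger than the in-band one, it should only ever fire when the in-band timeout failed to unwind the block.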

Also,
- `test_sigint_both_stream_types`: `assert 0` -> `pytest.fail()`, add
  TODO re `pytest.raises()`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 10db117864)
2026-06-09 20:28:05 -04:00
Gud Boi 30591e79c2 Add boot-race conc-anal, widen `xfail` to `n_dups=8`
New `ai/conc-anal/spawn_time_boot_death_dup_name_issue.md`
documenting the spawn-time rc=2 race under rapid
same-name spawning against a forkserver + registrar
— the `wait_for_peer_or_proc_death` helper now surfaces
the death instead of parking forever on the handshake
wait.

Also,
- extract inline `xfail` into module-level
  `_DOGGY_BOOT_RACE_XFAIL` marker.
- apply it to `n_dups=8` too (previously bare) bc
  larger N widens the race window enough to fire
  occasionally.
- link to tracking issue #456.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 92443dc4ef)
2026-06-09 20:28:05 -04:00
Gud Boi fd886c0cef Adjust legacy streaming test timeouts for fork+UDS
Forking spawner + UDS transport has different timing
vs `trio_proc` — streaming example completes faster
in some cases, slower in others depending on fork
overhead + sock setup.

Deats,
- add `expect_cancel` param to `cancel_after()`, raise
  `ActorTooSlowError` when cancel scope fires unexpectedly instead of
  silently returning `None`.
- `time_quad_ex` fixture: bump timeout +1 for forking+UDS, explicit
  `ActorTooSlowError` on `None` result instead of bare `assert results`.
- `test_not_fast_enough_quad`: `xfail` for forking+UDS being "too fast"
  (cancel doesn't fire bc streaming finishes before delay).
- add `is_forking_spawner`, `tpt_proto` fixture params throughout.

Also,
- `_testing/pytest.py`: widen `start_method` parametrize and
  `is_forking_spawner` fixture to `scope='session'`.
- `"""` -> `'''` docstring style throughout.
- hoist `_non_linux` to module scope (was redefined locally in two
  places).
- type hints, kwarg-style `partial()` calls.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit d3cbc92751)
2026-06-09 20:28:05 -04:00
Gud Boi 81306f40dc Harden `test_registrar` with reap fixtures, timeouts
Add module-level `pytestmark` applying per-test
`reap_subactors_per_test`, `track_orphaned_uds_per_test`, and
`detect_runaway_subactors_per_test` fixtures — registrar tests stress
discovery roundtrips that historically left orphaned UDS sock-files.

Deats,
- drop unused `say_hello()` fn, keep only `say_hello_use_wait`;
  rename param `func` -> `ria_fn`.
- use `@tractor_test(timeout=7)` instead of separate
  `@pytest.mark.timeout(7, method='thread')` decorator.
- add `with_timeout()` helper, wire into
  `test_subactors_unregister_on_cancel_remote_daemon`.
- uncomment `_timeout_main()` in `test_stale_entry_is_deleted`, use
  configurable `timeout` var + `debug_mode` guard for `tractor.pause()`
  on cancel.
- `dump_on_hang(seconds=timeout*2)` instead of hardcoded `20`.
- fix typo "oustanding" -> "outstanding".

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit abd3950ba6)
2026-06-09 20:28:05 -04:00
Gud Boi 5b699be6e6 Add per-actor `setproctitle` via `devx._proctitle`
New `tractor.devx._proctitle` mod sets each
sub-actor's `argv[0]` (and kernel `comm`) to
`tractor[<aid.reprol()>]` — e.g.
`tractor[doggy@1027301b]` — so `ps`/`top`/`htop`
and `acli.pytree`/reaper tooling can identify
actors at a glance without parsing full cmdlines.

Deats,
- `set_actor_proctitle()` wraps the `setproctitle`
  pkg with `ImportError` guard; optional at runtime
  but listed in `pyproject.toml` so default installs
  benefit.
- called early in `_child._actor_child_main()` after
  `Actor` construction, before `_trio_main()` entry.
- tests in `tests/devx/test_proctitle.py`: format
  unit test, `/proc/{cmdline,comm}` integration
  test, negative detection test.
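
A rough sketch of the title format plus the `ImportError` guard, under the assumption that the actor's short repr is already available as a plain `str`. `format_proctitle()` and `try_set_proctitle()` are hypothetical names, not the module's actual API; `setproctitle.setproctitle()` is the real call from the `setproctitle` pkg.

```python
def format_proctitle(reprol: str) -> str:
    '''Render the `tractor[<reprol>]` title format described above.'''
    return f'tractor[{reprol}]'


def try_set_proctitle(reprol: str) -> bool:
    '''
    Set the process title when `setproctitle` is installed.

    Returns `False` and does nothing when the pkg is missing.
    '''
    try:
        from setproctitle import setproctitle
    except ImportError:
        return False

    setproctitle(format_proctitle(reprol))
    return True
```

Returning a bool instead of raising keeps the dependency optional at runtime, which matches the intent stated above.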

Resolves #457

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit d60245777e)
2026-06-09 20:28:05 -04:00
Gud Boi 15ecc79e37 Add dup-name cancel-cascade escalation test
Extend `test_register_duplicate_name` w/ cancel-level log
breadcrumbs and `try/finally` for better diag on the cancel-cascade
hang.

Add `test_dup_name_cancel_cascade_escalates_to_hard_kill` as a
regression test for the TCP+MTF duplicate-name cancel-cascade
deadlock. Spawns N same-name actors, calls `an.cancel()`, and
asserts teardown completes within a `trio.fail_after()` budget that
scales w/ `n_dups`.

Deats,
- parametrize `n_dups` (2, 4, 8) to widen the race window for
  concurrent `register_actor` RPCs.
- `n_dups=4` xfail'd — exposes a separate boot-race bug (doggy
  `rc=2` under rapid same-name spawn), tracked in #456.
- post-teardown asserts all `Portal` chans disconnect, verifying
  hard-kill escalation worked.

Relates to https://github.com/goodboy/tractor/issues/456

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit caebf60f4e)
2026-06-09 20:28:05 -04:00
Gud Boi 5b1dc1deb7 Add `enable_transports`/`registry_addrs` proto guard
Raise `ValueError` from `open_root_actor()` when any
`registry_addrs` entry uses a transport proto not in
`enable_transports` — historically this caused a
silent indefinite hang during the registrar handshake
(the actor could never connect to register/discover).
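
The shape of such a guard might look like the sketch below. It assumes each registry addr can be reduced to a `(proto, address)` pair, which is a simplification for illustration; `check_registry_protos()` is a hypothetical name and the real check inside `open_root_actor()` may differ.

```python
def check_registry_protos(
    registry_addrs: list,
    enable_transports: list,
) -> None:
    '''
    Fail fast when a registry addr needs a transport that is not
    enabled, instead of hanging later in the registrar handshake.
    '''
    enabled = set(enable_transports)
    mismatched = [
        (proto, addr)
        for proto, addr in registry_addrs
        if proto not in enabled
    ]
    if mismatched:
        raise ValueError(
            f'registry_addrs use transports not in '
            f'enable_transports={sorted(enabled)!r}: {mismatched!r}'
        )
```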

Also,
- update `test_root_passes_tpt_to_sub` to detect a
  proto mismatch between parametrized `tpt_proto_key`
  and CLI `tpt_proto`, asserting the new guard raises
  `ValueError` with expected msg content.
- replace old commented-out notes with a clearer
  explanation of the mismatch foot-gun.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit d036ef7d7f)
2026-06-09 20:28:04 -04:00
Gud Boi e7b577bbf6 Mv `daemon` + `test_multi_program` to `discovery/`
All `daemon` fixture consumers are discovery-
protocol tests now living under `tests/discovery/`.
Move the fixture, its `_wait_for_daemon_ready`
helper, and `test_multi_program.py` into that subdir
so scope matches usage.

Also,
- add `pytestmark` for `track_orphaned_uds_per_test`
  + `detect_runaway_subactors_per_test` to `test_multi_program` as a
  regression net.
- drop now-unused `_PROC_SPAWN_WAIT` + `socket` import from root
  conftest.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c4082be876)
2026-06-09 20:28:04 -04:00
Gud Boi 5a9ae54064 Replace sleep with active poll in `daemon` fixture
First draft at resolving,
https://github.com/goodboy/tractor/issues/424

`tests/conftest.py::daemon()` previously used a blind
`time.sleep(_PROC_SPAWN_WAIT + uds_bonus + ci_bonus)` to "wait for the
daemon to come up" before yielding the proc to the test.

Two problems:

1. **Racy under load** — sleep is fixed at design time; loaded boxes
   / cold starts / fork-spawn cost spikes blow past it, leading to
   `ConnectionRefusedError` / `OSError: connect failed` flakes in
   `test_register_duplicate_name`.

2. **Wasteful when daemon comes up fast** — happy-path pays the FULL
   sleep regardless. ~3s of dead time per fixture invocation, ~10-20s
   per full suite run.

Replace with `_wait_for_daemon_ready()` — active poll via stdlib
`socket.create_connection` (TCP) or `socket.connect` (UDS) on the
daemon's bind addr, with 50ms backoff and a 10s/15s deadline (CI gets
extra headroom). Daemon-died-during-startup early-exit catches the case
where `_PROC_SPAWN_WAIT` was silently masking daemon startup crashes.
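
A minimal sketch of this kind of active poll, using only stdlib `socket`. `wait_for_listener()` and its `proc_alive` callback are hypothetical stand-ins (not the fixture's actual helper), and the fixed 50ms retry interval is an assumption about what "backoff" means here.

```python
import socket
import time


def wait_for_listener(
    addr,
    deadline_s: float = 10.0,
    poll_s: float = 0.05,
    proc_alive=lambda: True,
) -> None:
    '''
    Poll until something accepts connections on `addr`.

    A `(host, port)` tuple is probed over TCP, a `str` path over UDS.
    `proc_alive` should return `False` once the daemon has exited.
    '''
    end = time.monotonic() + deadline_s
    while True:
        if not proc_alive():
            raise RuntimeError('daemon died during startup')
        try:
            if isinstance(addr, tuple):
                sock = socket.create_connection(addr, timeout=poll_s)
            else:
                sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
                try:
                    sock.settimeout(poll_s)
                    sock.connect(addr)
                except OSError:
                    sock.close()
                    raise
            sock.close()  # listener is up, the probe conn is disposable
            return
        except OSError:
            if time.monotonic() >= end:
                raise TimeoutError(f'nothing listening on {addr!r}')
            time.sleep(poll_s)
```

The happy path returns as soon as the first connect succeeds, so a fast daemon costs roughly one connect attempt rather than a fixed sleep.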

Why stdlib `socket` (Option 2 from the conc-anal doc) instead of
`tractor`'s own `_root.ping_tpt_socket` closure or trio?

- `tractor.run_daemon()` doesn't return from bootstrap until the runtime
  is fully ready to handle IPC, so probing listen-side acceptance is
  sufficient.
- no need to do the full IPC handshake just to validate readiness.
  Sidesteps the `trio.run()` bootstrap cost (~50ms) per fixture too.

`claude`'s verification: 10/10 runs of `tests/test_multi_program.py`
pass on both `--tpt-proto=tcp` and `--tpt-proto=uds`. Per-test wall-time
`test_register_duplicate_name`: 4.31s → 1.10s. Full file: ~12s → 3.27s
per transport.

Doc-tracked at:
`ai/conc-anal/test_register_duplicate_name_daemon_connect_race_issue.md`

Future work — session-scoped trio runtime in a bg thread to share
fixture-side trio operations across many fixtures (currently overkill
for the one fixture that needs it).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit ec8c4659c4)
2026-06-09 20:28:04 -04:00
Gud Boi bdde1afaae Harden `test_debugger` for forkserver spawners
Use `is_forking_spawner` fixture + gate spawner-
specific expect patterns in nested-error and daemon
tests. Add `set_fork_aware_capture` to multi-sub
tests that need capture-mode awareness.

Deats,
- replace `start_method` param with `is_forking_spawner` bool fixture.
- bump inter-send delay to 0.1s for IPC stability under fork backends.
- gate `bdb.BdbQuit` + relay-uid patterns behind `not
  is_forking_spawner` (not visible under capsys).
- add `expect(child, EOF)` to confirm clean exit.
- switch caught exc from `AssertionError` to `ValueError` in daemon
  test.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9031605807)
2026-06-09 20:28:04 -04:00
Gud Boi 45c917c452 Drop global mutation of `_PROC_SPAWN_WAIT`
In top level `daemon`-fixture that is..

Use a local `bg_daemon_spawn_delay` instead of
mutating the module-level `_PROC_SPAWN_WAIT` —
previously each `daemon` fixture invocation would
permanently add 1.6s (UDS) or 1s (CI) to the
global, inflating delays across the session.

Also, emit a `test_log.warning()` when verbose
loglevel is silently reduced to `'info'`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c4885f9d99)
2026-06-09 20:28:04 -04:00
Gud Boi 4f042ded23 Add `tractor.trionics.patches` subpkg + first fix
With a seminal patch fixing `trio`'s `WakeupSocketpair.drain()` which
can busy-loop due to lack of handling `EOF`.

New `tractor.trionics.patches` subpkg housing defensive monkey-patches
for upstream `trio` bugs we've encountered while running `tractor`,
particularly (as of late) fork-survival edge cases that haven't been
filed/fixed upstream yet. Each patch is idempotent, version-gated via
`is_needed()`, and carries a `# REMOVE WHEN:` marker pointing at the
upstream release whose adoption allows deletion.

Subpkg layout + per-patch contract documented in
`tractor/trionics/patches/README.md` — `apply()` / `is_needed()`
/ `repro()` API, registry pattern via `_PATCHES` in `__init__.py`,
single-call entry point `apply_all()`.

First patch, `_wakeup_socketpair`:
- `trio`'s `WakeupSocketpair.drain()` loops on `recv(64KB)` and exits
  ONLY on `BlockingIOError`, NEVER on `recv() == b''` (peer-closed FIN).
- under `fork()`-spawning backends the COW-inherited socketpair fds
  & `_close_inherited_fds()` teardown can leave a `WakeupSocketpair`
  instance whose write-end is closed, and `drain()` then **spins forever
  in C with no Python checkpoints**.
- this burns 100% CPU and allows no signal delivery.

Standalone repro:

    from trio._core._wakeup_socketpair import WakeupSocketpair
    ws = WakeupSocketpair()
    ws.write_sock.close()
    ws.drain()  # spins forever

Patch is one-line — break the drain loop on b'' EOF.
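
The EOF handling can be shown on a plain `socket.socketpair()` without touching `trio` internals. This is an illustration of the technique, not the patched method itself; the standalone `drain()` function below is a hypothetical equivalent of the loop described above.

```python
import socket


def drain(sock: socket.socket) -> int:
    '''
    Read until the socket would block OR the peer has closed.

    Returns the number of bytes discarded.
    '''
    total = 0
    try:
        while True:
            data = sock.recv(2**16)
            if not data:
                # b'' means EOF; without this break the loop never ends
                break
            total += len(data)
    except BlockingIOError:
        pass
    return total


reader, writer = socket.socketpair()
reader.setblocking(False)
writer.send(b'x' * 3)
writer.close()  # same state as the repro: write-end closed
```

Calling `drain(reader)` here returns after consuming the buffered bytes; removing the `if not data` check reproduces the endless loop, since a non-blocking `recv()` on a peer-closed socket keeps returning `b''` and never raises `BlockingIOError`.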

Manifested as two distinct test failures:

- `tests/test_multi_program.py::test_register_duplicate_name` hung at
  100% CPU on the busy-loop directly (fork child's worker thread)
- `tests/test_infected_asyncio.py::test_aio_simple_error` Mode-A
  deadlock — busy-loop wedged trio's scheduler inside `start_guest_run`,
  both threads parked in `epoll_wait`, no TCP connect-back to parent
  ever happened.

Same patch fixes both. Restored 99.7% pass rate on full
suite under `--spawn-backend=main_thread_forkserver`
(was hanging indefinitely before).

Wired into `tractor._child._actor_child_main` via `apply_all()` BEFORE
any trio runtime init. Harmless on non-fork backends.

Conc-anal write-ups, including strace + py-spy evidence:

- `ai/conc-anal/trio_wakeup_socketpair_busy_loop_under_fork_issue.md`
- `ai/conc-anal/infected_asyncio_under_main_thread_forkserver_hang_issue.md`

Regression tests in `tests/trionics/test_patches.py`: each test asserts
(a) the bug exists pre-patch (or is fixed upstream — skip cleanly), (b)
the patch fixes it, with a SIGALRM wall-clock cap so a regression fails
loudly instead of hanging silently.

TODO:
- [ ] file the upstream `python-trio/trio` issue + PR.
- [ ] use the `repro()` callable in `_wakeup_socketpair.py` as the issue
      body's evidence section.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 0ef549fadb)
(factored: dropped spawn-backend-only paths: ai/conc-anal/infected_asyncio_under_main_thread_forkserver_hang_issue.md)
2026-06-09 20:28:04 -04:00
Gud Boi f7c048e535 Adjust `test_shield_pause` for capsys backends
Under `main_thread_forkserver` the bootstrapping
hook switches to `--capture=sys`, so subactor
fd-level output (tree dumps, zombie-reaper msgs)
isn't captured per-test by pexpect. Gate those
expects behind a `no_capfd` check so the test
passes on both capture modes.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 5a9926fc32)
2026-06-09 20:28:04 -04:00
Gud Boi 4d8e67bd7f Default `--ll` to `None` in test harness
Only override `tractor.log._default_loglevel` when
the flag is explicitly passed — lets per-spawn and
per-example `loglevel` kwargs take effect instead
of being clobbered by the hard-coded `'ERROR'`
default.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 72a0465c52)
2026-06-09 20:28:04 -04:00
Gud Boi f18cb0e033 Update debug examples + harden `test_debugger`
Pass explicit `loglevel` to `spawn()` calls in
`test_debugger` tests — required for pexpect
pattern matching now that examples no longer
hard-code log levels.

Also,
- make `expect()` return the decoded `before` str.
- add `start_method` param + fork-backend timeout
  slack (+4s) in nested-error test.
- clean up debug examples: drop unused loglevels,
  rename `n` -> `an`, fix docstrings, add TODO
  comments for tpt parametrize via osenv.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9431a81d37)
2026-06-09 20:28:04 -04:00
Gud Boi 49a397d6d9 Update `sync_bp` + tighten `test_pause_from_sync`
Add `disable_pdbp_color()` to the `sync_bp` example
to suppress pygments prompt coloring when
`PYTHON_COLORS=0` — makes pexpect pattern matching
deterministic.

Deats,
- set `loglevel='pdb'` in both script + test spawn.
- disable `enable_stack_on_sig` in example, assert
  no `stackscope` output in test.
- update `attach_patts` keys/values with `|_<Task`
  / `|_<Thread` / `|_('subactor'` prefixes to match
  actual tree-dump format.
- add call-site patterns (`tractor.pause_from_sync()`,
  `tractor.pause()`, `breakpoint(hide_tb=...)`).
- trim trailing `\n` from `Lock.repr()` output.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit fc2e298a29)
2026-06-09 20:28:04 -04:00
Gud Boi 0df90500fa Fix `SIGUSR1` tree-dump ordering in `_stackscope`
Factor the sub-actor relay loop out of
`dump_tree_on_sig()` into `_relay_sig_to_subactors()`
and chain both dump + relay in a single
`run_sync_soon` callback (`_dump_then_relay`) so the
parent's task-tree flushes BEFORE any sub receives
the signal — fixes a hierarchical-ordering race
where subs could dump ahead of the parent in the
muxed pty stream.

Also,
- gate file/tty sink writes behind `write_file` +
  `write_tty` params on `dump_task_tree()`.
- use `actor.aid.uid` instead of deprecated `.uid`.
- update `test_shield_pause` expects to match the
  new sequential parent -> relay-log -> sub ordering.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit e2b790a70d)
2026-06-09 20:28:04 -04:00
Gud Boi 363d11b89c Add `pytest_load_initial_conftests()` for `--capture=`
Move `--capture=sys` enforcement from a static ini
flag to a `pytest_load_initial_conftests()` bootstrap
hook that dynamically flips capture mode only when a
fork-based spawner (like `main_thread_forkserver`) is
detected; non-fork backends keep `--capture=fd`.

Also,
- load `tractor._testing.pytest` via `-p` in ini
  (bc bootstrapping hooks must register before
  conftest `pytest_plugins` runs).
- register `_reap` as sub-plugin via `pytest_plugins`
  tuple in `._testing.pytest`.
- drop now-duplicate reap fixtures (already in `_reap`
  per 1cdc7fb3).
- rename `tractor_enable_stackscope` dest -> `enable_stackscope`
  and pop env var on disable.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 61d4525137)
2026-06-09 20:28:04 -04:00
Gud Boi d5b10b9e0c Allow per-call `start_method`/`loglevel` overrides
In `tests/devx/conftest.py::spawn`, refactor the
fixture-internal closures so consumer tests can pass
explicit `start_method`/`loglevel` to each `_spawn()`
invocation rather than only inheriting the fixture-
scoped parametrize values.

Deats,
- promote `set_spawn_method()` and `set_loglevel()`
  to take their respective values as fn params (vs
  closing over the fixture-scope vars).
- give `_spawn()` `start_method=start_method` and
  `loglevel: str|None = None` kwargs so callers
  override one-off without re-parametrizing the
  suite. NOTE: this drops the implicit fixture-
  scoped `loglevel` forward — `_spawn()` callers
  now must pass `loglevel=...` explicitly.
- TODO: figure out how `--ll <level>` should map to
  the default (currently `None` → uses env-var or
  tractor default).
- add a docstring to `_spawn()` so its role as the
  consumer-facing closure is obvious from `help()`.

Also,
- `assert_before()` now returns the `.before` output
  on success (was `None`); add a one-line docstring
  describing the new return contract.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 486249d74f)
2026-06-09 20:27:26 -04:00
Gud Boi 6835391c22 Drop test-local timeouts, +`sync_pause` to dev
In `pyproject.toml`,
- include the `sync_pause` group from `dev`, so dev
  installs ship `greenback` for `pause_from_sync()`.

Comment out per-test `@pytest.mark.timeout(...)`
markers in,
- `tests/devx/test_debugger.py`
- `tests/discovery/test_registrar.py`
- `tests/spawn/test_main_thread_forkserver.py`
- `tests/spawn/test_subint_cancellation.py`
- `tests/test_advanced_streaming.py`
- `tests/test_cancellation.py`

The global cap was already dropped (3c366cac); these
were the leftover per-test caps which now block
interactive `pdb` flows under the new spawn backends.

In `uv.lock`,
- pull `greenback` into the resolved `dev` deps
  (per the `sync_pause` include above).
- catch up the prior `xonsh` editable→PyPI switch
  (from the `pyproject.toml` `tool.uv.sources` edit).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b7115fc875)
(factored: dropped spawn-backend-only paths under tests/spawn/)
2026-06-09 20:27:26 -04:00
Gud Boi fb409055bf Honor `TRACTOR_LOGLEVEL`+`TRACTOR_SPAWN_METHOD` env-vars
Add env-var overrides inside `._root.open_root_actor()` so
devs/test-runs can swap the actor-spawn backend or crank
console verbosity *without* touching application code.

In `._root.open_root_actor()`,
- read `TRACTOR_LOGLEVEL` early, overriding any caller-passed
  `loglevel` and stashing an `env_ll_report` to emit once the
  console log is set up.
- pull the `loglevel` fallback (`or _default_loglevel`) and
  `log.get_console_log()` init *up* so the env-var report
  routes through tractor's own logger.
- read `TRACTOR_SPAWN_METHOD`, overriding any caller-passed
  `start_method` and warn-logging when the env-var clobbers
  an explicit caller value.
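
The override precedence above can be sketched as a pure function. `resolve_from_env()` is a hypothetical name and returning the warnings as a list is an assumption made so the example stays self-contained; the real code logs through tractor's own logger.

```python
import os
from typing import Optional


def resolve_from_env(
    loglevel: Optional[str],
    start_method: Optional[str],
    environ=None,
) -> tuple:
    '''
    Apply the env-var overrides.

    Returns `(loglevel, start_method, warnings)`.
    '''
    env = os.environ if environ is None else environ
    warnings = []

    env_ll = env.get('TRACTOR_LOGLEVEL')
    if env_ll:
        loglevel = env_ll  # env-var wins over the caller's value

    env_sm = env.get('TRACTOR_SPAWN_METHOD')
    if env_sm:
        if start_method is not None and start_method != env_sm:
            # only warn when an explicit caller value gets clobbered
            warnings.append(
                f'TRACTOR_SPAWN_METHOD={env_sm!r} overrides '
                f'start_method={start_method!r}'
            )
        start_method = env_sm

    return loglevel, start_method, warnings
```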

Wire the same vars through `tests/devx/conftest.py::spawn`,
- request the `loglevel` fixture, set both `TRACTOR_LOGLEVEL`
  and `TRACTOR_SPAWN_METHOD` in `os.environ` before each
  `pexpect.spawn()` (inherited by the example subproc).
- expand `supported_spawners` to include
  `main_thread_forkserver` and `subint_forkserver` bc
  example scripts no longer need per-script CLI plumbing.
- pop both vars in fixture teardown so a leaked value can't
  re-route a later in-process tractor test's spawn-backend
  or loglevel.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 208e7c0926)
2026-06-09 20:27:26 -04:00
Gud Boi d4e4062bbd Add todo for running `test_debugger` suite on forkserver spawner
(cherry picked from commit 2917b74ba4)
2026-06-09 20:27:26 -04:00
Gud Boi ff7acfcbd6 Backend-aware `fail_after` in pub/sub test
Mirror `060f7d24`'s pattern (backend-aware timeout in
`maybe_expect_raises`) for `test_dynamic_pub_sub`'s hard
`trio.fail_after` cap. Fork-based backends pay per-spawn
fork+IPC-handshake cost which stacks over `cpus - 1`
sequential `n.run_in_actor()` calls; empirically 12s
flakes on `main_thread_forkserver` under UDS
cross-pytest contention (#451 / #452).

Defaults:
- `main_thread_forkserver` → 30s
- everything else          → 12s (unchanged)

Hoist the timeout-pick out of the `main()` closure so the
dispatch happens once in the trio task rather than
re-evaluating per spawn.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 383b0fdd75)
2026-06-09 20:27:26 -04:00
Gud Boi 6f003d7efd Backend-aware timeout in `maybe_expect_raises`
Default `timeout` from `int = 3` → `int|None = None`;
when unset, pick a backend-aware value. Fork-based
backends (`main_thread_forkserver`) need real headroom
bc actor spawn + IPC ctx-exit + msg-validation error
path is much heavier than under `trio` backend —
especially under cross-pytest-stream contention (#451).

Defaults:
- `main_thread_forkserver` → 30s
- everything else          → 3s (unchanged)

Empirical flake history that motivated 30s as the floor
on fork backends (all from `test_basic_payload_spec`):

- 3s  → all-valid variant flaked w/ `TooSlowError`
- 8s  → `invalid-return` variant flaked w/ `Cancelled`
        (surfaced instead of `MsgTypeError` bc the
        outer `fail_after` fired mid-error-path)
- 15s → flaked under cross-pytest-stream contention

30s gives plenty of headroom while still failing-loud
on a genuine hang. Callers can opt out by passing an
explicit `timeout=` kw.
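
The default-picking rule reads roughly like the sketch below. `pick_timeout()` is a hypothetical helper for illustration; in the patch the logic lives behind the `timeout` param of `maybe_expect_raises`, and how the active backend is looked up is not shown here.

```python
from typing import Optional


def pick_timeout(
    timeout: Optional[int],
    start_method: str,
) -> int:
    '''Resolve the effective timeout in seconds.'''
    if timeout is not None:
        return timeout  # an explicit caller value always wins

    if start_method == 'main_thread_forkserver':
        return 30  # fork-based backend: extra headroom

    return 3  # everything else keeps the old default
```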

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 060f7d24c4)
2026-06-09 20:27:26 -04:00
Gud Boi 2d9a95d13a Use `trio.fail_after` cap in `test_dynamic_pub_sub`
Drop `@pytest.mark.timeout(...)` for the per-test wall-clock
cap on `test_dynamic_pub_sub`; rely on `trio.fail_after(12)`
inside `main()` instead.

Both pytest-timeout enforcement modes are incompatible with
trio under fork-based backends:

- `method='signal'` (SIGALRM) synchronously raises `Failed`
  in trio's main thread mid-`epoll.poll()`, leaving
  `GLOBAL_RUN_CONTEXT` half-installed ("Trio guest run got
  abandoned") so EVERY subsequent `trio.run()` in the same
  pytest process bails with
  `RuntimeError: Attempted to call run() from inside a run()`
  — full-session poison.
- `method='thread'` calls `_thread.interrupt_main()` which
  can let the KBI escape trio's `KIManager` under fork-
  cascade teardown races and bubble out of pytest entirely
  — kills the whole session.

`trio.fail_after()` keeps cancellation inside the trio loop:
- Raises `TooSlowError` cleanly through the open-nursery's
  cancel cascade.
- Doesn't disturb any out-of-band signal/thread state.
- Failure stays scoped to the single test — no cross-test
  global state corruption either way.

Verified empirically: 10 hammer-runs of `test_dynamic_pub_sub`
go from 5/10 fail (with global-state poison) to 3/10 fail
(no poison, all sibling tests still pass). The ~30%
remaining flake rate is a genuine fork-cancel-cascade
hang — separate from this fix but no longer contaminates.

Module-level NOTE comment explains the rationale so future
readers don't re-introduce the bug.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 530160fa69)
2026-06-09 20:27:26 -04:00
Gud Boi 3315a8a292 Add opt-in `reap_subactors_per_test` fixture
Function-scoped, NON-autouse zombie-subactor reaper for
modules whose teardown is known-leaky enough to cascade-
fail every following test in a session.

Sibling to the autouse session-scoped `_reap_orphaned_subactors`. The
session-scoped one fires at session end — too late to save tests that
follow a hung/leaky test in the suite. The new fixture, opted into via
`pytestmark = pytest.mark.usefixtures(...)`, runs between tests in
a problem-module so a leftover subactor from test N can't squat on
registrar ports / UDS paths / shm segments needed by tests N+1,
N+2, ...

Intentionally NOT autouse — the fixture's presence on a module signals
"this module's teardown leaks; please root-cause instead of relying
forever on cleanup". A visibility-vs-convenience trade picked in favor
of the former.

Apply to `tests/test_infected_asyncio.py` since both recent full-suite
runs (parallel-tpt-proto + TCP-only) showed the cascade originating in
this file's KBI- and SIGINT-flavored tests under
`main_thread_forkserver`. Module-comment names the specific offenders so
future de-flake work has a starting point.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b376eb0332)
2026-06-09 20:27:26 -04:00
Gud Boi 5b08c6b034 Sweep `subint_forkserver` → `main_thread_forkserver` in code
After the variant-1 / variant-2 backend split, update remaining
string-match refs to the variant-1 backend so user-visible gates
+ skip-marks + comments name the working backend correctly:

- `tractor._root._DEBUG_COMPATIBLE_BACKENDS`: include
  `main_thread_forkserver`, drop the stub-only `subint_forkserver`
  entry.
- `tests/test_spawning.py::test_loglevel_propagated_to_subactor`:
  capfd-skip flips to `main_thread_forkserver`.
- `tests/test_infected_asyncio.py::test_sigint_closes_lifetime_stack`:
  xfail-condition flips to `main_thread_forkserver`.
- `tests/test_shm.py`: drop stale "broken on `main_thread_forkserver`"
  reason-text since the `mp.SharedMemory(track=False)`
  + resource-tracker monkey-patch in `.ipc._mp_bs` makes the tests pass;
  the skip-mark only fires on plain `subint` now.
- Comment / docstring sweep: `runtime._state`, `runtime._runtime`,
  `_testing.pytest`, `_subint.py`, `pyproject.toml`,
  `test_cancellation.py`, `test_registrar.py` — refs to variant-1
  backend updated.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 205382a39b)
(factored: dropped spawn-backend-only path: tractor/spawn/_subint.py)
2026-06-09 20:27:26 -04:00
Gud Boi 1f7403abc2 Wire `reg_addr` into `test_context_stream_semantics`
Same wire-up pattern as the prior `test_dynamic_pub_sub`
commit: each test that already pulled in `debug_mode`
now also pulls in `reg_addr` and passes
`registry_addrs=[reg_addr]` into `tractor.open_nursery()`,
so the suite's standard registry-addr conventions apply.

Tests touched:
- `test_started_misuse`
- `test_simple_context`
- `test_parent_cancels`
- `test_one_end_stream_not_opened`
- `test_maybe_allow_overruns_stream`
- `test_ctx_with_self_actor`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 66f1941f46)
2026-06-09 20:27:26 -04:00
Gud Boi ed00b75a7b Wire `test_dynamic_pub_sub` to standard fixtures
Pull in the `reg_addr`, `debug_mode`, and `test_log`
fixtures so this test follows the same conventions as
the rest of the suite:

- pass `registry_addrs=[reg_addr]` + `debug_mode` into
  `tractor.open_nursery()` (so `--tpdb` etc work).
- after the `pytest.raises` block, add `assert err` +
  `test_log.exception('Timed out AS EXPECTED')` so the
  expected timeout is logged explicitly instead of
  swallowed.

Also,
- drop whitespace-only blank lines around the
  `subs` param of `consumer()` and `ctx` param of
  `one_task_streams_and_one_handles_reqresp()`.
- promote `test_sigint_both_stream_types`'s one-line
  docstring to multi-line form.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9b05f659b3)
2026-06-09 20:27:26 -04:00
Gud Boi 8daf8eeaca Bump `test_stale_entry_is_deleted`'s timeout to 30
Seems that when run in-suite it delays more than the so-measured "happy
path" timing; better to have no suite-global interruption than to assert
a fast single-test run.

(cherry picked from commit 65fcfbf224)
2026-06-09 20:27:26 -04:00
Gud Boi 2e2977b74c Fix `SharedMemory` under `subint_forkserver`
Implements the resolution described in c99d475d's
`subint_forkserver_mp_shared_memory_issue.md` (now
updated with the resolution post-mortem). Two-part
fix that side-steps `mp.resource_tracker` entirely
rather than try to make it fork-safe — turns out
that's both simpler AND more correct given tractor
already SC-manages allocation lifetimes.

Deats,
- `tractor/ipc/_mp_bs.py::disable_mantracker()`: drop the
  `platform.python_version_tuple()[:-1] >= ('3', '13')` branch — patches
  now run unconditionally:
  * monkey-patch `mp.resource_tracker._resource_tracker` to a no-op
    `ManTracker` subclass (empty `register` / `unregister`
    / `ensure_running`).
  * return `partial(SharedMemory, track=False)` for the per-allocation
    opt-out.
  * belt + suspenders: even if something dodges the wrapper, the
    singleton can't talk to the inherited (broken) parent fd.

- `tractor/ipc/_shm.py::open_shm_list()`: drop the 3.13+ conditional
  skip of the unlink-callback; install a `try_unlink()` wrapper that
  swallows `FileNotFoundError` (sibling-already-cleaned race in
  shared-key setups). Without `mp.resource_tracker` doing it for us, we
  own the unlink; `actor.lifetime_stack` is the right place since
  tractor already controls actor lifecycle.

- `tests/test_shm.py`: remove `subint_forkserver` from the
  module-level skip-list (tests pass now). Inline comment cross-refs
  the two `_mp_bs` / `_shm` workarounds.

- `ai/conc-anal/subint_forkserver_mp_shared_memory_issue.md`: heavy
  rewrite — flips status from "open / unresolvable in tractor" to
  "resolved, kept as decision record". Adds Resolution section, "Why
  this is the right call" rationale (mp tracker is widely criticized;
  tractor already owns lifecycle), trade-offs (crash-leaked segments,
  lost mp leak warning), verification (7 passed under both
  `subint_forkserver` and `trio` backends), and upstream issue links
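
The two-part approach above can be sketched with the stdlib alone. This is a minimal illustration, not the code in `_mp_bs.py` / `_shm.py`: the names `open_shm` and the body of `try_unlink()` are assumptions, and the version guard exists only so the sketch runs on interpreters older than 3.13 (the commit itself drops that branch). It assumes a POSIX platform, where a second `unlink()` raises `FileNotFoundError`.

```python
import sys
from functools import partial
from multiprocessing.shared_memory import SharedMemory

# `track=False` only exists on Python 3.13+; the guard is here purely so
# this sketch also runs on older interpreters.
if sys.version_info >= (3, 13):
    open_shm = partial(SharedMemory, track=False)
else:
    open_shm = SharedMemory


def try_unlink(shm: SharedMemory) -> bool:
    '''
    Unlink the segment, tolerating a sibling having removed it first.

    '''
    try:
        shm.unlink()
        return True
    except FileNotFoundError:
        # sibling-already-cleaned race: nothing left to do
        return False


seg = open_shm(create=True, size=64)
seg.close()
first = try_unlink(seg)   # we own the unlink
second = try_unlink(seg)  # already gone: swallowed, not raised
```

In the commit the equivalent callback is registered on the actor's lifetime stack so the unlink runs at actor teardown; here it is simply called twice to show the race being tolerated.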

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit aa3e230926)
(factored: dropped subint_forkserver conc-anal doc update)
2026-06-09 20:27:26 -04:00
Gud Boi da0c457ff7 Document `SharedMemory` × `subint_forkserver` incompat
New `ai/conc-anal/` doc: `mp.SharedMemory` is
fork-without-exec unsafe — child inherits parent's
`resource_tracker` fd → EBADF on first shm op;
leaked `/shm_list` cascades `FileExistsError`
across parametrize variants. Canonical CPython
issue class, NOT a tractor bug. Includes two
longer-term mitigation paths (reset inherited
tracker fd vs migrate off `mp.shared_memory`).

Also, update `tests/test_shm.py`:
- comment out `subint_forkserver` from skip list
- rewrite reason with precise failure-mode
  descriptions + link to the analysis doc

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c99d475d03)
(factored: dropped spawn-backend-only paths: ai/conc-anal/subint_forkserver_mp_shared_memory_issue.md)
2026-06-09 20:27:26 -04:00
Gud Boi 13053f9cbe Skip `test_loglevel_propagated_to_subactor` on subint forkserver too
(cherry picked from commit 2ca0f41e61)
2026-06-09 20:22:23 -04:00
Gud Boi a199aa5096 Wire `reg_addr` through infected-asyncio tests
Continues the hygiene pattern from de601676 (cancel tests) into
`tests/test_infected_asyncio.py`: many tests here were calling
`tractor.open_nursery()` w/o `registry_addrs=[reg_addr]` and thus racing
on the default `:1616` registry across sessions. Thread the
session-unique `reg_addr` through so leaked or slow-to-teardown
subactors from a prior test can't cross-pollute.

Deats,
- add `registry_addrs=[reg_addr]` to `open_nursery()`
  calls in suite where missing.
- `test_sigint_closes_lifetime_stack`:
  - add `reg_addr`, `debug_mode`, `start_method`
    fixture params
  - `delay` now reads the `debug_mode` param directly
    instead of calling `tractor.debug_mode()` (fires
    slightly earlier in the test lifecycle)
  - sanity assert `if debug_mode: assert
    tractor.debug_mode()` after nursery open
  - new print showing SIGINT target
    (`send_sigint_to` + resolved pid)
  - catch `trio.TooSlowError` around
    `ctx.wait_for_result()` and conditionally
    `pytest.xfail` when `send_sigint_to == 'child'
    and start_method == 'subint_forkserver'` — the
    known orphan-SIGINT limitation tracked in
    `ai/conc-anal/subint_forkserver_orphan_sigint_hang_issue.md`
- parametrize id typo fix: `'just_trio_slee'` → `'just_trio_sleep'`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b350aa09ee)
2026-06-09 20:22:23 -04:00
Gud Boi ba2e474d9d Import-or-skip `.devx.` tests requiring `greenback`
Which is for sure true on py3.14+ rn since `greenlet` didn't want to
build for us (yet).

(cherry picked from commit d6e70e9de4)
2026-06-09 20:22:23 -04:00
Gud Boi e4c7ac34db Default `pytest` to use `--capture=sys`
Lands the capture-pipe workaround from the prior cluster of diagnosis
commits: switch pytest's `--capture` mode from the default `fd`
(redirects fd 1,2 to temp files, which fork children inherit and can
deadlock writing into) to `sys` (only `sys.stdout` / `sys.stderr` — fd
1,2 left alone).

Trade-off documented inline in `pyproject.toml`:
- LOST: per-test attribution of raw-fd output (C-ext writes,
  `os.write(2, ...)`, subproc stdout). Still goes to terminal / CI
  capture, just not per-test-scoped in the failure report.
- KEPT: `print()` + `logging` capture per-test (tractor's logger uses
  `sys.stderr`).
- KEPT: `pytest -s` debugging behavior.
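
The LOST / KEPT split follows from what a `sys`-level capture can see. The sketch below is only an analogy using `contextlib.redirect_stderr` (it is not pytest's capture machinery): swapping the `sys.stderr` object catches Python-level writes, while a raw `os.write()` to fd 2 goes straight past it.

```python
import contextlib
import io
import os
import sys

buf = io.StringIO()
with contextlib.redirect_stderr(buf):
    # Python-level write: goes through the swapped `sys.stderr` object.
    print('via sys.stderr', file=sys.stderr)
    # Raw fd write: bypasses `sys.stderr` and lands on the real fd 2.
    os.write(2, b'via raw fd 2\n')

captured = buf.getvalue()
```

Only the `print()` output ends up in `captured`; the raw write still reaches the terminal or CI log, which matches the trade-off listed above.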

This allows us to re-enable `test_nested_multierrors` without
skip-marking + clears the class of pytest-capture-induced hangs for any
future fork-based backend tests.

Deats,
- `pyproject.toml`: `'--capture=sys'` added to `addopts` w/ ~20 lines of
  rationale comment cross-ref'ing the post-mortem doc

- `test_cancellation`: drop `skipon_spawn_backend('subint_forkserver')`
  from `test_nested_multierrors`; no longer needed.
  * file-level `pytestmark` covers any residual.

- `tests/spawn/test_subint_forkserver.py`: orphan-SIGINT test's xfail
  mark loosened from `strict=True` to `strict=False` + reason rewritten.
  * it passes in isolation but is session-env-pollution sensitive
    (leftover subactor PIDs competing for ports / inheriting harness
    FDs).
  * tolerate both outcomes until suite isolation improves.

- `test_shm`: extend the existing
  `skipon_spawn_backend('subint', ...)` to also skip
  `'subint_forkserver'`.
  * Different root cause from the cancel-cascade class:
    `multiprocessing.SharedMemory`'s `resource_tracker` + internals
    assume fresh-process state, don't survive fork-without-exec cleanly

- `tests/discovery/test_registrar.py`: bump timeout 3→7s on one test
  (unrelated to forkserver; just a flaky-under-load bump).

- `tractor.spawn._subint_forkserver`: inline comment-only future-work
  marker right before `_actor_child_main()` describing the planned
  conditional stdout/stderr-to-`/dev/null` redirect for cases where
  `--capture=sys` isn't enough (no code change — the redirect logic
  itself is deferred).

EXTRA NOTEs
-----------
The `--capture=sys` approach is the least-invasive fix: just a pytest
ini change, no runtime code change, works for all fork-based backends,
trade-offs well-understood (terminal-level capture still happens, just
not pytest's per-test attribution of raw-fd output).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4c133ab541)
(factored: dropped spawn-backend-only paths: tests/spawn/test_subint_forkserver.py + tractor/spawn/_subint_forkserver.py; the xfail-loosening bullet above no longer applies)
2026-06-09 20:22:23 -04:00
Gud Boi 828df7df79 Update `subint_forkserver` skip reason: capture-pipe
Refresh the `test_nested_multierrors` skip-mark
reason to the final diagnosis: the hang is pytest's
default `--capture=fd` pipe filling from high-volume
subactor traceback output inherited via fds 1,2 in
fork children — `pytest -s` passes cleanly. Records
the fix direction (redirect child stdio to
`/dev/null` in the fork-child prelude) for whoever
lands the backend.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit eceed29d4a)
(factored: kept only the tests/test_cancellation.py skip-reason update of
 "Pin forkserver hang to pytest `--capture=fd`"; dropped the subint
 conc-anal doc + tests/spawn/test_subint_forkserver.py)
2026-06-09 20:21:58 -04:00
Gud Boi 555f64fdf2 Skip-mark `subint_forkserver` nested-multierror hang
Skip-mark the still-hanging
`test_nested_multierrors[subint_forkserver]` via
`@pytest.mark.skipon_spawn_backend('subint_forkserver',
reason=...)` so it stops blocking the test matrix
while the remaining bug is being chased. The mark is
an inert no-op until that (in-dev) backend lands.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 506617c695)
(factored: kept only the tests/test_cancellation.py skip-mark; dropped
 the subint_forkserver conc-anal doc update)
2026-06-09 20:20:29 -04:00
Gud Boi 9c3fc19f35 Wire `reg_addr` through leaky cancel tests
Stopgap companion to d0121960 (`subint_forkserver`
test-cancellation leak doc): five tests in
`tests/test_cancellation.py` were running against the
default `:1616` registry, so any leaked
`subint-forkserv` descendant from a prior test holds
the port and blows up every subsequent run with
`TooSlowError` / "address in use". Thread the
session-unique `reg_addr` fixture through so each run
picks its own port — zombies can no longer poison
other tests (they'll only cross-contaminate whatever
happens to share their port, which is now nothing).
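
How the `reg_addr` fixture picks its address is not shown in this log. As a hypothetical illustration of the idea (a unique TCP port per run instead of a fixed `:1616`), one common approach is to bind to port 0 and let the kernel choose; `bind_free_port()` is a made-up helper name.

```python
import socket


def bind_free_port(host: str = '127.0.0.1') -> socket.socket:
    '''
    Bind a TCP socket to a kernel-chosen unused port.

    '''
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0: the kernel picks an unused port
    return sock


# Two sockets held open at the same time cannot share a port, so a
# leftover process holding one address can't collide with the next.
a = bind_free_port()
b = bind_free_port()
addr_a = a.getsockname()
addr_b = b.getsockname()
a.close()
b.close()
```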

Deats,
- add `reg_addr: tuple` fixture param to:
  - `test_cancel_infinite_streamer`
  - `test_some_cancels_all`
  - `test_nested_multierrors`
  - `test_cancel_via_SIGINT`
  - `test_cancel_via_SIGINT_other_task`
- explicitly pass `registry_addrs=[reg_addr]` to the
  two `open_nursery()` calls that previously had no
  kwargs at all (in `test_cancel_via_SIGINT` and
  `test_cancel_via_SIGINT_other_task`)
- add bounded `@pytest.mark.timeout(7, method='thread')`
  to `test_nested_multierrors` so a hung run doesn't
  wedge the whole session

Still doesn't close the real leak — the
`subint_forkserver` backend's `_ForkedProc.kill()` is
PID-scoped not tree-scoped, so grandchildren survive
teardown regardless of registry port. This commit is
just blast-radius containment until that fix lands.
See
`ai/conc-anal/subint_forkserver_test_cancellation_leak_issue.md`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 1af2121057)
2026-06-09 20:19:56 -04:00
Gud Boi 668ad69fd2 Mark `subint`-hanging tests with `skipon_spawn_backend`
Adopt the `@pytest.mark.skipon_spawn_backend('subint',
reason=...)` marker (a617b521) across the suites
reproducing the `subint` GIL-contention / starvation
hang classes doc'd in `ai/conc-anal/subint_*_issue.md`.

Deats,
- Module-level `pytestmark` on full-file-hanging suites:
  - `tests/test_cancellation.py`
  - `tests/test_inter_peer_cancellation.py`
  - `tests/test_pubsub.py`
  - `tests/test_shm.py`
- Per-test decorator where only one test in the file
  hangs:
  - `tests/discovery/test_registrar.py
    ::test_stale_entry_is_deleted` — replaces the
    inline `if start_method == 'subint': pytest.skip`
    branch with a declarative skip.
  - `tests/test_subint_cancellation.py
    ::test_subint_non_checkpointing_child`.
- A few per-test decorators are left commented-in-
  place as breadcrumbs for later finer-grained unskips.

Also, some nearby tidying in the affected files:
- Annotate loose fixture / test params
  (`pytest.FixtureRequest`, `str`, `tuple`, `bool`) in
  `tests/conftest.py`, `tests/devx/conftest.py`, and
  `tests/test_cancellation.py`.
- Normalize `"""..."""` → `'''...'''` docstrings per
  repo convention on a few touched tests.
- Add `timeout=6` / `timeout=10` to
  `@tractor_test(...)` on `test_cancel_infinite_streamer`
  and `test_some_cancels_all`.
- Drop redundant `spawn_backend` param from
  `test_cancel_via_SIGINT`; use `start_method` in the
  `'mp' in ...` check instead.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4b2a0886c3)
(factored: dropped spawn-backend-only path: tests/test_subint_cancellation.py)
2026-06-09 20:19:26 -04:00
Gud Boi 33f1257721 Skip `test_stale_entry_is_deleted` hanger with `subint`s
(cherry picked from commit 985ea76de5)
2026-06-09 20:19:11 -04:00
Gud Boi 154cba86ac Wall-cap `test_stale_entry_is_deleted` via `pytest-timeout`
Add a hard process-level wall-clock bound on a test
known to wedge un-Ctrl-C-ably under an in-dev spawn
backend, so an unattended suite run can't hang
indefinitely.

Deats,
- New `testing` dep: `pytest-timeout>=2.3`.
- `test_stale_entry_is_deleted`:
  `@pytest.mark.timeout(3, method='thread')`. The
  `method='thread'` choice is deliberate —
  `method='signal'` routes via `SIGALRM` which can be
  starved by the same GIL-hostage path that drops
  `SIGINT`, so it'd never actually fire in the
  starvation case.

At timeout, `pytest-timeout` hard-kills the pytest
process itself — that's the intended behavior here;
the alternative is the suite never returning.
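
The thread-vs-signal distinction can be illustrated with a plain `threading.Timer`. This is a simplified sketch, not `pytest-timeout`'s implementation: the watchdog runs on its own thread rather than depending on a signal handler running on the main thread, and instead of killing the process it only records that the deadline passed. `run_with_watchdog()` is a hypothetical helper.

```python
import threading
import time


def run_with_watchdog(fn, timeout: float, on_timeout):
    '''
    Run `fn()` while a timer thread waits to call `on_timeout()`.

    '''
    timer = threading.Timer(timeout, on_timeout)
    timer.daemon = True
    timer.start()
    try:
        return fn()
    finally:
        # a no-op if the timer already fired
        timer.cancel()


fired = []
# Slow body: the watchdog thread fires while the main thread is busy.
run_with_watchdog(lambda: time.sleep(0.3), 0.05, lambda: fired.append('slow'))
# Fast body: the timer is cancelled before it can fire.
run_with_watchdog(lambda: None, 5.0, lambda: fired.append('fast'))
```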

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 189f4e3f72e9f1eda5d24bcbab5743f7e35bd913)
(factored: kept pyproject + tests/discovery/test_registrar.py parts of
 "Wall-cap `subint` audit tests via `pytest-timeout`"; dropped
 tests/test_subint_cancellation.py)
2026-06-09 20:19:11 -04:00
Gud Boi d60cf23659 Arm `dump_on_hang` on `test_stale_entry_is_deleted`
Wrap the test's `trio.run(main)` in
`dump_on_hang(seconds=20)` so any future hang
regression captures a stack dump for triage instead
of wedging CI silently; under the default backends
it's a no-op safety net.
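
The implementation of `dump_on_hang()` is not shown in this log. A comparable effect can be had from the stdlib's `faulthandler`, as in the sketch below; `dump_on_hang_sketch()` and `run_and_collect()` are hypothetical names, and the temp file stands in for wherever the real helper writes its dump.

```python
import contextlib
import faulthandler
import tempfile
import time


@contextlib.contextmanager
def dump_on_hang_sketch(seconds: float, file):
    '''
    Dump all thread stacks to `file` if the body runs past `seconds`.

    '''
    faulthandler.dump_traceback_later(seconds, exit=False, file=file)
    try:
        yield
    finally:
        # body finished (or raised): disarm the pending dump
        faulthandler.cancel_dump_traceback_later()


def run_and_collect(seconds: float, work_seconds: float) -> str:
    with tempfile.TemporaryFile() as f:
        with dump_on_hang_sketch(seconds, file=f):
            time.sleep(work_seconds)
        f.seek(0)
        return f.read().decode()


hung = run_and_collect(0.1, 0.5)   # overruns: a stack dump is written
quick = run_and_collect(5.0, 0.0)  # finishes in time: nothing written
```

When the body finishes before the deadline nothing is written, which is the "no-op safety net" behaviour described above.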

Includes a "KNOWN ISSUE" comment block documenting
the (future) `subint` backend hang classes observed
against this test during Phase B bringup (#379).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4a3254583b)
(factored: kept only the tests/discovery/test_registrar.py part of
 "Doc `subint` backend hang classes + arm `dump_on_hang`"; dropped
 subint conc-anal docs + tests/test_subint_cancellation.py)
2026-06-09 20:18:44 -04:00
Gud Boi 9157f58c15 Avoid skip `.ipc._ringbuf` import when no `cffi`
(cherry picked from commit 03bf2b931e)
2026-06-09 20:17:32 -04:00
Gud Boi 4052c5b562 Handle py3.14+ incompats as test skips
Since we're devving subints we require the 3.14+ stdlib API
and a couple compiled libs don't support it yet, namely:
- `cffi`, which we're only using for the `.ipc._linux` eventfd
  stuff (now factored into `hotbaud` anyway).
- `greenback`, which requires `greenlet` which doesn't seem to be
  wheeled yet
  * on nixos the sdist build was failing due to lack of `g++` which
    i don't care to figure out rn since we don't need `.devx` stuff
    immediately for this subints prototype.
  * [ ] we still need to adjust any dependent suites to skip.

Adjust `test_ringbuf` to skip on import failure.
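
In a pytest suite this kind of gate is typically written with `pytest.importorskip()`. The stdlib-only sketch below shows the same idea under that assumption, with a hypothetical `has_module()` helper and a placeholder test class; it is not the actual change made to `test_ringbuf`.

```python
import importlib.util
import unittest


def has_module(name: str) -> bool:
    '''
    Report whether a top-level module can be found, without importing it.

    '''
    return importlib.util.find_spec(name) is not None


# Skip the whole class when the optional compiled dep is missing.
@unittest.skipUnless(has_module('cffi'), 'cffi is not installed')
class RingbufSketchTest(unittest.TestCase):

    def test_placeholder(self):
        self.assertTrue(has_module('cffi'))
```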

Also project wide,
- pin us to py 3.13+ in prep for last-2-minor-version policy.
- drop `msgspec>=0.20.0`, the first release with py3.14 support.

(cherry picked from commit d2ea8aa2de)
2026-06-09 20:17:20 -04:00