678 Commits (e7b577bbf6ab0e2c6f65f4b0fcf178f63d092c2c)

Author SHA1 Message Date
Gud Boi e7b577bbf6 Mv `daemon` + `test_multi_program` to `discovery/`
All `daemon` fixture consumers are discovery-
protocol tests now living under `tests/discovery/`.
Move the fixture, its `_wait_for_daemon_ready`
helper, and `test_multi_program.py` into that subdir
so scope matches usage.

Also,
- add `pytestmark` for `track_orphaned_uds_per_test`
  + `detect_runaway_subactors_per_test` to `test_multi_program` as
    regression net.
- drop now-unused `_PROC_SPAWN_WAIT` + `socket` import from root
  conftest.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c4082be876)
2026-06-09 20:28:04 -04:00
Gud Boi 5a9ae54064 Replace sleep with active poll in `daemon` fixture
First draft at resolving,
https://github.com/goodboy/tractor/issues/424

`tests.conftest.py.daemon()` previously used a blind
`time.sleep(_PROC_SPAWN_WAIT + uds_bonus + ci_bonus)` to "wait for the
daemon to come up" before yielding the proc to the test.

Two problems:

1. **Racy under load** — sleep is fixed at design time; loaded boxes
   / cold starts / fork-spawn cost spikes blow past it, leading to
   `ConnectionRefusedError` / `OSError: connect failed` flakes in
   `test_register_duplicate_name`.

2. **Wasteful when daemon comes up fast** — happy-path pays the FULL
   sleep regardless. ~3s of dead time per fixture invocation, ~10-20s
   per full suite run.

Replace with `_wait_for_daemon_ready()` — active poll via stdlib
`socket.create_connection` (TCP) or `socket.connect` (UDS) on the
daemon's bind addr, with 50ms backoff and a 10s/15s deadline (CI gets
extra headroom). Daemon-died-during-startup early-exit catches the case
where `_PROC_SPAWN_WAIT` was silently masking daemon startup crashes.
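
A minimal sketch of the active-poll idea for the TCP case, using only the
stdlib. The name `wait_for_tcp_ready` and its defaults are assumptions
for illustration, not the fixture's actual helper; the UDS branch and the
daemon-died early-exit are left out.

```python
import socket
import time


def wait_for_tcp_ready(
    addr: tuple,
    deadline: float = 10.0,
    backoff: float = 0.05,
) -> bool:
    '''
    Poll `addr` until a TCP connect succeeds or `deadline` seconds pass.

    Hypothetical helper: returns `True` once the listener accepts and
    `False` if the deadline expires first.
    '''
    stop_at = time.monotonic() + deadline
    while time.monotonic() < stop_at:
        try:
            # a successful connect means the listen side is accepting
            with socket.create_connection(addr, timeout=backoff):
                return True
        except OSError:
            # covers `ConnectionRefusedError` and connect timeouts
            time.sleep(backoff)
    return False
```

A fast daemon is detected on the first successful connect, so the happy
path no longer pays the full fixed sleep.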

Why stdlib `socket` (Option 2 from the conc-anal doc) instead of
`tractor`'s own `_root.ping_tpt_socket` closure or trio?

- `tractor.run_daemon()` doesn't return from bootstrap until the runtime
  is fully ready to handle IPC, so probing listen-side acceptance is
  sufficient.
- no need to do the full IPC handshake just to validate readiness.
  Sidesteps the `trio.run()` bootstrap cost (~50ms) per fixture too.

`claude`'s verification: 10/10 runs of `tests/test_multi_program.py`
pass on both `--tpt-proto=tcp` and `--tpt-proto=uds`. Per-test wall-time
`test_register_duplicate_name`: 4.31s → 1.10s. Full file: ~12s → 3.27s
per transport.

Doc-tracked at:
`ai/conc-anal/test_register_duplicate_name_daemon_connect_race_issue.md`

Future work — session-scoped trio runtime in a bg thread to share
fixture-side trio operations across many fixtures (currently overkill
for the one fixture that needs it).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit ec8c4659c4)
2026-06-09 20:28:04 -04:00
Gud Boi bdde1afaae Harden `test_debugger` for forkserver spawners
Use `is_forking_spawner` fixture + gate spawner-
specific expect patterns in nested-error and daemon
tests. Add `set_fork_aware_capture` to multi-sub
tests that need capture-mode awareness.

Deats,
- replace `start_method` param with `is_forking_spawner` bool fixture.
- bump inter-send delay to 0.1s for IPC stability under fork backends.
- gate `bdb.BdbQuit` + relay-uid patterns behind
  `not is_forking_spawner` (not visible under capsys).
- add `expect(child, EOF)` to confirm clean exit.
- switch caught exc from `AssertionError` to `ValueError` in daemon
  test.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9031605807)
2026-06-09 20:28:04 -04:00
Gud Boi 45c917c452 Drop global mutation of `_PROC_SPAWN_WAIT`
In top level `daemon`-fixture that is..

Use a local `bg_daemon_spawn_delay` instead of
mutating the module-level `_PROC_SPAWN_WAIT` —
previously each `daemon` fixture invocation would
permanently add 1.6s (UDS) or 1s (CI) to the
global, inflating delays across the session.

Also, emit a `test_log.warning()` when verbose
loglevel is silently reduced to `'info'`.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c4885f9d99)
2026-06-09 20:28:04 -04:00
Gud Boi 4f042ded23 Add `tractor.trionics.patches` subpkg + first fix
With a seminal patch fixing `trio`'s `WakeupSocketpair.drain()` which
can busy-loop due to lack of handling `EOF`.

New `tractor.trionics.patches` subpkg housing defensive monkey-patches
for upstream `trio` bugs we've encountered while running `tractor`
— particularly as of recent, fork-survival edge cases that haven't been
filed/fixed upstream yet. Each patch is idempotent, version-gated via
`is_needed()`, and carries a `# REMOVE WHEN:` marker pointing at the
upstream release whose adoption allows deletion.

Subpkg layout + per-patch contract documented in
`tractor/trionics/patches/README.md` — `apply()` / `is_needed()`
/ `repro()` API, registry pattern via `_PATCHES` in `__init__.py`,
single-call entry point `apply_all()`.
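
One way such a registry could look, as a sketch only. The `Patch` tuple,
the `_applied` set and this `apply_all()` signature are hypothetical
stand-ins; the real contract is whatever the subpkg README documents.

```python
from typing import Callable, List, NamedTuple, Set


class Patch(NamedTuple):
    # hypothetical record shape for one registered patch
    name: str
    is_needed: Callable[[], bool]
    apply: Callable[[], None]


# names of patches already applied in this process
_applied: Set[str] = set()


def apply_all(patches: List[Patch]) -> List[str]:
    '''
    Apply every needed, not-yet-applied patch; return the names applied.
    '''
    newly_applied = []
    for patch in patches:
        # idempotent: skip repeats; version-gated: skip when not needed
        if patch.name in _applied or not patch.is_needed():
            continue
        patch.apply()
        _applied.add(patch.name)
        newly_applied.append(patch.name)
    return newly_applied
```

Calling `apply_all()` a second time with the same list applies nothing,
which is the idempotency property described above.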

First patch, `_wakeup_socketpair`:
- `trio`'s `WakeupSocketpair.drain()` loops on `recv(64KB)` and exits
  ONLY on `BlockingIOError`, NEVER on `recv() == b''` (peer-closed FIN).
- under `fork()`-spawning backends the COW-inherited socketpair fds
  & `_close_inherited_fds()` teardown can leave a `WakeupSocketpair`
  instance whose write-end is closed, and `drain()` then **spins forever
  in C with no Python checkpoints**.
- this burns 100% CPU and blocks signal delivery.

Standalone repro:

    from trio._core._wakeup_socketpair import WakeupSocketpair
    ws = WakeupSocketpair()
    ws.write_sock.close()
    ws.drain()  # spins forever

Patch is one-line — break the drain loop on b'' EOF.
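
For illustration, a stdlib-only drain loop with the EOF break. This is
a sketch of the technique on a plain `socket.socketpair()`, not `trio`'s
code nor the actual patch; the `drain()` function here is hypothetical.

```python
import socket


def drain(sock: socket.socket, bufsize: int = 2**16) -> int:
    '''
    Read everything pending on a non-blocking `sock`.

    Returns the number of bytes drained.
    '''
    total = 0
    while True:
        try:
            data = sock.recv(bufsize)
        except BlockingIOError:
            # nothing left to read right now
            return total
        if not data:
            # b'' means the peer closed: stop instead of spinning
            return total
        total += len(data)
```

Without the `if not data` check, a closed write-end makes `recv()` return
`b''` immediately on every iteration and the loop never exits.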

Manifested as two distinct test failures:

- `tests/test_multi_program.py::test_register_duplicate_name` hung at
  100% CPU on the busy-loop directly (fork child's worker thread)
- `tests/test_infected_asyncio.py::test_aio_simple_error` Mode-A
  deadlock — busy-loop wedged trio's scheduler inside `start_guest_run`,
  both threads parked in `epoll_wait`, no TCP connect-back to parent
  ever happened.

Same patch fixes both. Restored 99.7% pass rate on full
suite under `--spawn-backend=main_thread_forkserver`
(was hanging indefinitely before).

Wired into `tractor._child._actor_child_main` via `apply_all()` BEFORE
any trio runtime init. Harmless on non-fork backends.

Conc-anal write-ups, including strace + py-spy evidence:

- `ai/conc-anal/trio_wakeup_socketpair_busy_loop_under_fork_issue.md`
- `ai/conc-anal/infected_asyncio_under_main_thread_forkserver_hang_issue.md`

Regression tests in `tests/trionics/test_patches.py`: each test asserts
(a) the bug exists pre-patch (or is fixed upstream, in which case skip
cleanly), and (b) the patch fixes it, with a SIGALRM wall-clock cap so
a regression fails loud instead of hanging silently.
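
A sketch of a SIGALRM wall-clock cap as a context manager. The name
`wall_clock_cap` is an assumption, not the helper used in the test
module. It is Unix-only, must run in the main thread, and the handler
only fires once the interpreter regains control.

```python
import signal
from contextlib import contextmanager


@contextmanager
def wall_clock_cap(seconds: int):
    '''
    Raise `TimeoutError` if the wrapped block runs longer than `seconds`.
    '''
    def _on_alarm(signum, frame):
        raise TimeoutError(f'exceeded {seconds}s wall-clock cap')

    prev_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        # cancel any pending alarm and restore the previous handler
        signal.alarm(0)
        signal.signal(signal.SIGALRM, prev_handler)
```

Wrapping the post-patch call in such a cap turns a reintroduced
busy-loop into a test failure rather than a stuck session.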

TODO:
- [ ] file the upstream `python-trio/trio` issue + PR.
- [ ] use the `repro()` callable in `_wakeup_socketpair.py` as the issue
      body's evidence section.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 0ef549fadb)
(factored: dropped spawn-backend-only paths: ai/conc-anal/infected_asyncio_under_main_thread_forkserver_hang_issue.md)
2026-06-09 20:28:04 -04:00
Gud Boi f7c048e535 Adjust `test_shield_pause` for capsys backends
Under `main_thread_forkserver` the bootstrapping
hook switches to `--capture=sys`, so subactor
fd-level output (tree dumps, zombie-reaper msgs)
isn't captured per-test by pexpect. Gate those
expects behind a `no_capfd` check so the test
passes on both capture modes.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 5a9926fc32)
2026-06-09 20:28:04 -04:00
Gud Boi 4d8e67bd7f Default `--ll` to `None` in test harness
Only override `tractor.log._default_loglevel` when
the flag is explicitly passed — lets per-spawn and
per-example `loglevel` kwargs take effect instead
of being clobbered by the hard-coded `'ERROR'`
default.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 72a0465c52)
2026-06-09 20:28:04 -04:00
Gud Boi f18cb0e033 Update debug examples + harden `test_debugger`
Pass explicit `loglevel` to `spawn()` calls in
`test_debugger` tests — required for pexpect
pattern matching now that examples no longer
hard-code log levels.

Also,
- make `expect()` return the decoded `before` str.
- add `start_method` param + fork-backend timeout
  slack (+4s) in nested-error test.
- clean up debug examples: drop unused loglevels,
  rename `n` -> `an`, fix docstrings, add TODO
  comments for tpt parametrize via osenv.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9431a81d37)
2026-06-09 20:28:04 -04:00
Gud Boi 49a397d6d9 Update `sync_bp` + tighten `test_pause_from_sync`
Add `disable_pdbp_color()` to the `sync_bp` example
to suppress pygments prompt coloring when
`PYTHON_COLORS=0` — makes pexpect pattern matching
deterministic.

Deats,
- set `loglevel='pdb'` in both script + test spawn.
- disable `enable_stack_on_sig` in example, assert
  no `stackscope` output in test.
- update `attach_patts` keys/values with `|_<Task`
  / `|_<Thread` / `|_('subactor'` prefixes to match
  actual tree-dump format.
- add call-site patterns (`tractor.pause_from_sync()`,
  `tractor.pause()`, `breakpoint(hide_tb=...)`).
- trim trailing `\n` from `Lock.repr()` output.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit fc2e298a29)
2026-06-09 20:28:04 -04:00
Gud Boi 0df90500fa Fix `SIGUSR1` tree-dump ordering in `_stackscope`
Factor the sub-actor relay loop out of
`dump_tree_on_sig()` into `_relay_sig_to_subactors()`
and chain both dump + relay in a single
`run_sync_soon` callback (`_dump_then_relay`) so the
parent's task-tree flushes BEFORE any sub receives
the signal — fixes a hierarchical-ordering race
where subs could dump ahead of the parent in the
muxed pty stream.

Also,
- gate file/tty sink writes behind `write_file` +
  `write_tty` params on `dump_task_tree()`.
- use `actor.aid.uid` instead of deprecated `.uid`.
- update `test_shield_pause` expects to match the
  new sequential parent -> relay-log -> sub ordering.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit e2b790a70d)
2026-06-09 20:28:04 -04:00
Gud Boi 363d11b89c Add `pytest_load_initial_conftests()` for `--capture=`
Move `--capture=sys` enforcement from a static ini
flag to a `pytest_load_initial_conftests()` bootstrap
hook that dynamically flips capture mode only when a
fork-based spawner (like `main_thread_forkserver`) is
detected; non-fork backends keep `--capture=fd`.

Also,
- load `tractor._testing.pytest` via `-p` in ini
  (bc bootstrapping hooks must register before
  conftest `pytest_plugins` runs).
- register `_reap` as sub-plugin via `pytest_plugins`
  tuple in `._testing.pytest`.
- drop now-duplicate reap fixtures (already in `_reap`
  per 1cdc7fb3).
- rename `tractor_enable_stackscope` dest -> `enable_stackscope`
  and pop env var on disable.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 61d4525137)
2026-06-09 20:28:04 -04:00
Gud Boi d5b10b9e0c Allow per-call `start_method`/`loglevel` overrides
In `tests/devx/conftest.py::spawn`, refactor the
fixture-internal closures so consumer tests can pass
explicit `start_method`/`loglevel` to each `_spawn()`
invocation rather than only inheriting the fixture-
scoped parametrize values.

Deats,
- promote `set_spawn_method()` and `set_loglevel()`
  to take their respective values as fn params (vs
  closing over the fixture-scope vars).
- give `_spawn()` `start_method=start_method` and
  `loglevel: str|None = None` kwargs so callers
  override one-off without re-parametrizing the
  suite. NOTE: this drops the implicit fixture-
  scoped `loglevel` forward — `_spawn()` callers
  now must pass `loglevel=...` explicitly.
- TODO: figure out how `--ll <level>` should map to
  the default (currently `None` → uses env-var or
  tractor default).
- add a docstring to `_spawn()` so its role as the
  consumer-facing closure is obvious from `help()`.

Also,
- `assert_before()` now returns the `.before` output
  on success (was `None`); add a one-line docstring
  describing the new return contract.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 486249d74f)
2026-06-09 20:27:26 -04:00
Gud Boi 6835391c22 Drop test-local timeouts, +`sync_pause` to dev
In `pyproject.toml`,
- include the `sync_pause` group from `dev`, so dev
  installs ship `greenback` for `pause_from_sync()`.

Comment out per-test `@pytest.mark.timeout(...)`
markers in,
- `tests/devx/test_debugger.py`
- `tests/discovery/test_registrar.py`
- `tests/spawn/test_main_thread_forkserver.py`
- `tests/spawn/test_subint_cancellation.py`
- `tests/test_advanced_streaming.py`
- `tests/test_cancellation.py`

The global cap was already dropped (3c366cac); these
were the leftover per-test caps which now block
interactive `pdb` flows under the new spawn backends.

In `uv.lock`,
- pull `greenback` into the resolved `dev` deps
  (per the `sync_pause` include above).
- catch up the prior `xonsh` editable→PyPI switch
  (from the `pyproject.toml` `tool.uv.sources` edit).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b7115fc875)
(factored: dropped spawn-backend-only paths under tests/spawn/)
2026-06-09 20:27:26 -04:00
Gud Boi fb409055bf Honor `TRACTOR_LOGLEVEL`+`TRACTOR_SPAWN_METHOD` env-vars
Add env-var overrides inside `._root.open_root_actor()` so
devs/test-runs can swap the actor-spawn backend or crank
console verbosity *without* touching application code.

In `._root.open_root_actor()`,
- read `TRACTOR_LOGLEVEL` early, overriding any caller-passed
  `loglevel` and stashing an `env_ll_report` to emit once the
  console log is set up.
- pull the `loglevel` fallback (`or _default_loglevel`) and
  `log.get_console_log()` init *up* so the env-var report
  routes through tractor's own logger.
- read `TRACTOR_SPAWN_METHOD`, overriding any caller-passed
  `start_method` and warn-logging when the env-var clobbers
  an explicit caller value.

Wire the same vars through `tests/devx/conftest.py::spawn`,
- request the `loglevel` fixture, set both `TRACTOR_LOGLEVEL`
  and `TRACTOR_SPAWN_METHOD` in `os.environ` before each
  `pexpect.spawn()` (inherited by the example subproc).
- expand `supported_spawners` to include
  `main_thread_forkserver` and `subint_forkserver` bc
  example scripts no longer need per-script CLI plumbing.
- pop both vars in fixture teardown so a leaked value can't
  re-route a later in-process tractor test's spawn-backend
  or loglevel.
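
The override precedence can be sketched roughly as below. The function
`resolve_overrides` and its return shape are hypothetical; only the two
env-var names and the "env wins, warn on clobber" rule come from the
message above.

```python
import os
from typing import List, Mapping, Optional, Tuple


def resolve_overrides(
    loglevel: Optional[str] = None,
    start_method: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], Optional[str], List[str]]:
    '''
    Return `(loglevel, start_method, warnings)` after env-var overrides.
    '''
    env = os.environ if env is None else env
    warnings: List[str] = []

    # env-var beats any caller-passed loglevel
    env_ll = env.get('TRACTOR_LOGLEVEL')
    if env_ll:
        loglevel = env_ll

    # env-var beats any caller-passed spawn method; note the clobber
    env_sm = env.get('TRACTOR_SPAWN_METHOD')
    if env_sm:
        if start_method is not None and start_method != env_sm:
            warnings.append(
                f'TRACTOR_SPAWN_METHOD={env_sm!r} overrides '
                f'start_method={start_method!r}'
            )
        start_method = env_sm

    return loglevel, start_method, warnings
```

Passing `env` explicitly keeps the sketch testable without touching the
real process environment.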

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 208e7c0926)
2026-06-09 20:27:26 -04:00
Gud Boi d4e4062bbd Add todo for running `test_debugger` suite on forkserver spawner
(cherry picked from commit 2917b74ba4)
2026-06-09 20:27:26 -04:00
Gud Boi ff7acfcbd6 Backend-aware `fail_after` in pub/sub test
Mirror `060f7d24`'s pattern (backend-aware timeout in
`maybe_expect_raises`) for `test_dynamic_pub_sub`'s hard
`trio.fail_after` cap. Fork-based backends pay per-spawn
fork+IPC-handshake cost which stacks over `cpus - 1`
sequential `n.run_in_actor()` calls; empirically 12s
flakes on `main_thread_forkserver` under UDS
cross-pytest contention (#451 / #452).

Defaults:
- `main_thread_forkserver` → 30s
- everything else          → 12s (unchanged)

Hoist the timeout-pick out of the `main()` closure so the
dispatch happens once in the trio task rather than
re-evaluating per spawn.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 383b0fdd75)
2026-06-09 20:27:26 -04:00
Gud Boi 6f003d7efd Backend-aware timeout in `maybe_expect_raises`
Default `timeout` from `int = 3` → `int|None = None`;
when unset, pick a backend-aware value. Fork-based
backends (`main_thread_forkserver`) need real headroom
bc actor spawn + IPC ctx-exit + msg-validation error
path is much heavier than under `trio` backend —
especially under cross-pytest-stream contention (#451).

Defaults:
- `main_thread_forkserver` → 30s
- everything else          → 3s (unchanged)

Empirical flake history that motivated 30s as the floor
on fork backends (all from `test_basic_payload_spec`):

- 3s  → all-valid variant flaked w/ `TooSlowError`
- 8s  → `invalid-return` variant flaked w/ `Cancelled`
        (surfaced instead of `MsgTypeError` bc the
        outer `fail_after` fired mid-error-path)
- 15s → flaked under cross-pytest-stream contention

30s gives plenty of headroom while still failing-loud
on a genuine hang. Callers can opt out by passing an
explicit `timeout=` kw.
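
A minimal sketch of the timeout dispatch, assuming a plain lookup
table. `pick_timeout` and the two module constants are hypothetical
names; the 30s / 3s values are the defaults listed above.

```python
from typing import Optional

# per-backend defaults (seconds); anything unlisted gets the fallback
_BACKEND_TIMEOUTS = {'main_thread_forkserver': 30}
_DEFAULT_TIMEOUT = 3


def pick_timeout(
    start_method: str,
    timeout: Optional[int] = None,
) -> int:
    # an explicit `timeout=` always wins over the backend default
    if timeout is not None:
        return timeout
    return _BACKEND_TIMEOUTS.get(start_method, _DEFAULT_TIMEOUT)
```

Keeping `None` as the sentinel is what lets callers opt out with an
explicit value while everyone else gets the backend-aware default.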

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 060f7d24c4)
2026-06-09 20:27:26 -04:00
Gud Boi 2d9a95d13a Use `trio.fail_after` cap in `test_dynamic_pub_sub`
Drop `@pytest.mark.timeout(...)` for the per-test wall-clock
cap on `test_dynamic_pub_sub`; rely on `trio.fail_after(12)`
inside `main()` instead.

Both pytest-timeout enforcement modes are incompatible with
trio under fork-based backends:

- `method='signal'` (SIGALRM) synchronously raises `Failed`
  in trio's main thread mid-`epoll.poll()`, leaving
  `GLOBAL_RUN_CONTEXT` half-installed ("Trio guest run got
  abandoned") so EVERY subsequent `trio.run()` in the same
  pytest process bails with
  `RuntimeError: Attempted to call run() from inside a run()`
  — full-session poison.
- `method='thread'` calls `_thread.interrupt_main()` which
  can let the KBI escape trio's `KIManager` under fork-
  cascade teardown races and bubble out of pytest entirely
  — kills the whole session.

`trio.fail_after()` keeps cancellation inside the trio loop:
- Raises `TooSlowError` cleanly through the open-nursery's
  cancel cascade.
- Doesn't disturb any out-of-band signal/thread state.
- Failure stays scoped to the single test — no cross-test
  global state corruption either way.

Verified empirically: 10 hammer-runs of `test_dynamic_pub_sub`
go from 5/10 fail (with global-state poison) to 3/10 fail
(no poison, all sibling tests still pass). The ~30%
remaining flake rate is a genuine fork-cancel-cascade
hang — separate from this fix but no longer contaminates.

Module-level NOTE comment explains the rationale so future
readers don't re-introduce the bug.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 530160fa69)
2026-06-09 20:27:26 -04:00
Gud Boi 3315a8a292 Add opt-in `reap_subactors_per_test` fixture
Function-scoped, NON-autouse zombie-subactor reaper for
modules whose teardown is known-leaky enough to cascade-
fail every following test in a session.

Sibling to the autouse session-scoped `_reap_orphaned_subactors`. The
session-scoped one fires at session end — too late to save tests that
follow a hung/leaky test in the suite. The new fixture, opted into via
`pytestmark = pytest.mark.usefixtures(...)`, runs between tests in
a problem-module so a leftover subactor from test N can't squat on
registrar ports / UDS paths / shm segments needed by tests N+1,
N+2, ...

Intentionally NOT autouse — the fixture's presence on a module signals
"this module's teardown leaks; please root-cause instead of relying
forever on cleanup". A visibility-vs-convenience trade picked in favor
of the former.

Apply to `tests/test_infected_asyncio.py` since both recent full-suite
runs (parallel-tpt-proto + TCP-only) showed the cascade originating in
this file's KBI- and SIGINT-flavored tests under
`main_thread_forkserver`. Module-comment names the specific offenders so
future de-flake work has a starting point.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b376eb0332)
2026-06-09 20:27:26 -04:00
Gud Boi 5b08c6b034 Sweep `subint_forkserver` → `main_thread_forkserver` in code
After the variant-1 / variant-2 backend split, update remaining
string-match refs to the variant-1 backend so user-visible gates
+ skip-marks + comments name the working backend correctly:

- `tractor._root._DEBUG_COMPATIBLE_BACKENDS`: include
  `main_thread_forkserver`, drop the stub-only `subint_forkserver`
  entry.
- `tests/test_spawning.py::test_loglevel_propagated_to_subactor`:
  capfd-skip flips to `main_thread_forkserver`.
- `tests/test_infected_asyncio.py::test_sigint_closes_lifetime_stack`:
  xfail-condition flips to `main_thread_forkserver`.
- `tests/test_shm.py`: drop stale "broken on `main_thread_forkserver`"
  reason-text since the `mp.SharedMemory(track=False)`
  + resource-tracker monkey-patch in `.ipc._mp_bs` makes the tests pass;
  the skip-mark only fires on plain `subint` now.
- Comment / docstring sweep: `runtime._state`, `runtime._runtime`,
  `_testing.pytest`, `_subint.py`, `pyproject.toml`,
  `test_cancellation.py`, `test_registrar.py` — refs to variant-1
  backend updated.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 205382a39b)
(factored: dropped spawn-backend-only path: tractor/spawn/_subint.py)
2026-06-09 20:27:26 -04:00
Gud Boi 1f7403abc2 Wire `reg_addr` into `test_context_stream_semantics`
Same wire-up pattern as the prior `test_dynamic_pub_sub`
commit: each test that already pulled in `debug_mode`
now also pulls in `reg_addr` and passes
`registry_addrs=[reg_addr]` into `tractor.open_nursery()`,
so the suite's standard registry-addr conventions apply.

Tests touched:
- `test_started_misuse`
- `test_simple_context`
- `test_parent_cancels`
- `test_one_end_stream_not_opened`
- `test_maybe_allow_overruns_stream`
- `test_ctx_with_self_actor`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 66f1941f46)
2026-06-09 20:27:26 -04:00
Gud Boi ed00b75a7b Wire `test_dynamic_pub_sub` to standard fixtures
Pull in the `reg_addr`, `debug_mode`, and `test_log`
fixtures so this test follows the same conventions as
the rest of the suite:

- pass `registry_addrs=[reg_addr]` + `debug_mode` into
  `tractor.open_nursery()` (so `--tpdb` etc work).
- after the `pytest.raises` block, add `assert err` +
  `test_log.exception('Timed out AS EXPECTED')` so the
  expected timeout is logged explicitly instead of
  swallowed.

Also,
- drop whitespace-only blank lines around the
  `subs` param of `consumer()` and `ctx` param of
  `one_task_streams_and_one_handles_reqresp()`.
- promote `test_sigint_both_stream_types`'s one-line
  docstring to multi-line form.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 9b05f659b3)
2026-06-09 20:27:26 -04:00
Gud Boi 8daf8eeaca Bump `test_stale_entry_is_deleted`'s timeout to 30
Seems that when run in-suite it takes longer than the so-measured "happy
path" timing; better to have no suite-global interruption than to assert
a fast single test's run.

(cherry picked from commit 65fcfbf224)
2026-06-09 20:27:26 -04:00
Gud Boi 2e2977b74c Fix `SharedMemory` under `subint_forkserver`
Implements the resolution described in c99d475d's
`subint_forkserver_mp_shared_memory_issue.md` (now
updated with the resolution post-mortem). Two-part
fix that side-steps `mp.resource_tracker` entirely
rather than try to make it fork-safe — turns out
that's both simpler AND more correct given tractor
already SC-manages allocation lifetimes.

Deats,
- `tractor/ipc/_mp_bs.py::disable_mantracker()`: drop the
  `platform.python_version_tuple()[:-1] >= ('3', '13')` branch — patches
  now run unconditionally:
  * monkey-patch `mp.resource_tracker._resource_tracker` to a no-op
    `ManTracker` subclass (empty `register` / `unregister`
    / `ensure_running`).
  * return `partial(SharedMemory, track=False)` for the per-allocation
    opt-out.
  * belt + suspenders: even if something dodges the wrapper, the
    singleton can't talk to the inherited (broken) parent fd.

- `tractor/ipc/_shm.py::open_shm_list()`: drop the 3.13+ conditional
  skip of the unlink-callback; install a `try_unlink()` wrapper that
  swallows `FileNotFoundError` (sibling-already-cleaned race in
  shared-key setups). Without `mp.resource_tracker` doing it for us, we
  own the unlink; `actor.lifetime_stack` is the right place since
  tractor already controls actor lifecycle.

- `tests/test_shm.py`: uncomment-out `subint_forkserver` from the
  module-level skip-list (tests pass now). Inline comment cross-refs
  the two `_mp_bs` / `_shm` workarounds.

- `ai/conc-anal/subint_forkserver_mp_shared_memory_issue.md`: heavy
  rewrite — flips status from "open / unresolvable in tractor" to
  "resolved, kept as decision record". Adds Resolution section, "Why
  this is the right call" rationale (mp tracker is widely criticized;
  tractor already owns lifecycle), trade-offs (crash-leaked segments,
  lost mp leak warning), verification (7 passed under both
  `subint_forkserver` and `trio` backends), and upstream issue links

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit aa3e230926)
(factored: dropped subint_forkserver conc-anal doc update)
2026-06-09 20:27:26 -04:00
Gud Boi da0c457ff7 Document `SharedMemory` × `subint_forkserver` incompat
New `ai/conc-anal/` doc: `mp.SharedMemory` is
fork-without-exec unsafe — child inherits parent's
`resource_tracker` fd → EBADF on first shm op;
leaked `/shm_list` cascades `FileExistsError`
across parametrize variants. Canonical CPython
issue class, NOT a tractor bug. Includes two
longer-term mitigation paths (reset inherited
tracker fd vs migrate off `mp.shared_memory`).

Also, update `tests/test_shm.py`:
- comment out `subint_forkserver` from skip list
- rewrite reason with precise failure-mode
  descriptions + link to the analysis doc

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit c99d475d03)
(factored: dropped spawn-backend-only paths: ai/conc-anal/subint_forkserver_mp_shared_memory_issue.md)
2026-06-09 20:27:26 -04:00
Gud Boi 13053f9cbe Skip `test_loglevel_propagated_to_subactor` on subint forkserver too
(cherry picked from commit 2ca0f41e61)
2026-06-09 20:22:23 -04:00
Gud Boi a199aa5096 Wire `reg_addr` through infected-asyncio tests
Continues the hygiene pattern from de601676 (cancel tests) into
`tests/test_infected_asyncio.py`: many tests here were calling
`tractor.open_nursery()` w/o `registry_addrs=[reg_addr]` and thus racing
on the default `:1616` registry across sessions. Thread the
session-unique `reg_addr` through so leaked or slow-to-teardown
subactors from a prior test can't cross-pollute.

Deats,
- add `registry_addrs=[reg_addr]` to `open_nursery()`
  calls in suite where missing.
- `test_sigint_closes_lifetime_stack`:
  - add `reg_addr`, `debug_mode`, `start_method`
    fixture params
  - `delay` now reads the `debug_mode` param directly
    instead of calling `tractor.debug_mode()` (fires
    slightly earlier in the test lifecycle)
  - sanity assert
    `if debug_mode: assert tractor.debug_mode()` after nursery open
  - new print showing SIGINT target
    (`send_sigint_to` + resolved pid)
  - catch `trio.TooSlowError` around
    `ctx.wait_for_result()` and conditionally `pytest.xfail` when
    `send_sigint_to == 'child' and start_method == 'subint_forkserver'`;
    this is the known orphan-SIGINT limitation tracked in
    `ai/conc-anal/subint_forkserver_orphan_sigint_hang_issue.md`
- parametrize id typo fix: `'just_trio_slee'` → `'just_trio_sleep'`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b350aa09ee)
2026-06-09 20:22:23 -04:00
Gud Boi ba2e474d9d Import-or-skip `.devx.` tests requiring `greenback`
Which is for sure true on py3.14+ rn since `greenlet` didn't want to
build for us (yet).

(cherry picked from commit d6e70e9de4)
2026-06-09 20:22:23 -04:00
Gud Boi e4c7ac34db Default `pytest` to use `--capture=sys`
Lands the capture-pipe workaround from the prior cluster of diagnosis
commits: switch pytest's `--capture` mode from the default `fd`
(redirects fd 1,2 to temp files, which fork children inherit and can
deadlock writing into) to `sys` (only `sys.stdout` / `sys.stderr` — fd
1,2 left alone).

Trade-off documented inline in `pyproject.toml`:
- LOST: per-test attribution of raw-fd output (C-ext writes,
  `os.write(2, ...)`, subproc stdout). Still goes to terminal / CI
  capture, just not per-test-scoped in the failure report.
- KEPT: `print()` + `logging` capture per-test (tractor's logger uses
  `sys.stderr`).
- KEPT: `pytest -s` debugging behavior.

This allows us to re-enable `test_nested_multierrors` without
skip-marking + clears the class of pytest-capture-induced hangs for any
future fork-based backend tests.

Deats,
- `pyproject.toml`: `'--capture=sys'` added to `addopts` w/ ~20 lines of
  rationale comment cross-ref'ing the post-mortem doc

- `test_cancellation`: drop `skipon_spawn_backend('subint_forkserver')`
  from `test_nested_multierrors`; no longer needed.
  * file-level `pytestmark` covers any residual.

- `tests/spawn/test_subint_forkserver.py`: orphan-SIGINT test's xfail
  mark loosened from `strict=True` to `strict=False` + reason rewritten.
  * it passes in isolation but is session-env-pollution sensitive
    (leftover subactor PIDs competing for ports / inheriting harness
    FDs).
  * tolerate both outcomes until suite isolation improves.

- `test_shm`: extend the existing
  `skipon_spawn_backend('subint', ...)` to also skip
  `'subint_forkserver'`.
  * Different root cause from the cancel-cascade class:
    `multiprocessing.SharedMemory`'s `resource_tracker` + internals
    assume fresh-process state and don't survive fork-without-exec cleanly.

- `tests/discovery/test_registrar.py`: bump timeout 3→7s on one test
  (unrelated to forkserver; just a flaky-under-load bump).

- `tractor.spawn._subint_forkserver`: inline comment-only future-work
  marker right before `_actor_child_main()` describing the planned
  conditional stdout/stderr-to-`/dev/null` redirect for cases where
  `--capture=sys` isn't enough (no code change — the redirect logic
  itself is deferred).

EXTRA NOTEs
-----------
The `--capture=sys` approach is the minimally invasive fix: just a pytest
ini change, no runtime code change, works for all fork-based backends,
trade-offs well-understood (terminal-level capture still happens, just
not pytest's per-test attribution of raw-fd output).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4c133ab541)
(factored: dropped spawn-backend-only paths: tests/spawn/test_subint_forkserver.py + tractor/spawn/_subint_forkserver.py; the xfail-loosening bullet above no longer applies)
2026-06-09 20:22:23 -04:00
Gud Boi 828df7df79 Update `subint_forkserver` skip reason: capture-pipe
Refresh the `test_nested_multierrors` skip-mark
reason to the final diagnosis: the hang is pytest's
default `--capture=fd` pipe filling from high-volume
subactor traceback output inherited via fds 1,2 in
fork children — `pytest -s` passes cleanly. Records
the fix direction (redirect child stdio to
`/dev/null` in the fork-child prelude) for whoever
lands the backend.
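
A minimal sketch of that fix direction, assuming a POSIX `os.fork()`
child; the function name and byte count are hypothetical and this is not
the backend's actual prelude:

```python
import os


def fork_with_quiet_stdio(n_bytes: int = 1 << 20) -> int:
    pid = os.fork()
    if pid == 0:  # fork child
        code = 1
        try:
            # Point fds 1 and 2 at /dev/null so high-volume output can
            # never fill a capture pipe inherited from the parent.
            devnull = os.open(os.devnull, os.O_WRONLY)
            os.dup2(devnull, 1)
            os.dup2(devnull, 2)
            if devnull > 2:
                os.close(devnull)
            # Simulate noisy traceback output on the inherited stdout fd.
            os.write(1, b'x' * n_bytes)
            code = 0
        finally:
            # Skip interpreter teardown in the forked child.
            os._exit(code)

    # Parent: reap the child and report its exit code.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```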

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit eceed29d4a)
(factored: kept only the tests/test_cancellation.py skip-reason update of
 "Pin forkserver hang to pytest `--capture=fd`"; dropped the subint
 conc-anal doc + tests/spawn/test_subint_forkserver.py)
2026-06-09 20:21:58 -04:00
Gud Boi 555f64fdf2 Skip-mark `subint_forkserver` nested-multierror hang
Skip-mark the still-hanging
`test_nested_multierrors[subint_forkserver]` via
`@pytest.mark.skipon_spawn_backend('subint_forkserver',
reason=...)` so it stops blocking the test matrix
while the remaining bug is being chased. The mark is
an inert no-op until that (in-dev) backend lands.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 506617c695)
(factored: kept only the tests/test_cancellation.py skip-mark; dropped
 the subint_forkserver conc-anal doc update)
2026-06-09 20:20:29 -04:00
Gud Boi 9c3fc19f35 Wire `reg_addr` through leaky cancel tests
Stopgap companion to d0121960 (`subint_forkserver`
test-cancellation leak doc): five tests in
`tests/test_cancellation.py` were running against the
default `:1616` registry, so any leaked
`subint-forkserv` descendant from a prior test holds
the port and blows up every subsequent run with
`TooSlowError` / "address in use". Thread the
session-unique `reg_addr` fixture through so each run
picks its own port — zombies can no longer poison
other tests (they'll only cross-contaminate whatever
happens to share their port, which is now nothing).

Deats,
- add `reg_addr: tuple` fixture param to:
  - `test_cancel_infinite_streamer`
  - `test_some_cancels_all`
  - `test_nested_multierrors`
  - `test_cancel_via_SIGINT`
  - `test_cancel_via_SIGINT_other_task`
- explicitly pass `registry_addrs=[reg_addr]` to the
  two `open_nursery()` calls that previously had no
  kwargs at all (in `test_cancel_via_SIGINT` and
  `test_cancel_via_SIGINT_other_task`)
- add bounded `@pytest.mark.timeout(7, method='thread')`
  to `test_nested_multierrors` so a hung run doesn't
  wedge the whole session

Still doesn't close the real leak — the
`subint_forkserver` backend's `_ForkedProc.kill()` is
PID-scoped not tree-scoped, so grandchildren survive
teardown regardless of registry port. This commit is
just blast-radius containment until that fix lands.
See
`ai/conc-anal/subint_forkserver_test_cancellation_leak_issue.md`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 1af2121057)
2026-06-09 20:19:56 -04:00
Gud Boi 668ad69fd2 Mark `subint`-hanging tests with `skipon_spawn_backend`
Adopt the `@pytest.mark.skipon_spawn_backend('subint',
reason=...)` marker (a617b521) across the suites
reproducing the `subint` GIL-contention / starvation
hang classes doc'd in `ai/conc-anal/subint_*_issue.md`.

Deats,
- Module-level `pytestmark` on full-file-hanging suites:
  - `tests/test_cancellation.py`
  - `tests/test_inter_peer_cancellation.py`
  - `tests/test_pubsub.py`
  - `tests/test_shm.py`
- Per-test decorator where only one test in the file
  hangs:
  - `tests/discovery/test_registrar.py
    ::test_stale_entry_is_deleted` — replaces the
    inline `if start_method == 'subint': pytest.skip`
    branch with a declarative skip.
  - `tests/test_subint_cancellation.py
    ::test_subint_non_checkpointing_child`.
- A few per-test decorators are left commented out in place
  as breadcrumbs for later finer-grained unskips.

Also, some nearby tidying in the affected files:
- Annotate loose fixture / test params
  (`pytest.FixtureRequest`, `str`, `tuple`, `bool`) in
  `tests/conftest.py`, `tests/devx/conftest.py`, and
  `tests/test_cancellation.py`.
- Normalize `"""..."""` → `'''...'''` docstrings per
  repo convention on a few touched tests.
- Add `timeout=6` / `timeout=10` to
  `@tractor_test(...)` on `test_cancel_infinite_streamer`
  and `test_some_cancels_all`.
- Drop redundant `spawn_backend` param from
  `test_cancel_via_SIGINT`; use `start_method` in the
  `'mp' in ...` check instead.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4b2a0886c3)
(factored: dropped spawn-backend-only path: tests/test_subint_cancellation.py)
2026-06-09 20:19:26 -04:00
Gud Boi 33f1257721 Skip `test_stale_entry_is_deleted` hanger with `subint`s
(cherry picked from commit 985ea76de5)
2026-06-09 20:19:11 -04:00
Gud Boi 154cba86ac Wall-cap `test_stale_entry_is_deleted` via `pytest-timeout`
Add a hard process-level wall-clock bound on a test
known to wedge un-Ctrl-C-ably under an in-dev spawn
backend, so an unattended suite run can't hang
indefinitely.

Deats,
- New `testing` dep: `pytest-timeout>=2.3`.
- `test_stale_entry_is_deleted`:
  `@pytest.mark.timeout(3, method='thread')`. The
  `method='thread'` choice is deliberate —
  `method='signal'` routes via `SIGALRM` which can be
  starved by the same GIL-hostage path that drops
  `SIGINT`, so it'd never actually fire in the
  starvation case.

At timeout, `pytest-timeout` hard-kills the pytest
process itself — that's the intended behavior here;
the alternative is the suite never returning.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 189f4e3f72e9f1eda5d24bcbab5743f7e35bd913)
(factored: kept pyproject + tests/discovery/test_registrar.py parts of
 "Wall-cap `subint` audit tests via `pytest-timeout`"; dropped
 tests/test_subint_cancellation.py)
2026-06-09 20:19:11 -04:00
Gud Boi d60cf23659 Arm `dump_on_hang` on `test_stale_entry_is_deleted`
Wrap the test's `trio.run(main)` in
`dump_on_hang(seconds=20)` so any future hang
regression captures a stack dump for triage instead
of wedging CI silently; under the default backends
it's a no-op safety net.
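
For readers unfamiliar with the pattern, a hang-dump guard can be
sketched with the stdlib `faulthandler` module. This is an assumption
about the general technique, not the actual `dump_on_hang`
implementation; the `_sketch` name and the `file` parameter are
hypothetical.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def dump_on_hang_sketch(seconds: float = 20, file=None):
    # `file` must expose a real `fileno()`; default to the process stderr.
    file = file if file is not None else sys.stderr
    # Arm a watchdog: if the body is still running after `seconds`, dump
    # every thread's traceback to `file` without exiting the process.
    faulthandler.dump_traceback_later(
        seconds,
        repeat=False,
        file=file,
        exit=False,
    )
    try:
        yield
    finally:
        # Body finished (or raised) in time: disarm so nothing is dumped.
        faulthandler.cancel_dump_traceback_later()
```

When the wrapped body returns before the deadline the guard writes
nothing, which matches the "no-op safety net" behavior described above.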

Includes a "KNOWN ISSUE" comment block documenting
the (future) `subint` backend hang classes observed
against this test during Phase B bringup (#379).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4a3254583b)
(factored: kept only the tests/discovery/test_registrar.py part of
 "Doc `subint` backend hang classes + arm `dump_on_hang`"; dropped
 subint conc-anal docs + tests/test_subint_cancellation.py)
2026-06-09 20:18:44 -04:00
Gud Boi 9157f58c15 Skip `.ipc._ringbuf` import when no `cffi`
(cherry picked from commit 03bf2b931e)
2026-06-09 20:17:32 -04:00
Gud Boi 4052c5b562 Handle py3.14+ incompats as test skips
Since we're devving subints we require the 3.14+ stdlib API
and a couple compiled libs don't support it yet, namely:
- `cffi`, which we're only using for the `.ipc._linux` eventfd
  stuff (now factored into `hotbaud` anyway).
- `greenback`, which requires `greenlet` which doesn't seem to be
  wheeled yet
  * on nixos the sdist build was failing due to lack of `g++` which
    i don't care to figure out rn since we don't need `.devx` stuff
    immediately for this subints prototype.
  * [ ] we still need to adjust any dependent suites to skip.

Adjust `test_ringbuf` to skip on import failure.

Also project wide,
- pin us to py 3.13+ in prep for last-2-minor-version policy.
- drop `msgspec>=0.20.0`, the first release with py3.14 support.

(cherry picked from commit d2ea8aa2de)
2026-06-09 20:17:20 -04:00
Gud Boi 3867403fab Scale `test_open_local_sub_to_stream` timeout by CPU factor
Import and apply `cpu_scaling_factor()` from
`conftest`; bump base from 3.6 -> 4 and multiply
through so CI boxes with slow CPUs don't flake.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-16 20:03:32 -04:00
Gud Boi ed65301d32 Fix misc bugs caught by Copilot review
Deats,
- use `proc.poll() is None` in `sig_prog()` to
  distinguish "still running" from exit code 0;
  drop stale `breakpoint()` from fallback kill
  path (would hang CI).
- add missing `raise` on the `RuntimeError` in
  `async_main()` when no tpt bind addrs given.
- clean up stale uid entries from the registrar
  `_registry` when addr eviction empties the
  addr list.
- update `discovery.__init__` docstring to match
  the new eager `._multiaddr` import.
- fix `registar` -> `registrar` typo in teardown
  report log msg.
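
The `proc.poll() is None` point from the first bullet, shown with plain
`subprocess` (a sketch; `is_running()` is a hypothetical helper, not the
`sig_prog()` code):

```python
import subprocess
import sys


def is_running(proc: subprocess.Popen) -> bool:
    # `poll()` returns None only while the child is alive; a clean exit
    # returns 0, which is falsy but is NOT "still running".
    return proc.poll() is None


# Spawn a child that exits immediately with code 0, then reap it.
proc = subprocess.Popen([sys.executable, '-c', 'pass'])
proc.wait()

# A truthiness check conflates exit code 0 with "no exit code yet".
buggy_still_running = not proc.poll()
correct_still_running = is_running(proc)
```

Here the truthiness check reports the exited child as still running,
while the `is None` check does not.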

Review: PR #429 (Copilot)
https://github.com/goodboy/tractor/pull/429

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
Gud Boi cd287c7e93 Fix `test_registrar_merge_binds_union` for UDS collision
`get_random()` can produce the same UDS filename for a given
pid+actor-state, so the "disjoint addrs" premise doesn't always hold.
Gate the `len(bound) >= 2` assertion on whether the registry and bind
addrs actually differ via `expect_disjoint`.

Also,
- drop unused `partial` import

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:15 -04:00
Gud Boi 86d4e0d3ed Harden `sig_prog()` retries, adjust debugger test timeouts
Retry signal delivery in `sig_prog()` up to `tries`
times (default 3) w/ `canc_timeout` sleep between
attempts; only fall back to `_KILL_SIGNAL` after all
retries exhaust. Bump default timeout 0.1 -> 0.2.

Also,
- `test_multi_nested_subactors_error_through_nurseries`
  gives the first prompt iteration a 5s timeout even
  on linux bc the initial crash sequence can be slow
  to arrive at a `pdb` prompt
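
The retry-then-kill flow can be sketched roughly as below. The parameter
names `tries` and `canc_timeout` come from this message; the function
name, the `kill_sig` parameter, and the POSIX-only `SIGKILL` default are
assumptions for illustration, not the real `sig_prog()` signature.

```python
import signal
import subprocess
import time


def signal_with_retries(
    proc: subprocess.Popen,
    sig: int = signal.SIGINT,
    tries: int = 3,
    canc_timeout: float = 0.2,
    kill_sig: int = signal.SIGKILL,  # POSIX-only stand-in for `_KILL_SIGNAL`
) -> int:
    for _ in range(tries):
        # Stop retrying as soon as the child has actually exited.
        if proc.poll() is not None:
            break
        proc.send_signal(sig)
        # Give the child a moment to act on the signal before re-checking.
        time.sleep(canc_timeout)

    # Only after every retry is exhausted fall back to the hard kill.
    if proc.poll() is None:
        proc.send_signal(kill_sig)

    # Reap the child and hand back its exit code.
    return proc.wait()
```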

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi c3d6cc9007 Rename `discovery._discovery` to `._api`
Adjust all imports to match.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi e90241baaa Add `parse_endpoints()` to `_multiaddr`
Provide a service-table parsing API for downstream projects (like
`piker`) to declare per-actor transport bind addresses as a config map
of actor-name -> multiaddr strings (e.g. from a TOML `[network]`
section).

Deats,
- `EndpointsTable` type alias: input `dict[str, list[str|tuple]]`.
- `ParsedEndpoints` type alias: output `dict[str, list[Address]]`.
- `parse_endpoints()` iterates the table and delegates each entry to the
  existing `tractor.discovery._discovery.wrap_address()` helper, which
  handles maddr strings, raw `(host, port)` tuples, and pre-wrapped
  `Address` objs.
- UDS maddrs use the multiaddr spec name `/unix/...` (not tractor's
  internal `/uds/` proto_key)
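
A self-contained sketch of the table-to-addresses shape described above.
Everything here is a stand-in: `TcpAddr`, `wrap_address_sketch()` and
`parse_endpoints_sketch()` are hypothetical names, only TCP maddrs are
handled, and the string splitting is a simplification rather than the
real multiaddr parsing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpAddr:
    # Hypothetical stand-in for a wrapped TCP `Address` type.
    host: str
    port: int


def wrap_address_sketch(entry) -> TcpAddr:
    # Pre-wrapped address objects pass through untouched.
    if isinstance(entry, TcpAddr):
        return entry
    # Raw `(host, port)` tuples get wrapped directly.
    if isinstance(entry, tuple):
        host, port = entry
        return TcpAddr(host, int(port))
    # Otherwise expect a maddr string like `/ip4/127.0.0.1/tcp/1234`.
    parts = entry.strip('/').split('/')
    if len(parts) == 4 and parts[0] in ('ip4', 'ip6') and parts[2] == 'tcp':
        return TcpAddr(parts[1], int(parts[3]))
    raise ValueError(f'Unsupported multiaddr: {entry!r}')


def parse_endpoints_sketch(table: dict) -> dict:
    # Map each actor name to its list of wrapped addresses.
    return {
        name: [wrap_address_sketch(entry) for entry in entries]
        for name, entries in table.items()
    }
```

A table such as `{'registry': ['/ip4/127.0.0.1/tcp/1616']}` (actor name
and port made up here) would map to a one-element list of wrapped
addresses, and an actor with an empty list stays empty.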

Also add new tests,
- 7 new pure unit tests (no trio runtime): TCP-only, mixed tpts,
  unwrapped tuples, mixed str+tuple, unsupported proto (`/udp/`),
  empty table, empty actor list
- all 22 multiaddr tests pass rn.

Prompt-IO:
ai/prompt-io/claude/20260413T205048Z_269d939c_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 7079a597c5 Add `test_tpt_bind_addrs.py` + fix type-mixing bug
Add 9 test variants (6 fns) covering all three
`tpt_bind_addrs` code paths in `open_root_actor()`:
- registrar w/ explicit bind (eq, subset, disjoint)
- non-registrar w/ explicit bind (same/diff
  bindspace) using `daemon` fixture
- non-registrar default random bind (baseline)
- maddr string input parsing
- registrar merge produces union
- `open_nursery()` forwards `tpt_bind_addrs`

Fix type-mixing bug at `_root.py:446` where the
registrar merge path did `set(Address + tuple)`,
preventing dedup and causing double-bind `OSError`.
Wrap `uw_reg_addrs` before the set union so both
sides are `Address` objs.
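
Why the mixed set fails to dedup can be reproduced in isolation. In this
sketch `Addr` and `wrap()` are hypothetical stand-ins for the real
`Address` type and wrapping helper, and the port is made up:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Addr:
    # Hypothetical stand-in for a wrapped address type.
    host: str
    port: int


def wrap(addr) -> Addr:
    # Leave wrapped addresses alone; wrap raw `(host, port)` tuples.
    return addr if isinstance(addr, Addr) else Addr(*addr)


bind_addrs = [Addr('127.0.0.1', 1616)]
uw_reg_addrs = [('127.0.0.1', 1616)]  # same endpoint, still unwrapped

# Mixed types: an `Addr` never compares equal to a tuple, so the same
# endpoint appears twice and would be bound twice.
mixed = set(bind_addrs) | set(uw_reg_addrs)

# Wrapping first makes both sides the same type, so the union dedups.
merged = set(bind_addrs) | {wrap(addr) for addr in uw_reg_addrs}
```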

Also,
- add prompt-io output log for this session
- stage original prompt input for tracking

Prompt-IO: ai/prompt-io/claude/20260413T192116Z_f851f28_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi f881683c97 Tweak timeouts and rm `arbiter_addr` in tests
Use `cpu_scaling_factor()` headroom in
`test_peer_spawns_and_cancels_service_subactor`'s `fail_after` to avoid
flaky timeouts on throttled CI runners. Rename `arbiter_addr=` ->
`registry_addrs=[..]` throughout `test_spawning` and
`test_task_broadcasting` suites to match the current `open_root_actor()`
/ `open_nursery()` API.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 490fac432c Preserve absolute UDS paths in `parse_maddr()`
Drop the `.lstrip('/')` on the unix protocol value
so the lib-prepended `/` restores the absolute-path
semantics that `mk_maddr()` strips when encoding.
Pass `Path` components (not `str`) to `UDSAddress`.
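
The effect of the extra `.lstrip('/')` can be modeled with plain strings
and `pathlib`. This is a sketch under stated assumptions: the encode and
decode steps below only imitate the behavior described above (strip on
encode, `/` prepended on decode), the socket filename is hypothetical,
and POSIX paths are assumed.

```python
from pathlib import Path

# Hypothetical absolute socket path.
sockpath = Path('/tmp/tractor_test/registry.sock')

# Encoding side: drop the leading '/' so the maddr has no '//' in it.
encoded = '/unix/' + str(sockpath).lstrip('/')

# Decoding side: assume the parser hands the value back with '/' prepended.
value = '/' + encoded.removeprefix('/unix/')

# Old behavior: stripping again turns the path into a relative one.
buggy = Path(value.lstrip('/'))

# New behavior: keep the value as-is so the absolute path round-trips.
fixed = Path(value)

# Split into `Path` components rather than passing plain strings around.
filedir, filename = fixed.parent, Path(fixed.name)
```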

Also, update all UDS test params to use absolute
paths (`/tmp/tractor_test/...`, `/tmp/tractor_rt/...`)
matching real runtime sockpath behavior; tighten
`test_parse_maddr_uds` to assert exact `filedir`.

Review: PR #429 (copilot-pull-request-reviewer[bot])
https://github.com/goodboy/tractor/pull/429#pullrequestreview-4018448152

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 5c4438bacc Add `parse_maddr()` tests + registrar maddr integ test
Cover `parse_maddr()` with unit tests for tcp/ipv4,
tcp/ipv6, uds, and unsupported-protocol error paths,
plus full `addr -> mk_maddr -> str -> parse_maddr`
roundtrip verification.

Adds,
- a `_maddr_to_tpt_proto` inverse-mapping assertion.
- a `wrap_address()` maddr-string acceptance test.
- a `test_reg_then_unreg_maddr` end-to-end suite which audits passing
  the registry addr as multiaddr str through the entire runtime.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 1f1e09a786 Move `test_discovery` to `tests/discovery/test_registrar`
All tests are registrar-actor integration scenarios
sharing intertwined helpers + `enable_modules=[__name__]`
task fns, so keep as one mod but rename to reflect
content. Now lives alongside `test_multiaddr.py` in
the new `tests/discovery/` subpkg.

Also,
- update 5 refs in `/run-tests` SKILL.md to match
  the new path
- add `discovery/` subdir to the test directory
  layout tree

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00
Gud Boi 7cf3b5d00d Add `test_multiaddr.py` suite for `mk_maddr()`
Cover `_tpt_proto_to_maddr` mapping, TCP (ipv4/ipv6),
UDS, unsupported `proto_key` error, and round-trip
re-parse for both transport types.

Deats,
- new `tests/discovery/` subpkg w/ empty `__init__.py`
- `test_tpt_proto_to_maddr_mapping`: verify `tcp` and
  `uds` entries
- `test_mk_maddr_tcp_ipv4`: full assertion on
  `/ip4/127.0.0.1/tcp/1234` incl protocol iteration
- `test_mk_maddr_tcp_ipv6`: verify `/ip6/::1/tcp/5678`
- `test_mk_maddr_uds`: relative `filedir` bc the
  multiaddr parser rejects double-slash from abs paths
- `test_mk_maddr_unsupported_proto_key`: `ValueError`
  on `proto_key='quic'` via `SimpleNamespace` mock
- `test_mk_maddr_roundtrip`: parametrized over tcp +
  uds, re-parse `str(maddr)` back through `Multiaddr`

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-04-14 19:54:14 -04:00