tractor

Commit Graph

Author	SHA1	Message	Date
Gud Boi	91f2f3ec10	Use test-harness `loglevel` in inter-peer suite	2026-02-19 16:29:20 -05:00
Gud Boi	fa86269e30	Stuff from auto-review in https://github.com/goodboy/tractor/pull/412 ..	2026-02-19 16:20:21 -05:00
Gud Boi	9470815f5a	Fix `spawn` fixture cleanup + test assertions Improve the `spawn` fixture teardown logic in `tests/devx/conftest.py` fixing the while-else bug, and fix `test_advanced_faults` genexp for `TransportClosed` exc type checking. Deats, - replace broken `while-else` pattern with direct `if ptyproc.isalive()` check after the SIGINT loop. - fix undefined `spawned` ref -> `ptyproc.isalive()` in while condition. - improve walrus expr formatting in timeout check (multiline style). Also fix `test_ipc_channel_break_during_stream()` assertion, - wrap genexp in `all()` call so it actually checks all excs are `TransportClosed` instead of just creating an unused generator. (this patch was suggested by copilot in, https://github.com/goodboy/tractor/pull/411) (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-19 16:14:11 -05:00
Gud Boi	592d918394	Tweak `test_inter_peer_cancellation` for races Adjust `basic_echo_server()` default sequence len to avoid the race where the `tell_little_bro()` finishes streaming before the echo-server sub is cancelled by its peer subactor (which is the whole thing we're testing!). Deats, - bump `rng_seed` default from 50 -> 100 to ensure peer cancel req arrives before echo dialog completes on fast hw. - add `trio.sleep(0.001)` between send/receive in msg loop on the "client" streamer side to give cancel request transit more time to arrive. Also, - add more native `tractor`-type hints. - reflow `basic_echo_server()` doc-string for 67 char limit - add masked `pause()` call with comment about unreachable code path - alphabetize imports: mv `current_actor` and `open_nursery` below typed imports (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-19 15:24:42 -05:00
Gud Boi	50f40f427b	Include `TransportClosed` in tpt-layer err handling Add `TransportClosed` to except clauses where `trio`'s own resource-closed errors are already caught, ensuring our higher-level tpt exc is also tolerated in those same spots. Likely i will follow up with a removal of the `trio` variants since most should be caught and re-raised as tpt-closed out of the `.ipc` stack now? Add `TransportClosed` to various handler blocks, - `._streaming.MsgStream.aclose()/.send()` except blocks. - the broken-channel except in `._context.open_context_from_portal()`. - obvi import it where necessary in those ^ mods. Adjust `test_advanced_faults` suite + exs-script to match, - update `ipc_failure_during_stream.py` example to catch `TransportClosed` alongside `trio.ClosedResourceError` in both the break and send-check paths. - shield the `trio.sleep(0.01)` after tpt close in example to avoid taskc-raise/masking on that checkpoint since we want to simulate waiting for a user to send a KBI. - loosen `ExceptionGroup` assertion to `len(excs) <= 2` and ensure all excs are `TransportClosed`. - improve multi-line formatting, minor style/formatting fixes in condition expressions. (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-19 13:55:02 -05:00
Gud Boi	bf6de55865	Improve tpt-closed msg-fmt/content and CRE case matching Refine tpt-error reporting to include closure attribution (`'locally'` vs `'by peer'`), tighten match conditions and reduce needless newlines in exc reprs. Deats, - factor out `trans_err_msg: str` and `by_whom: str` into a `dict` lookup before the `match:` block to pair specific err msgs to closure attribution strings. - use `by_whom` directly as `CRE` case guard condition (truthy when msg matches known underlying CRE msg content). - conveniently include `by_whom!r` in `TransportClosed` message. - fix `'locally ?'` -> `'locally?'` in send-side `CRE` handler (drop errant space). - add masked `maybe_pause_bp()` calls at both `CRE` sites (from when i was tracing a test harness issue where the UDS socket path wasn't being cleaned up on teardown). - drop trailing `\n` from `body=` args to `TransportClosed`. - reuse `trans_err_msg` for the `BRE`/broken-pipe guard. Also adjust testing, namely `test_ctxep_pauses_n_maybe_ipc_breaks`'s expected patts-set for new msg formats to be raised out of `.ipc._transport`. (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-19 13:55:02 -05:00
Gud Boi	7145fa364f	Add `SIGINT` cleanup to `spawn` fixture in `devx/conftest` Convert `spawn` fixture to a generator and add post-test graceful subproc cleanup via `SIGINT`/`SIGKILL` to avoid leaving stale `pexpect` child procs around between test runs as well as any UDS-tpt socket files under the system runtime-dir. Deats, - convert `return _spawn` -> `yield _spawn` to enable post-yield teardown logic. - add a new `nonlocal spawned` ref so teardown logic can access the last spawned child from outside the delivered spawner fn-closure. - add `SIGINT`-loop after yield with 5s timeout, then `SIGKILL` if proc still alive. - add masked `breakpoint()` and TODO about UDS path cleanup (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-19 13:55:02 -05:00
Gud Boi	6cab363c51	Catch-n-fail on stale `_root_addrs` state.. Turns out we aren't clearing the `._state._runtime_vars` entries in between `open_root_actor` calls.. This test refinement catches that by adding runtime-vars asserts on the expected root-addrs value; ensure `_runtime_vars['_root_addrs']` ONLY matches the values provided by the test's CURRENT root actor. This causes a failure when the (just added) `test_non_registrar_spawns_child` is run as part of the module suite, though it's fine when run standalone.	2026-02-11 22:17:26 -05:00
Gud Boi	cdcc1b42fc	Add test for non-registrar root sub-spawning Ensure non-registrar root actors can spawn children and that those children receive correct parent contact info. This test catches the bug reported in, https://github.com/goodboy/tractor/issues/410 Add new `test_non_registrar_spawns_child()` which spawns a sub-actor from a non-registrar root and verifies the child can manually connect back to its parent using `get_root()` API, auditing `._state._runtime_vars` addr propagation from rent to child. Also, - improve type hints throughout test suites (`subprocess.Popen`, `UnwrappedAddress`, `Aid` etc.) - rename `n` -> `an` for actor nursery vars - use multiline style for function signatures (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 22:17:26 -05:00
Gud Boi	5850844297	Mk `test_implicit_mod_name_applied_for_child()` check init-mods Test pkg-level init module and sub-pkg module logger naming to better validate auto-naming logic. Deats, - create `pkg_init_mod` and write `mod_code` to it for testing pkg-level `__init__.py` logger instance creation. * assert `snakelib.__init__` logger name is `proj_name`. - write `mod_code` to `subpkg/__init__.py` as well and check the same. Also, - rename some vars, * `pkg_mod` -> `pkg_submod`, * `pkgmod` -> `subpkgmod` - add `ModuleType` import for type hints - improve comments explaining pkg init vs first-level sub-module naming expectations. - drop trailing whitespace and unused TODO comment - remove masked `breakpoint()` call (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 21:43:37 -05:00
Gud Boi	d61e8caab2	Improve `test_log_sys` for new auto-naming logic Add assertions and comments to better test the reworked implicit module-name detection in `get_logger()`. Deats, - add `assert not tractor.current_actor()` check to verify no runtime is active during test. - import `.log` submod directly for use. - add masked `breakpoint()` for debugging mod loading. - add comment about using `ranger` to inspect `testdir` layout of auto-generated py pkg + module-files. - improve comments explaining pkg-root-log creation. - add TODO for testing `get_logger()` call from pkg `__init__.py` - add comment about first-pkg-level module naming. (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 21:05:07 -05:00
Gud Boi	0b0c83e9da	Drop `name=__name__` from all `get_logger()` calls Use new implicit module-name detection throughout codebase to simplify logger creation and leverage auto-naming from caller mod. Main changes, - drop `name=__name__` arg from all `get_logger()` calls (across 29 modules). - update `get_console_log()` calls to include `name='tractor'` for enabling root logger in test harness and entry points; this ensures logic in `get_logger()` triggers so that all `tractor`-internal logging emits to console. - add info log msg in test `conftest.py` showing test-harness log level Also, - fix `.actor.uid` ref to `.actor.aid.uid` in `._trace`. - adjust a `._context` log msg formatting for clarity. - add TODO comments in `._addr`, `._uds` for when we mv to using `multiaddr`. - add TODO for `RuntimeVars` type hint in `.msg.types` (once we eventually get that all going obvi!) (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code	2026-02-11 21:04:49 -05:00
Gud Boi	edf1189fe0	Multi-line styling in `test.devx.conftest`	2026-02-11 21:04:22 -05:00
Tyler Goodlet	de24bfe052	Mv `load_module_from_path()` to a new `._code_load` submod	2026-02-11 21:03:29 -05:00
Tyler Goodlet	e235b96894	Use new `pkg_name` in log-sys test suites	2026-02-11 21:03:07 -05:00
Tyler Goodlet	557e2cec6a	Add an implicit-pkg-path-as-logger-name test A bit of test driven dev to anticipate support of `.log.get_logger()` usage such that it can be called from arbitrary sub-modules, themselves embedded in arbitrary sub-pkgs, of some project; when not provided, the `sub_name` passed to the `Logger.getChild(<sub_name>)` will be set as the sub-pkg path "down to" the calling module. IOW if you call something like, `log = tractor.log.get_logger(pkg_name='mypylib')` from some `submod.py` in a project-dir that looks like, mypylib/ mod.py subpkg/ submod.py <- calling module the `log: StackLevelAdapter` child-`Logger` instance will have a `.name: str = 'mypylib.subpkg'`, excluding the `submod` part since this is already rendered as the `{filename}` header in `log.LOG_FORMAT`. Previously similar behaviour would be obtained by passing `get_logger(name=__name__)` in the calling module and so much so it motivated me to make this the default, presuming we can introspect for the info. Impl deats, - duplicated a `load_module_from_path()` from `modden` to load the `testdir` rendered py project dir from its path. \|_should prolly factor it down to this lib anyway bc we're going to need it for hot code reload? (well that and `watchfiles` Bp) - in each of `mod.py` and `submod.py` render the `get_logger()` code sin `name`, expecting the (coming shortly) implicit introspection feat to do this. - do `.name` and `.parent` checks against expected sub-logger values from `StackLevelAdapter.logger.getChildren()`.	2026-02-11 21:03:07 -05:00
Tyler Goodlet	0e3229f16d	Start a logging-sys unit-test module To start ensuring that when `name=__name__` is passed we try to de-duplicate the `_root_name` and any `leaf_mod: str` since it's already included in the headers as `{filename}`. Deats, - heavily document the de-duplication `str.partition()`s in `.log.get_logger()` and provide the end fix by changing the predicate, `if rname == 'tractor':` -> `if rname == _root_name`. * also toss in some warnings for when we still detect duplicates. - add todo comments around logging "filters" (vs. our "adapter"). - create the new `test_log_sys.test_root_pkg_not_duplicated()` which runs green with the fixes from ^. - add a ton of test-suite todos both for existing and anticipated logging sys feats in the new mod.	2026-02-11 21:03:07 -05:00
Tyler Goodlet	9f757ffa63	Woops, fix missing `assert` thanks to copilot	2025-09-11 13:13:18 -04:00
Tyler Goodlet	73423ef2b7	Timeout on `test_peer_spawns_and_cancels_service_subactor` While working on a fix to the hang case found from `test_cancel_ctx_with_parent_side_entered_in_bg_task` an initial solution caused this test to hang indefinitely; solve it with a small wrapping `_main()` + `trio.fail_after()` entrypoint. Further suite refinements, - move the top-most `try:`->`else:` block - toss in a masked base-exc block for tracing unexpected `ctx.wait_for_result()` outcomes. - tweak the `raise_sub_spawn_error_after` to be an optional `float` which scales the `rng_seed: int = 50` msg counter to `tell_little_bro()` so that the abs value to the `range()` can be changed.	2025-09-11 10:13:04 -04:00
Tyler Goodlet	9489a2f84d	Add timeout around `test_peer_spawns_and_cancels_service_subactor` suite	2025-09-11 10:13:04 -04:00
Tyler Goodlet	92eaed6fec	Parametrize with `Portal.cancel_actor()` only case Such that when `maybe_context.cancel()` is not called (explicitly) and only the subactor is cancelled by its parent we expect to see a ctxc raised both from any call to `Context.wait_for_result()` and out of the `Portal.open_context()` scope, up to the `trio.run()`. Deats, - obvi call-n-catch the ctxc (in scope) for the oob-only subactor-cancelled case. - add branches around `trio.run()` entry to match.	2025-09-11 10:13:04 -04:00
Tyler Goodlet	217d54b9d1	Add the minimal OoB cancel edge case from #391 Discovered while writing a `@context` sanity test to verify unmasker ignore-cases support. Masked code is due to the process of finding the minimal example causing the original hang discovered in the original examples script. Details are in the test-fn doc strings and surrounding comments; more refinement and cleanup coming obviously. Also moved over the self-cancel todos from the inter-peer tests module.	2025-09-11 10:13:04 -04:00
Tyler Goodlet	62a364a1d3	Tweaks from copilot, type fix, typos, language.	2025-09-11 10:01:25 -04:00
Tyler Goodlet	9c6b90ef04	Add an ignore-masking-case script + suite Demonstrating the guilty `trio.Lock.acquire()` impl which puts a checkpoint inside its `trio.WouldBlock` handler and which will always appear to mask the "sync path" case on (graceful) cancellation. This first script draft demos the issue from within a `tractor.context` ep bc that's where it was orig discovered, however i'm going to factor out the `tractor` code and instead just use a `.trionics.maybe_raise_from_masking_exc()` to demo its low-level ignore-case feature. Further, this script exposed a previously unhandled remote graceful cancellation case which hangs: - parent actor spawns child and opens >1 ctxs with it, - the parent then OoB (out-of-band) cancels the child actor (with `Portal.cancel_actor()`), - since the open ctxs raise a ctxc with a `.canceller == parent.uid` the `Context._is_self_cancelled()` will eval `True`, - the `Context._scope` will NOT be cancelled in `._maybe_cancel_and_set_remote_error()` resulting in any bg-task which is waiting on a `Portal.open_context()` to not be cancelled/unblocked. So my plan is to factor this ^^ scenario into a standalone unit test as well as another test which consumes from a low-level `trio`-only version of this script-scenario to sanity check the interaction of the unmasker-with-ignore-cases usage implicitly around a ctx ep.	2025-09-06 14:03:02 -04:00
Tyler Goodlet	542d4c7840	Ignore `examples/trio/` in docs-examples test suite	2025-09-06 13:39:08 -04:00
Tyler Goodlet	04c3d5e239	Wrap `send_chan_aclose_masks_beg.py` as test suite Call it `test_trioisms::test_unmask_aclose_as_checkpoint_on_aexit` and parametrize all script-mod`.main()` toggles including `.xfails()` for the `raise_unmasked=False` cases.	2025-09-05 18:46:20 -04:00
Tyler Goodlet	25c5847f2e	Drop `tn` input from `maybe_raise_from_masking_exc()` Including all caller usage throughout. Moving to a non-`except*` impl means it's never needed as a signal from the caller - we can just catch the beg outright (like we should have always been doing)..	2025-08-20 12:45:49 -04:00
Tyler Goodlet	d17864a432	Adjust test suites to new `maybe_raise_from_masking_exc()` changes	2025-08-20 12:45:49 -04:00
Tyler Goodlet	ee32bc433c	Add a root-already-cancelled crash handling test Such that we audit the `shield=root_tn.cancel_scope.cancel_called,` passed to `await debug._maybe_enter_pm()` in the `open_root_actor()` exit handler block.	2025-08-20 10:18:52 -04:00
Tyler Goodlet	6e4c76245b	Add LoC pattern matches for `test_post_mortem_api`	2025-08-19 14:14:27 -04:00
Tyler Goodlet	b74e93ee55	Change one infected-aio test to use `chan` in fn sig	2025-08-18 22:32:51 -04:00
Tyler Goodlet	4a7491bda4	Add "raises-pre-started" `open_channel_from()` test Verifying that if any exc is raised pre `chan.send_nowait()` (our currently shite version of a `chan.started()`) then that exc is indeed raised through on the `trio`-parent task side. This case was reproduced from a `piker.brokers.ib` issue with a similar embedded `.trionics.maybe_open_context()` call. Deats, - call the suite `test_aio_side_raises_before_started`. - mk the `@context` simply `maybe_open_context(acm_func=open_channel_from)` with a `target=raise_before_started` which, - simply sleeps then immediately raises a RTE. - expect the RTE from the aio-child-side to propagate all the way up to the root-actor's task right up through the `trio.run()`.	2025-08-18 22:32:51 -04:00
Tyler Goodlet	79f502034f	Don't hard code runtime-dir, read it with `._state.get_rt_dir()`	2025-08-18 21:30:48 -04:00
Tyler Goodlet	00112edd58	UDS: implicitly create `Address.bindspace: Path` Since it's merely a local-file-sys subdirectory and there should be no reason file creation conflicts with other bind spaces. Also add 2 test suites to match, - `tests/ipc/test_each_tpt::test_uds_bindspace_created_implicitly` to verify the dir creation when DNE. - `..test_uds_double_listen_raises_connerr` to ensure a double bind raises a `ConnectionError` from the src `OSError`.	2025-08-18 21:30:48 -04:00
Tyler Goodlet	4ba3590450	Add `.trionics.maybe_open_context()` locking test Call it `test_lock_not_corrupted_on_fast_cancel()` and includes a detailed doc string to explain. Implemented it "cleverly" by having the target `@acm` cancel its parent nursery after a peer, cache-hitting task, is already waiting on the task mutex release.	2025-08-18 21:07:12 -04:00
Tyler Goodlet	70664b98de	Well then, I guess it just needed, a checkpoint XD Here I was thinking the bcaster (usage) maybe required a rework but, NOPE it's just bc a checkpoint was needed in the parent task owning the `tn` which spawns `get_sub_and_pull()` tasks to ensure the bg allocated `an`/portal is eventually cancel-called.. Ah well, at least i started a patch for `MsgStream.subscribe()` to make it multicast revertible.. XD Anyway, I tossed in some checks & notes related to all that unnecessary effort since I do think i'll move forward implementing it: - for the `cache_hit` case always verify that the `bcast` clone is unregistered from the common state subs after `.subscribe().__aexit__()`. - do a light check that the implicit `MsgStream._broadcaster` is always the only bcrx instance left-leaked into that state.. that is until i get the proper de-allocation/reversion from multicast -> unicast working. - put in mega detailed note about the required parent-task checkpoint.	2025-08-18 21:07:12 -04:00
Tyler Goodlet	1c425cbd22	Tool-up `test_resource_cache.test_open_local_sub_to_stream` Since I recently discovered a very subtle race-case that can sometimes cause the suite to hang, seemingly due to the `an: ActorNursery` allocated behind the `.trionics.maybe_open_context()` usage; this can result in never cancelling the 'streamer' subactor despite the `main()` timeout-guard? This led me to dig in and find that the underlying issue was 2-fold, - our `BroadcastReceiver` termination-mgmt semantics in `MsgStream.subscribe()` can result in the first subscribing task to always keep the `MsgStream._broadcaster` instance allocated; it's never `.aclose()`ed, which makes it tough to determine (and thus trace) when all subscriber-tasks are actually complete and exited-from-`.subscribe()`.. - i was shield waiting `.ipc._server.Server.wait_for_no_more_peers()` in `._runtime.async_main()`'s shutdown sequence which would then compound the issue resulting in a SIGINT-shielded hang.. the worst kind XD Actual changes here are just styling, printing, and some mucking with passing the `an`-ref up to the parent task in the root-actor where i was doing a conditional `ActorNursery.cancel()` to mk sure that was actually the problem. Presuming this is fixed the `.pause()` i left unmasked should never hit.	2025-08-18 21:07:06 -04:00
Tyler Goodlet	88c1c083bd	Add timeout to inf-streamer test	2025-08-18 13:31:15 -04:00
Tyler Goodlet	b096867d40	Remove lingering seg=False-flags from tests	2025-08-18 12:03:32 -04:00
Tyler Goodlet	0ffcea1033	Adjust `test_trio_prestarted_task_bubbles()` suite to expect non-eg raises	2025-08-18 10:46:37 -04:00
Tyler Goodlet	a7bdf0486c	Styling tweaks to quadruple streaming test fn	2025-08-18 10:46:37 -04:00
Tyler Goodlet	d2ac9ecf95	Resolve `test_cancel_while_childs_child_in_sync_sleep` Was failing due to the `.fail_after()` timeout being too short and somehow the new interplay of that with strict-exception groups resulting in the `TooSlowError` never raising but instead an eg with the embedded `AssertionError`?? I still don't really get it honestly.. I've written up lengthy notes around the different `delay` settings that can be used to see the diff outcomes, the failing case being the one i still don't really grok and think is justification for `trio` to bubble inner `Cancelled`s differently possibly? For now i've included the original failing case as an `xfail` parametrization which will hopefully drive a follow-up lowlevel `trio` test in `test_trioisms`!	2025-08-18 10:46:37 -04:00
Tyler Goodlet	dcb1062bb8	Fix cluster suite, chng to new `gather_contexts()` Namely `test_empty_mngrs_input_raises()` was failing due to lazy-iterator use as input to `mngrs` which i guess i added support for a while back (by it doing a `list(mngrs)` internally)? So just change it to `gather_contexts(mngrs=())` and also tweak the `trio.fail_after(3)` since it appears that the prior 1sec was causing too-fast-of-a-cancellation (before the cluster fully spawned) and thus the expected `ValueError` never to show.. Also, mask the `tractor.trionics.collapse_eg()` usage (again?) in `open_actor_cluster()` since it seems unnecessary.	2025-08-18 10:46:37 -04:00
Tyler Goodlet	05d865c0f1	WIP tinkering with strict-eg-tns and cluster API Seems that the way the actor-nursery interacts with the `.trionics.gather_contexts()` API on cancellation makes our `.trionics.collapse_eg()` not work as intended? I need to dig into how `ActorNursery.cancel()` and `.__aexit__()` might be causing this discrepancy.. Consider this a commit-of-my-index type save for rn.	2025-08-18 10:46:37 -04:00
Tyler Goodlet	8218f0f51f	Bit of multi-line styling / name tweaks in cancellation suites	2025-08-18 10:46:37 -04:00
Tyler Goodlet	f776c47cb4	Drop msging-err patt from `subactor_breakpoint` ex Since the `bdb` module was added to the namespace lookup set in `._exceptions.get_err_type()` we can now relay a RAE-boxed `bdb.BdbQuit`.	2025-08-18 10:46:37 -04:00
Tyler Goodlet	0ca3d50602	Use `._supervise._shutdown_msg` in tooling test	2025-08-15 16:29:05 -04:00
Tyler Goodlet	547cf5a210	Drop stale comment from inter-peer suite	2025-07-18 00:35:35 -04:00
Tyler Goodlet	4569d11052	Move `.is_multi_cancelled()` to `.trionics._beg` Since it's for beg filtering, the current impl should be renamed anyway; it's not just for filtering cancelled excs. Deats, - added a real doc string, links to official eg docs and fixed the return typing. - adjust all internal imports to match.	2025-07-16 15:49:18 -04:00
Tyler Goodlet	35977dcebb	Adjust ep-masking-suite for the real-use-case Namely that the more common-and-pertinent case is when a `@context`-ep-fn contains the `finally`-footgun but without a surrounding embedded `tn` (which currently still requires its own scope embedded `trionics.maybe_raise_from_masking_exc()`) which can't be compensated-for by `._rpc._invoke()` easily. Instead the test is composed where the `._invoke()`-internal `tn` is the machinery being addressed in terms of masking user-code excs with `trio.Cancelled`. Deats, - rename the test -> `test_unmasked_remote_exc` to reflect what the runtime should actually be addressing/solving. - drop the embedded `tn` from `sleep_n_chkpt_in_finally()` (for now) since that case can't currently easily be addressed without the user code using its own `trionics.maybe_raise_from_masking_exc()` inside the nursery scope. - as such drop all `tn` related params/logic/usage from the ep. - add in a `Cancelled` handler block which checks for RTE masking and always prints the occurrence loudly. Follow up, - obvi this suite will currently fail until the appropriate adjustment is made to `._rpc._invoke()` to do the unmasking; coming next. - we probably still need a case with an embedded user `tn` where if the default strict-eg mode is used then a ctxc from the parent might cause a non-graceful `Context.cancel()` outcome? \|_since the embedded user-`tn` will raise `ExceptionGroup[trio.Cancelled]` upward despite the parent nursery's scope being the canceller, or will a `collapse_eg()` inside the `._invoke()` scope handle this as well?	2025-07-15 07:23:21 -04:00
Tyler Goodlet	63c5b7696a	Mv `maybe_raise_from_masking_exc()` to `.trionics` Factor the `@acm`-closure out of the `test_trioisms::test_acm_embedded_nursery_propagates_enter_err` suite for real use internally.	2025-07-15 07:23:21 -04:00
Tyler Goodlet	5f94f52226	Add ctx-ep suite for `trio`'s finally-footgun Deats are documented within, but basically a subtlety we already track with `trio`'s masking of excs by a checkpoint-in-`finally` can cause compounded issues with our `@context` endpoints, mostly in terms of remote error and cancel-ack relay semantics.	2025-07-15 07:23:21 -04:00
Tyler Goodlet	9ff448faa3	Add `open_crash_handler()` / `repl_fixture` suite Nicely nailing 2 birds by leveraging the new `repl_fixture` support to actually avoid use of a `pexpect`-style test B) Functionality audit summary, - ensures `open_crash_handler() as bxerr:` adheres to, - `raise_on_exit` semantics including only raising from a list of exc-types, - `repl_fixture` setup/teardown invocation and that `yield False` blocks REPL interaction, - delivering a `BoxedMaybeException` with the correct state set post crash. - all the above outside the actor-runtime existing. Also luckily enough, this seems to have found a bug for which a fix is coming right up!	2025-07-14 17:55:18 -04:00
Tyler Goodlet	760b9890c4	Add `debugging/subactor_bp_in_ctx.py` test set It's been in the debug scripts quite a while without a wrapping test and will be, - only the 2nd such REPL test which uses a lower-level `@context` ep-API - the first official and explicit use of `enable_transports=['uds']` in a suite. Deats, - flip to 'uds' tpt and 'devx' level logging in the script. - add a new 2-case suite `test_ctxep_pauses_n_maybe_ipc_breaks` which validates both the quit-early (via `BdbQuit`) and channel-dropped-need-to-ctlc cases from a single test fn.	2025-07-14 13:15:07 -04:00
Tyler Goodlet	bbd2ea3e4f	Prevent `test_breakpoint_hook_restored` subproc hangs If the underlying example script fails (say due to a console output pattern-mismatch, `AssertionError`) the `pexpect` managed subproc with a `debug_mode=True` crash-handling-REPL engaged will ofc not terminate due to any SIGINT sent by the test harness (since we shield from it as part of normal sub-actor debugger operation). So instead always send a 'continue' cmd to the active `PdbREPL`'s stdin so it deactivates and allows the py-script-process to raise and terminate, unblocking the `pexpect.spawn`'s internal subproc joiner (which would otherwise hang without manual intervention, blocking downstream tests..). Also, use the new `PexpectSpawner` type alias after actually importing future annots.. XD	2025-07-14 00:00:13 -04:00
Tyler Goodlet	6b903f7746	Type alias our `pexpect.spawn()` closure fixture Such that we can more easily annotate any consumer tests of our `.tests.devx.conftest.spawn()` fixture which delivers a closure which, when called in a test fn body, transitively sub-invokes: `pytest.Pytester.spawn()` -> `pexpect.spawn()` IMO Expecting `Callable[[str], pexpect.pty_spawn.spawn]` to be used all over is a bit too.. verbose?	2025-07-14 00:00:13 -04:00
Tyler Goodlet	2280bad135	Type annot the `testdir` fixture	2025-07-14 00:00:13 -04:00
Tyler Goodlet	1c6660c497	Mk `.devx._debug` a sub-pkg `.devx.debug` With plans for much factoring of the original module into sub-mods! Adjust all imports and refs throughout to match.	2025-07-14 00:00:12 -04:00
Tyler Goodlet	3aee702733	Add a `debug_mode`-state reversion test	2025-07-14 00:00:12 -04:00
Tyler Goodlet	37f843a128	Add an `enable_transports` test-suite Like it sounds, verifying that when that param is passed to the runtime startup eps (`.open_root_actor()/.open_nursery()`), the appropriate tpt-protocol is deployed for IPC (both the server and bound endpoints) in both the root and any sub-actors (as passed down from rent to child via the `.msg.types.SpawnSpec`).	2025-07-13 15:26:37 -04:00
Tyler Goodlet	29cd2ddbac	Drop 'IPC' prefix from `._server` types We already have the `.ipc` sub-pkg name so it seems a bit redundant/noisy for a namespace path Bp Leave an alias for the `Server` rn since it's already used in a few other internal mods.. will likely rename later if everyone is cool with it..	2025-07-13 15:26:37 -04:00
Tyler Goodlet	295b06511b	Plugin-ize some re-usable `conftest` parts Namely any CLI driven runtime-config fixtures such as, - `--spawn-backend` and `start_method`, - `--tpdb` and `debug_mode`, - `--tpt-proto` and `tpt_protos`/`tpt_proto`, - `reg_addr` as driven by the above. This moves all fixtures and necessary hook funcs (CLI parsing, configuring and test-gen) to the `._testing.pytest` module and thus allows any dependent project to leverage these fixtures in their own test suites after pointing to that plugin mod using, ```python # conftest.py pytest_plugins: tuple[str] = ( "tractor._testing.pytest", ) ``` Also, add a new `._testing.addr` helper mod which now contains a factored `get_rando_addr()` helper for creating test-sesh unique tpt-specific registry (or other) IPC endpoint addrs.	2025-07-13 15:26:37 -04:00
Tyler Goodlet	1e6b5b3f0a	Start a very basic ipc-server unit test suite For now it just boots a server, parametrized over all tpt-protos, sin any actor runtime bootup. Obvi the future todo is ensuring it all works with a client connecting via the equivalent lowlevel `.ipc._chan._connect_chan()` API(s).	2025-07-13 15:26:37 -04:00
Tyler Goodlet	36ddb85197	Fix assert on `.devx.maybe_open_crash_handler()` delivered `bxerr`	2025-07-13 15:26:37 -04:00
Tyler Goodlet	d6b0ddecd7	Improve bit of tooling for `test_resource_cache.py` Namely, what I was actually trying to solve was why `TransportClosed` was getting raised from `Portal.cancel_actor()`, but it's still useful edge case auditing either way. Also opts into the `debug_mode` fixture with apprope timeout adjustment B)	2025-07-13 15:26:37 -04:00
Tyler Goodlet	9e5475391c	Set `_state._def_tpt_proto` in `tpt_proto` fixture Such that the global test-session always (and only) runs against the CLI specified `--tpt-proto=` transport protocol.	2025-07-13 15:26:37 -04:00
Tyler Goodlet	ef7ed7ac6f	Handle unconsidered fault-edge cases for UDS In `tests/test_advanced_faults.py` that is. Since instead of zero-responses like we'd expect from a network-socket we actually can get a few differences from the OS when "everything IPC is known" XD Namely it's about underlying `trio` exceptions versus how we wrap them and how we expect to box them. A `TransportClosed` boxing improvement is coming in follow up btw to make this all work! B)	2025-07-13 15:26:37 -04:00
Tyler Goodlet	d8094f4420	Woops, ensure we use `global` before setting `daemon()` fixture spawn delay..	2025-07-13 15:26:37 -04:00
Tyler Goodlet	d7b12735a8	Support multiple IPC transports in test harness! Via a new accumulative `--tpt-proto` arg you can select which `tpt_protos: list[str]`-fixture protocol keys will be delivered to opting in tests! B) Also includes, - CLI quote handling/stripping. - default of 'tcp'. - only support one selection per session at the moment (until we figure out how we want to support multiples, either simultaneously or sequentially). - draft a (masked) dynamic-`metafunc` parametrization in the `pytest_generate_tests()` hook. - first proven and working use in the `test_advanced_faults`-suite (and thus its underlying `examples/advanced_faults/ipc_failure_during_stream.py` script)! \|_ actually needed this to prove that the suite only has 2 failures on 'uds' seemingly due to low-level `trio` error semantics translation differences to do with calling `socket.close()`.. On a very nearly related topic, - draft an (also commented out) `set_script_runtime_args()` fixture idea for a std way of `partial`-ling in runtime args to `examples/` scripts-as-modules defining a `main()` which would proxy to `tractor.open_nursery()`.	2025-07-13 15:26:37 -04:00
Tyler Goodlet	47107e44ed	Start prototyping multi-transport testing Such that we can run (opting-in) tests on both TCP and UDS backends and ensure the `reg_addr` fixture and various timeouts are adjusted accordingly. Impl deats, - add a new `tpt_proto` CLI option and fixture to allow choosing which "transport protocol" will be used in the test suites (either globally or contextually). - rm `_reg_addr` instead opting for a `_rando_port` which will only be used for `reg_addr`s which are net-tpt-protos. - rejig `reg_addr` fixture to set an ideally session-unique `testrun_reg_addr` based on the `tpt_proto` setting making appropriate calls to `._addr` APIs as needed. - refine `daemon` fixture a bit with typing, `tpt_proto` timings, and stderr capture. - in `test_discovery` do a ton of type-annots, add `debug_mode` fixture opt ins, augment `spawn_and_check_registry()` with `psutil.Process` passing for introspection (when things go wrong..).	2025-07-13 15:26:37 -04:00
Tyler Goodlet	f67b0639b8	Move peer-tracking attrs from `Actor` -> `IPCServer` Namely transferring the `Actor` peer-`Channel` tracking attrs, - `._peers` which maps the uids to client channels (with duplicates apparently..) - the `._peer_connected: dict[tuple[str, str], trio.Event]` child-peer syncing table mostly used by parent actors to wait on sub's to connect back during spawn. - the `._no_more_peers = trio.Event()` level triggered state signal. Further we move over with some minor reworks, - `.wait_for_peer()` verbatim (adjusting all dependants). - factor the no-more-peers shielded wait branch-block out of the end of `async_main()` into 2 new server meths, * `.has_peers()` with optional chan-connected checking flag. * `.wait_for_no_more_peers()` which just does the maybe-shielded `._no_more_peers.wait()`	2025-07-08 18:05:05 -04:00
Tyler Goodlet	61df10b333	Move concrete `Address`es to each tpt module That is moving from `._addr`, - `TCPAddress` to `.ipc._tcp` - `UDSAddress` to `.ipc._uds` Obviously this requires adjusting a buncha stuff in `._addr` to avoid import cycles (the original reason the module was not also included in the new `.ipc` subpkg) including, - avoiding "unnecessary" imports of `[Unwrapped]Address` in various modules. * since `Address` is a protocol and the main point is that it does not need to be inherited per (https://typing.python.org/en/latest/spec/protocol.html#terminology) thus I removed the need for it in both transport submods. * and `UnwrappedAddress` is a type alias for tuples.. so we don't really always need to be importing it since it also kinda obfuscates what the underlying pairs are. - not exporting everything in submods at the `.ipc` top level and importing from specific submods by default. - only importing various types under a `if typing.TYPE_CHECKING:` guard as needed.	2025-07-08 18:05:05 -04:00
Tyler Goodlet	ba45c03e14	Skip the ringbuf test mod for now since data-gen is a bit "heavy/laggy" atm	2025-07-08 18:05:05 -04:00
Tyler Goodlet	708ce4a051	Repair weird spawn test, start `test_root_runtime` There was a very strange legacy test `test_spawning.test_local_arbiter_subactor_global_state` which was causing unforeseen hangs/errors on the UDS tpt and looking deeper this test was already doing root-actor things that should never have been valid XD So rework that test to properly demonstrate something of value (i guess..) and add a new suite which starts more rigorously auditing our `open_root_actor()` permitted usage. For the old test, - since the main point of this test seemed to be the ability to invoke the same function in both the parent and child actor (using the very legacy `ActorNursery.run_in_actor()`.. due to be deprecated) rename it to `test_run_in_actor_same_func_in_child`, - don't re-enter `.open_root_actor()` since that's invalid usage (tested in new suite see below), - adjust some `spawn()` arg/var naming and ensure we only return in the child. For the new suite add tests for, - ensuring the implicit `open_root_actor()` call under `open_nursery()`. - double open of `open_root_actor()` from within the same process tree both from a root and sub. Intro some new `_exceptions` used in the new suite, - a top level `RuntimeFailure` for generically expressing faults not of our own doing that prevent successful operation; this is what we now (changed in this commit) raise on attempts to open a 2nd root. - mk `ActorFailure` derive from the former; it's already used from `._spawn` when subprocs fail to boot.	2025-07-08 18:05:04 -04:00
Guillermo Rodriguez	f67e19a852	Trying to make full suite pass with uds	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	0be9f5f907	Finally switch to using address protocol in all runtime	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	26fef82d33	Add buf_size to RBToken and add sender cancel test, move disable_mantracker to its own _mp_bs module	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	84d25b5727	Make ring buf api use pickle-able RBToken	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	2dd3a682c8	Handle cancelation on EventFD.read	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	af69272d16	Move linux specifics from tractor.ipc._shm into tractor.ipc._linux	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	8e3f581d3f	Move tractor._shm to tractor.ipc._shm	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	9921ea3cae	General improvements EventFD class now expects the fd to already be init with open_eventfd RingBuff Sender and Receiver fully manage SharedMemory and EventFD lifecycles, no additional ctx mngrs needed Separate ring buf tests into its own test bed Add parametrization to test and cancellation Add docstrings Add simple testing data gen module .samples	2025-07-08 12:57:28 -04:00
Guillermo Rodriguez	414a8c5b75	IPC ring buf impl with async read	2025-07-08 12:57:28 -04:00
Tyler Goodlet	8ebb1f09de	Pass `str` dtype for `use_str` case	2025-03-27 17:54:04 -04:00
Tyler Goodlet	9a0d529b18	Parametrize rw test with variable frame sizes Demonstrates fixed size frame-oriented reads by the child where the parent only transmits a "read" stream msg on "frame fill events" such that the child incrementally reads the shm list data (much like in a real-time-buffered streaming system).	2025-03-27 17:54:04 -04:00
Tyler Goodlet	c932bb5911	Add repetitive attach to existing segment test	2025-03-27 17:54:04 -04:00
Tyler Goodlet	33482d8f41	Add initial readers-writer shm list tests	2025-03-27 17:54:04 -04:00
Tyler Goodlet	8ba315e60c	Rename ext-types with `msgspec` suite module	2025-03-27 15:58:03 -04:00
Tyler Goodlet	80f20b35b1	Complete rename to parent->child IPC ctx peers Now changed in all comments docs and test-code content such that we aren't using the "caller"->"callee" semantics anymore.	2025-03-27 15:58:02 -04:00
Tyler Goodlet	9be76b1dda	Extend ctx semantics suite for streaming edge cases! Muchas grax to @guilledk for finding the first issue which kicked off this further scrutiny of the `tractor.Context` and `MsgStream` semantics test suite with a strange edge case where, - if the parent opened and immediately closed a stream while the remote child task started and continued (without terminating) to send msgs the parent's `open_context().__aexit__()` would not block on the child to complete! => this was seemingly due to a bug discovered inside the `.msg._ops.drain_to_final_msg()` stream handling case logic where we are NOT checking if `Context._stream` is non-`None`! As such this, - extends the `test_caller_closes_ctx_after_callee_opens_stream` (now renamed, see below) to include cases for all combinations of the child and parent sending before receiving on the stream as well as all placements of `Context.cancel()` in the parent before, around and after the stream open. - uses the new `expect_ctxc()` for expecting the taskc (`trio.Task` cancelled) cases. - also extends the `test_callee_closes_ctx_after_stream_open` (also renamed) to include the case where the parent sends a msg before it receives. => this case has unveiled yet-another-bug where somehow the underlying `MsgStream._rx_chan: trio.ReceiveMemoryChannel` is allowing the child's `Return[None]` msg to be consumed and NOT in a place where it is correctly set as `Context._result` resulting in the parent hanging forever inside `._ops.drain_to_final_msg()`.. Alongside, - start renaming using the new "remote-task-peer-side" semantics throughout the test module: "caller" -> "parent", "callee" -> "child".	2025-03-27 15:58:02 -04:00
Tyler Goodlet	a810f6c8f6	Facepalm, fix logic misstep on child side Namely that `add_hooks: bool` should be the same as on the rent side.. Also, just drop the now unused `iter_maybe_sends`. This makes the suite entire greeeeen btw, including the new sub-suite which i hadn't runt before Bo	2025-03-27 15:58:02 -04:00
Tyler Goodlet	83b9dc3c62	Rework IPC-using `test_caps_based_msging` tests Namely renaming and massively simplifying it to a new `test_ext_types_over_ipc` which avoids all the wacky "parent dictates what sender should be able to send beforehand".. Instead keep it simple and just always try to send the same small set of types over the wire with expect-logic to handle each case, - use the new `dec_hook`/`ext_types` args to `mk_[co]dec()` routines for pld-spec ipc transport. - always try to stream a small set of types from the child with logic to handle the cases expected to error. Other, - draft a `test_pld_limiting_usage` to check runtime raising of bad API usage; haven't run it yet tho. - move `test_custom_extension_types` to top of mod so that the `enc/dec_nsp()` hooks can be reffed from test parametrizations. - comment out (and maybe remove) the old routines for `iter_maybe_sends`, `test_limit_msgspec`, `chk_pld_type`. XXX TODO, turns out the 2 failing cases from this suite have exposed an actual bug with `MsgTypeError` unpacking where the `ipc_msg=` input is being set to `None` ?? -> see the comment at the bottom of `._exceptions._mk_recv_mte()` which seems to describe the likely culprit?	2025-03-27 15:58:02 -04:00
Tyler Goodlet	d4e6f2b8dc	Move `Union` serializers to new `msg.` mod Namely moving `enc/dec_type_union()` from the test mod to a new `tractor.msg._exts` for general use outside the test suite.	2025-03-27 15:58:02 -04:00
Tyler Goodlet	64fe767647	Finally get type-extended `msgspec` fields workinn By using our new `PldRx` design we can, - pass through the pld-spec & a `dec_hook()` to our `MsgDec` which is used to configure the underlying `.dec: msgspec.msgpack.Decoder` - pass through an `enc_hook()` to `mk_codec()` and use it to conf the equiv `MsgCodec.enc` such that sent msg-plds are converted prior to transport. The trick ended up being just to always union the `mk_dec()` extension-types spec with the normal `msgspec.Raw` pld-spec such that the `dec_hook()` is only invoked for payload types tagged by the encoder/sender side B) A variety of impl tweaks to make it all happen as well as various cleanups in the `.msg._codec` mod include, - `mk_dec()` no default `spec` arg, better doc string, accept the new `ext_types` arg, doing the union of that with `msgspec.Raw`. - proto-ed a now unused `mk_boxed_ext_struct()` which will likely get removed since it ended up that our `PayloadMsg` structs already cover the ext-type-hook requirement that the decoder is passed a `.type=msgspec.Struct` of some sort in order for `.dec_hook` to be used. - add an `unpack_spec_types()` util fn for getting the `set[Type]` from a `Union[Type]` annotation instance. - mk the default `mk_codec(pc_pld_spec = Raw,)` since the `PldRx` design was already passing/overriding it and it doesn't make much sense to use `Any` anymore for the same reason; it will cause various `Context` apis to now break. \|_ also accept an `enc_hook()` and `ext_types` which are used to maybe config the `.msgpack.Encoder` - generally tweak a bunch of comments-as-docs and todos namely the ones that are completed after the pld-rx design was implemented. Also, - mask the non-functioning `'defstruct'` approach inside `.msg.types.mk_msg_spec()` to prep for its removal.
Adjust the test suite (rn called `test_caps_based_msging`), - add a new suite `test_custom_extension_types` and move and use the `enc/dec_nsp()` hooks to the mod level for its use. - prolly planning to drop the `test_limit_msgspec` suite since it's mostly replaced by the `test_pldrx_limiting` mod's version? - originally was tweaking a bunch in `test_codec_hooks_mod` but likely it will get mostly rewritten to be simpler and simply verify that ext-typed fields can be used over IPC `Context`s between actors (as originally intended for this sub-suite).	2025-03-27 15:58:02 -04:00
Tyler Goodlet	a26f817ed1	Another loosie in the trioisms suite	2025-03-27 13:38:47 -04:00
Tyler Goodlet	e815dcd3c8	Use `collapse_eg()` in broadcaster suite Around the test embedded `trio.open_nursery()` calls as expected. Also tidy up the various nursery var names.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	3ad558230a	Fix docs tests with yet another loosie-goosie So the KBI propagates up to the actor nursery scope and also avoid running any `examples/multihost/` subdir scripts.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	22f405a707	Another couple loose-ifies for discovery and advanced fault suites	2025-03-27 13:38:47 -04:00
Tyler Goodlet	e5bcefb575	Add (masked) meta-debug-fixture for determining if `debug_mode` is set in harness..	2025-03-27 13:38:47 -04:00
Tyler Goodlet	8f7c022afe	Various test tweaks related to 3.13 egs Including changes like, - loose eg flagging in various test embedded `trio.open_nursery()`s. - changes to eg handling (like using `except*`). - added `debug_mode` integration to tests that needed some REPLin in order to figure out appropriate updates.	2025-03-27 13:38:47 -04:00
Tyler Goodlet	8f774f52b1	Another loose-egs flag in `test_child_manages_service_nursery`	2025-03-27 13:38:47 -04:00
Tyler Goodlet	b021772a1e	Mask ctlc borked REPL tests Namely the `tractor.pause_from_sync()` examples using both bg threads and `asyncio` which seem to go into bad states where SIGINT is ignored.. Deats, - add `maybe_expect_timeout()` cm to ensure the EOF hangs get `.xfail()`ed instead. - `@pytest.mark.ctlcs_bish` `test_pause_from_sync` and don't expect the greenback prompt msg. - also mark `test_sync_pause_from_aio_task`.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	03406e020c	Repair/update `stackscope` test Seems that on 3.13 it's not showing our script code in the output now? Gotta get an example for @oremanj to see what's up but really it'd be nice to just custom format stuff above `trio`'s runtime by def.. Anyway, update the `.devx._stackscope`, - log formatting to be a little more "sclangy" lookin. - change the per-actor "delimiter" lines style. - report the `signal.getsignal(SIGINT)` which i needed in the `sync_bp.py` with ctl-c causing a hang.. - mask the `_tree_dumped` duplicator log report as well as the "dumped fine" one. - add an example `pkill --signal SIGUSR1` cmdline. Tweak the test to cope with, - not showing our script lines now.. which i've commented in the `assert_before()` patts.. - to expect the newly formatted delimiter (ascii) lines to separate the root vs. hanger sub-actor sections.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	b0acc9ffe8	Add a mark to `pytest.xfail()` questionably conc py stuff (ur mam `.xfail()`s bish!)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	fc325a621b	Be extra sure to re-raise EoCs from translator That is whenever `trio.EndOfChannel` is raised (presumably from the `._to_trio.receive()` call inside `LinkedTaskChannel.receive()`) we need to be extra certain that we let it bubble upward transparently DESPITE special exc-as-signal handling that is normally suppressed from the aio side; REPEAT we want to ALWAYS bubble any `trio_err == trio.EndOfChannel` in the `finally:` handler of `translate_aio_errors()` despite `chan._trio_to_raise == AsyncioTaskExited` such that the caller's iterable machinery will operate as normal when the inter-task stream is stopped (again, presumably by the aio side task terminating the inter-task stream). Main impl deats for this, - in the EoC handler block ensure we assign both `chan._trio_err` and the local `trio_err` as well as continue to re-raise. - add a case to the match block in the `finally:` handler which FOR SURE re-raises any `type(trio_err) is EndOfChannel`! Additionally fix a bad bug, - a ref bug where we were NOT using the `except BaseException as _trio_err` to assign to `chan._trio_err` (by accident was missing the leading `_`..) Unrelated impl tweak, - move all `maybe_raise_aio_side_err()` content back to inline with its parent func - makes it easier to use `tractor.pause()` mostly Bp - go back to trying to use `aio_task.set_exception(aio_taskc)` for now even though i'm pretty sure we're going to move to a try-fute-first style helper for this in the future. Adjust some tests to match/mk-them-green, - break from `aio_echo_server()` recv loop on `to_asyncio.TrioTaskExited` much like how you'd expect to (implicitly with a `for`) with a `trio.EndOfChannel`. - toss in a masked `value is None` pause point i needed for debugging inf looping caused by not re-raising EoCs per the main patch description. - add a debug-mode sized delay to root-infected test.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	d5ba9be3a9	More `debug_mode` test support, better nursery var names	2025-03-27 13:24:25 -04:00
Tyler Goodlet	639186aa37	Add per-side graceful-exit/cancel excs-as-signals Such that any combination of task terminations/exits can be explicitly handled and "dual side independent" crash cases re-raised in egs. The main error-or-exit impl changes include, - use of new per-side "signaling exceptions": - TrioTaskExited\|TrioCancelled for signalling aio. - AsyncioTaskExited\|AsyncioCancelled for signalling trio. - NOT overloading the `LinkedTaskChannel._trio/aio_err` fields for err-as-signal relay and instead add a new pair of `._trio/aio_to_raise` maybe-exc-attrs which allow each side's task to specify what it would want the other side to raise to signal its/a termination outcome: - `._trio_to_raise: AsyncioTaskExited\|AsyncioCancelled` to signal, \|_ the aio task having returned while the trio side was still reading from the `asyncio.Queue` or is just not `.done()`. \|_ the aio task being self or trio-request cancelled where a `asyncio.CancelledError` is raised and caught but NOT relayed as is back to trio; instead signal a "more explicit" exc type. - `._aio_to_raise: TrioTaskExited\|TrioCancelled` to signal, \|_ the trio task having returned while the aio side was still reading from the mem chan and indicating that the trio side might not care any more about future streamed values (like the `Stop/EndOfChannel` equivs for ipc `Context`s). \|_ when the trio task is cancelled we do an `asyncio.Future.set_exception(TrioTaskExited())` to indicate to the aio side verbosely that it should cancel due to the trio parent. - `_aio/trio_err` are now left to only capturing the actual per-side task excs for introspection / other side's handling logic. - supporting "graceful exits" depending on API in use from `translate_aio_errors()` such that if either side exits but the other side isn't expected to consume the final `return`ed value, we just exit silently, which required: - adding a `suppress_graceful_exits: bool` flag.
- adjusting the `maybe_raise_aio_side_err()` logic to use that flag and suppress only on certain combos of `._trio_to_raise/._trio_err`. - prefer to raise `._trio_to_raise` when the aio-side is the src and vice versa. - filling out pedantic logging for cancellation cases indicating which side is the cause. - add a `LinkedTaskChannel._aio_result` modelled after our `Context._result` and a similar `.wait_for_result()` interface which allows maybe accessing the aio task's final return value if desired when using the `open_channel_from()` API. - rename `cancel_trio()` done handler -> `signal_trio_when_done()` Also some fairly major test suite updates, - add a `delay: int` producing fixture which delivers a much larger timeout whenever `debug_mode` is set so that the REPL can be used without a surrounding cancel firing. - add a new `test_aio_exits_early_relays_AsyncioTaskExited` including a paired `exit_early: bool` flag to `push_from_aio_task()`. - adjust `test_trio_closes_early_causes_aio_checkpoint_raise` to expect a `to_asyncio.TrioTaskExited`.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	182218a776	Another `is` fix..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	6de17a3949	Unset `$PYTHON_COLORS` for test debugger suite.. Since obvi all our `pexpect` patterns aren't going to match with a heck-ton of terminal color escape sequences in the output XD	2025-03-27 13:24:25 -04:00
Tyler Goodlet	41a3297b9f	Tweak some test asserts to better `is` style	2025-03-27 13:24:25 -04:00
Tyler Goodlet	255db4b127	Save an MIA `breakpoint()`-restore test from prior!? It appears that during the reorg commit `a356233b47` this was intended to be moved (presumably where i have here) to `test_tooling` but was somehow just never pasted over XD Good thing this was caught while going through the remaining TODO bullets in #2 !! Also includes fixed relative `.conftest` imports!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	66a7d660f6	Draft test-doc for "out-of-band" `asyncio.Task`.. Since there's no way to activate `greenback`'s portal in such cases, we should at least have a test verifying our very loud error about the inability to support this usage..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	9b393338ca	Add a `tests/test_root_infect_asyncio` Might as well break apart the specific test set since there are some (minor) subtleties and the orig test mod is already getting pretty big XD Includes both the new "independent"-event-loops test as well as the std usage base case suite.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	4edf36a895	Impl a proto "unmasker" `@acm` alongside our test Such that the suite verifies the wip `maybe_raise_from_masking_exc()` will raise from a `trio.Cancelled.__context__` since I can't think of any reason a `Cancelled` should ever be raised in-place of a non-`Cancelled` XD Not sure what should be raised instead (or maybe just a `log.warning()` emitted?) but this starts a draft for refinement at the least. Use the new `@pytest.mark.parametrize` explicit tuple-of-params form with a `pytest.param` + `.mark.xfail()` for the default behaviour case.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	bfd1864180	Add a "raise-from-`finally:`" example test Since i wasted 2 days just to find an example of this inside an `@acm`, figured I better reproduce for the purposes of maybe implementing a warning sys (inside our wip proto `open_taskman()`) when a nursery detects a single `Cancelled` in an eg where the `.__context__` is set to some non-cancel error (which likely means a cancel-causing source exception was suppressed by accident). Left in a buncha commented code using `maybe_open_nursery()` which i thought might be part of the issue but didn't end up being required; will likely remove on a follow up refinement.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	3c8b1aa888	Add an inter-leaved-task error test Trying to replicate cases where errors are raised in both `trio` and `asyncio` tasks independently (at least in `.to_asyncio` API terms) with a new `test_trio_prestarted_task_bubbles` that generates 3 cases inside a `@acm` calls stack composing a `trio.Nursery` with a `to_asyncio.open_channel_from()` call where a set of `trio` tasks are started in a loop using `.start()` with various exc raising sequences, - the aio task raising before the last `trio` task spawns. - the aio task raising just after the last trio task spawns, but before it starts. - after the last trio task `.start()` call returns control to the parent - but (for now) did not error. TODO, still more cases to discover as i'm still fighting a `modden` bug of this sort atm.. Other, - tweak some other tests to have timeouts since some recent hangs were found.. - started mucking with py3.13 and thus adjustments for strict egs in some tests; full patchset to test suite likely coming soon!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a283d8c05a	Support and test infected-`asyncio`-mode for root Such that you can use, ```python tractor.to_asyncio.run_as_asyncio_guest( trio_main=_trio_main, ) ``` to bootstrap the root actor (and thus main parent process) to embed the actor-runtime into an `asyncio` loop. Prove it all works with a subactor-free version of the aio echo-server test suite B)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a58c1cad91	Change `tractor.breakpoint()` to new `.pause()` in test suite	2025-03-27 13:24:25 -04:00
Tyler Goodlet	e1d96099fc	Wrap `asyncio_bp.py` ex into test suite Ensuring we can at least use `breakpoint()` from an infected actor's `asyncio.Task` spawned via a `.to_asyncio` API. Also includes a little `tests/devx/` reorging, - start splitting out non-`tractor.pause()` tests into a new `test_pause_from_non_trio.py` for all the `.pause_from_sync()` use in bg-threaded or `asyncio` applications. - factor harness commonalities to the `devx/conftest` (namely the `do_ctlc()` masher). - mv `test_pause_from_sync` to the new non-`trio` mod. NOTE, the `ctlc=True` is still failing for `test_pause_from_asyncio_task` which is a user-happiness bug but not anything fundamentally broken - just need to handle the `asyncio` case in `.devx._debug.sigint_shield()`!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	ccd60b0c6e	Add `breakpoint()` hook restoration example + test	2025-03-27 13:24:25 -04:00
Tyler Goodlet	00d1c8ea29	Fix multi-daemon debug test `break` signal.. It was expecting `AssertionError` as a proceed-in-test signal (by breaking from a continue loop), but `in_prompt_msg(raise_on_err=True)` was changed to raise `ValueError`; so instead just use as a predicate for the `break`. Also rework `in_prompt_msg()` to accept the `child: BaseSpawn` as input instead of `before: str` remove the casting boilerplate, and adjust all usage to match.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	5cdfee3bcf	Pass `infect_asyncio` setting via runtime-vars The reason for this "duplication" with the `--asyncio` CLI flag (passed to the child during spawn) is 2-fold: - allows verifying inside `Actor._from_parent()` that the `trio` runtime was started via `.start_guest_run()` as well as if the `Actor._infected_aio` spawn-entrypoint value has been set (by the `._entry.<spawn-backend>_main()` whenever `--asyncio` is passed) such that any mismatch can be signaled via an `InternalError`. - enables checking the `._state._runtime_vars['_is_infected_aio']` value directly (say from a non-actor/`trio`-thread) instead of calling `._state.current_actor(err_on_no_runtime=False)` in certain edge cases. Impl/testing deats: - add `._state._runtime_vars['_is_infected_aio'] = False` default. - raise `InternalError` on any `--asyncio`-flag-passed vs. `_runtime_vars`-value-relayed-from-parent inside `Actor._from_parent()` and include a `Runner.is_guest` assert for good measure B) - set and relay `infect_asyncio: bool` via runtime-vars to child in `ActorNursery.start_actor()`. - verify `actor.is_infected_aio()`, `actor._infected_aio` and `_state._runtime_vars['_is_infected_aio']` are all set in test suite's `asyncio_actor()` endpoint.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	64d506970a	Officially test proto-ed `stackscope` integration By re-purposing our `pexpect`-based console matching with a new `debugging/shield_hang_in_sub.py` example, this tests a few "hanging actor" conditions more formally: - that despite a hanging actor's task we can dump a `stackscope.extract()` tree on relay of `SIGUSR1`. - the actor tree will terminate despite a shielded forever-sleep by our "T-800" zombie reaper machinery activating and hard killing the underlying subprocess. Some test deats: - simulates the expected actions of a real user by manually using `os.kill()` to send both signals to the actor-tree program. - `pexpect`-matches against `log.devx()` emissions under normal `debug_mode == True` usage. - ensure we get the actual "T-800 deployed" `log.error()` msg and that the actor tree eventually terminates! Surrounding (re-org/impl/test-suite) changes: - allow disabling usage via a `maybe_enable_greenback: bool` to `open_root_actor()` but enable by def. - pretty up the actual `.devx()` content from `.devx._stackscope` including be extra pedantic about the conc-primitives for each signal event. - try to avoid double handles of `SIGUSR1` even though it seems the original (what i thought was a) problem was actually just double logging in the handler.. \|_ avoid double applying the handler func via `signal.signal()`, \|_ use a global to avoid double handle func calls and, \|_ a `threading.RLock` around handling. - move common fixtures and helper routines from `test_debugger` to `tests/devx/conftest.py` and import them for use in both test mods.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	de7b114303	Start a new `tests/devx/` tooling-subsuite-pkg	2025-03-27 13:24:25 -04:00
Tyler Goodlet	f195c5ec47	Move `mk_cmd()` to `._testing` Since we're going to need it more generally for `.devx` sub-sys tooling tests. Also, up the sync-pause ctl-c delay another 10ms..	2025-03-27 13:24:25 -04:00
Tyler Goodlet	92713af63e	Get multi-threaded sync-pausing fully workin! The final issue was making sure we do the same thing on ctl-c/SIGINT from the user. That is, if there's already a bg-thread in REPL, we `log.pdb()` about SIGINT shielding and re-draw the prompt; the same UX as normal actor-runtime-task behaviour. Reasons this wasn't workin.. and the fix: - `.pause_from_sync()` was overriding the local `repl` var with `None` delivered by (transitive) calls to `_pause(debug_func=None)`.. so remove all that and only assign it OAOO prior to thread-type case branching. - always call `DebugStatus.shield_sigint()` as needed from all requesting threads/tasks: - in `_pause_from_bg_root_thread()` BEFORE calling `._pause()` AND BEFORE yielding back to the bg-thread via `.started(out)` to ensure we're definitely overriding the handler in the `trio`-main-thread task before unblocking the requesting bg-thread. - from any requesting bg-thread in the root actor such that both its main-`trio`-thread scheduled task (as per above bullet) AND it are SIGINT shielded. - always call `.shield_sigint()` BEFORE any `greenback._await()` case don't entirely grok why yet, but it works)? - for `greenback._await()` case always set `bg_task` to the current one.. - tweaks to the `SIGINT` handler, now renamed `sigint_shield()` so as not to name-collide with the methods when editor-searching: - always try to `repr()` the REPL thread/task "owner" as well as the active `PdbREPL` instance. - add `.devx()` notes around the prompt flushing deats and comments for any root-actor-bg-thread edge cases. Related/supporting refinements: - add `get_lock()`/`get_debug_req()` factory funcs since the plan is to eventually implement both as `@singleton` instances per actor. - fix `acquire_debug_lock()`'s call-sig-bug for scheduling `request_root_stdio_lock()`.. - in `._pause()` only call `mk_pdb()` when `debug_func != None`. 
- add some todo/warning notes around the `cls.repl = None` in `DebugStatus.release()` `test_pause_from_sync()` tweaks: - don't use an `attach_patts.copy()`, since we always `break` on match. - do `pytest.fail()` on that ^ loop's fallthrough.. - pass `do_ctlc(child, patt=attach_key)` such that we always match the current thread's name with the ctl-c triggered `.pdb()` emission. - oh yeah, return the last `before: str` from `do_ctlc()`. - in the script, flip `abandon_on_cancel=True` since when `False` it seems to cause `trio.run()` to hang on exit from the last bg-thread case?!?	2025-03-27 13:24:25 -04:00
Tyler Goodlet	b057a1681c	Todo a test for sync-pausing from non-main-root-tasks	2025-03-27 13:24:25 -04:00
Tyler Goodlet	53409f2942	Demo-abandonment on shielded `trio`-side work Finally this reproduces the issue as it (originally?) exhibited inside `piker` where the `Actor.lifetime_stack` wasn't closed in cases where during `infected_aio`-actor cancellation/shutdown `trio` side tasks which are doing shielded (teardown) work are NOT being watched/waited on from the `aio_main()` task-closure inside `run_as_asyncio_guest()`! This is then the root cause of the guest-run being abandoned since if our `aio_main()` task-closure doesn't know it should allow the run to finish, it's going to call `loop.close()` eventually resulting in the `GeneratorExit` thrown into `trio._core._run.unrolled_run()`.. So, this extends the `test_sigint_closes_lifetime_stack()` suite to include cases for such shielded `trio`-task ops: - add a new `trio_side_is_shielded: bool` which will toggle whether to add a shielded 0.5s `trio.sleep()` loop to `manage_file()` which should outlive the `asyncio` event-loop shutdown sequence and result in an abandoned guest-run and thus a leaked file. - parametrize the existing suite with this case resulting in a total 16 test set B) This patch demonstrates the problem with our `aio_main()` task-closure impl via the now 4 failing tests, a fix is coming in a follow up commit!	2025-03-27 13:24:25 -04:00
Tyler Goodlet	7f00921be1	Lel, revert `AsyncioCancelled` inherit, module.. Turns out it somehow breaks our `to_asyncio` error relay since obvi `asyncio`'s runtime seems to specially handle it (prolly via `isinstance()` ?) and it caused our `test_aio_cancelled_from_aio_causes_trio_cancelled()` to hang.. Further, obvi `unpack_error()` won't be able to find the type def if not kept inside `._exceptions`.. So given all that, revert the change/move as well as: - tweak the aio-from-aio cancel test to timeout. - do `trio.sleep()` conc with any bg aio task by moving out nursery block. - add a `send_sigint_to: str` parameter to `test_sigint_closes_lifetime_stack()` such that we test the SIGINT being relayed to just the parent or the child.	2025-03-27 13:24:25 -04:00
Tyler Goodlet	a9b3336318	Hack `asyncio` to not abandon a guest-mode run? Took me a while to figure out what the heck was going on but, turns out `asyncio` changed their SIGINT handling in 3.11 as per: https://docs.python.org/3/library/asyncio-runner.html#handling-keyboard-interruption I'm not entirely sure if it's the 3.11 changes or possibly wtv further updates were made in 3.12 but more or less due to the way our current main task was written the `trio` guest-run was getting abandoned on SIGINTs sent from the OS to the infected child proc.. Note that much of the bug and soln cases are layed out in very detailed comment-notes both in the new test and `run_as_asyncio_guest()`, right above the final "fix" lines. Add new `test_infected_aio.test_sigint_closes_lifetime_stack()` test suite which reliably triggers all abandonment issues with multiple cases of different parent behaviour post-sending-SIGINT-to-child: 1. briefly sleep then raise a KBI in the parent which was originally demonstrating the file leak not being cleaned up by `Actor.lifetime_stack.close()` and simulates a ctl-c from the console (relayed in tandem by the OS to the parent and child processes). 2. do `Context.wait_for_result()` on the child context which would hang and timeout since the actor runtime would never complete and thus never relay a `ContextCancelled`. 3. both with and without running a `asyncio` task in the `manage_file` child actor; originally it seemed that with an aio task scheduled in the child actor the guest-run abandonment always was the "loud" case where there seemed to be some actor teardown but with tbs from python failing to gracefully exit the `trio` runtime.. The (seemingly working) "fix" required 2 lines of code to be run inside a `asyncio.CancelledError` handler around the call to `await trio_done_fut`: - `Actor.cancel_soon()` which schedules the actor runtime to cancel on the next `trio` runner cycle and results in a "self cancellation" of the actor. 
- "pumping the `asyncio` event loop" with a non-0 `.sleep(0.1)` XD \|_ seems that a "shielded" pump with some actual `delay: float >= 0` did the trick to get `asyncio` to allow the `trio` runner/loop to fully complete its guest-run without abandonment. Other supporting changes: - move `._exceptions.AsyncioCancelled`, our renamed `asyncio.CancelledError` error-sub-type-wrapper, to `.to_asyncio` and make it derive from `CancelledError` so as to be sure when raised by our `asyncio` x-> `trio` exception relay machinery that `asyncio` is getting the specific type it expects during cancellation. - do "summary status" style logging in `run_as_asyncio_guest()` wherein we compile the eventual `startup_msg: str` emitted just before waiting on the `trio_done_fut`. - shield-wait with `out: Outcome = await asyncio.shield(trio_done_fut)` even though it seems to do nothing in the SIGINT handling case..(I presume it might help avoid abandonment in a `asyncio.Task.cancel()` case maybe?)	2025-03-27 13:24:25 -04:00
Tyler Goodlet	d1b4d4be52	Adjusts advanced fault tests to match new `TransportClosed` semantics	2025-03-24 14:04:52 -04:00
Tyler Goodlet	32f7742e53	Finally implement peer-lookup optimization.. There's been a todo for soo long for this XD Since all `Actor`'s store a set of `._peers` we can try a lookup on that table as a shortcut before pinging the registry Bo Impl deats: - add a new `._discovery.get_peer_by_name()` routine which attempts the `._peers` lookup by combining a copy of that `dict` + an entry added for `Actor._parent_chan` (since all subs have a parent and often the desired contact is just that connection). - change `.find_actor()` (for the `only_first == True` case), `.query_actor()` and `.wait_for_actor()` to call the new helper and deliver appropriate outputs if possible. Other, - deprecate `get_arbiter()` def and all usage in tests and examples. - drop lingering use of `arbiter_sockaddr` arg to various routines. - tweak the `Actor` doc str as well as some code fmting and a tweak to the `._stream_handler()`'s initial `con_status: str` logging value since the way it was could never be reached.. oh and `.warning()` on any new connections which already have a `_pre_chan: Channel` entry in `._peers` so we can start minimizing IPC duplications.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	0332604044	(Re)type annot some tests - For the (still not finished) `test_caps_based_msging`, switch to using the new `PayloadMsg`. - add `testdir` fixture type.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	c7f153c266	Update `MsgTypeError` content matching to latest	2025-03-24 14:04:52 -04:00
Tyler Goodlet	89c2137fc9	Update pld-rx limiting test(s) to use deco input The tests only use one input spec (conveniently) so there's not much to change in the logic, - only pass the `maybe_msg_spec` to the child-side decorator and obvi drop the surrounding `msgops.limit_plds()` block in the child. - tweak a few `MsgDec` asserts, mostly dropping the `msg._ops._def_any_spec` state checks since the child-side won't have any pre pld-spec state given the runtime now applies the `pld_spec` before running the task's func body. - also allowed dropping the `finally:` which did a similar check outside the `.limit_plds()` block.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	f83e06d371	Use new `._debug._repl_fail_msg` inside `test_pause_from_sync`	2025-03-24 14:04:52 -04:00
Tyler Goodlet	2f1a97e73e	Catch `.pause_from_sync()` in root bg thread bugs! Originally discovered as while using `tractor.pause_from_sync()` from the `i3ipc` client running in a bg-thread that uses `asyncio` inside `modden`. Turns out we definitely aren't correctly handling `.pause_from_sync()` from the root actor when called from a `trio.to_thread.run_sync()` bg thread: - root-actor bg threads which can't `Lock._debug_lock.acquire()` since they aren't in `trio.Task`s. - even if scheduled via `.to_thread.run_sync(_debug._pause)` the acquirer won't be the task/thread which calls `Lock.release()` from `PdbREPL` hooks; this results in a RTE raised by `trio`.. - multiple threads will step on each other's stdio since cpython's GIL seems to ctx switch threads on every input from the user to the REPL loop.. Reproduce via reworking our example and test so that they catch and fail for all edge cases: - rework the `/examples/debugging/sync_bp.py` example to demonstrate the above issues, namely the stdio clobbering in the REPL when multiple threads and/or a subactor try to debug simultaneously. \|_ run one thread using a task nursery to ensure it runs conc with the nursery's parent task. \|_ ensure the bg threads run conc a subactor usage of `.pause_from_sync()`. \|_ gravely detail all the special cases inside a TODO comment. \|_ add some control flags to `sync_pause()` helper and don't use `breakpoint()` by default. - extend and adjust `test_debugger.test_pause_from_sync` to match (and thus currently fail) by ensuring exclusive `PdbREPL` attachment when the 2 bg root-actor threads are concurrently interacting alongside the subactor: \|_ should only see one of the `_pause_msg` logs at a time for either one of the threads or the subactor. \|_ ensure each attaches (in no particular order) before expecting the script to exit. Impl adjustments to `.devx._debug`: - drop `Lock.repl`, no longer used. 
- add `Lock._owned_by_root: bool` for the `.ctx_in_debug == None` root-actor-task active case. - always `log.exception()` for any `._debug_lock.release()` ownership RTE emitted by `trio`, like we used to.. - add special `Lock.release()` log message for the stale lock but `._owned_by_root == True` case; oh yeah and actually `log.devx(message)`.. - rename `Lock.acquire()` -> `.acquire_for_ctx()` since it's only ever used from subactor IPC usage; well that and for local root-task usage we should prolly add a `.acquire_from_root_task()`? - buncha `._pause()` impl improvements: \|_ type `._pause()`'s `debug_func` as a `partial` as well. \|_ offer `called_from_sync: bool` and `called_from_bg_thread: bool` for the special case handling when called from `.pause_from_sync()` \|_ only set `DebugStatus.repl/repl_task` when `debug_func != None` (OW ensure the `.repl_task` is not the current one). \|_ handle error logging even when `debug_func is None`.. \|_ lotsa detailed commentary around root-actor-bg-thread special cases. - when `._set_trace(hide_tb=False)` do `pdbp.set_trace(frame=currentframe())` so the `._debug` internal frames are always included. - by default always hide tracebacks for `.pause[_from_sync]()` internals. - improve `.pause_from_sync()` to avoid root-bg-thread crashes: \|_ pass new `called_from_xxx_` flags and ensure `DebugStatus.repl_task` is actually set to the `threading.current_thread()` when needed. \|_ manually call `Lock._debug_lock.acquire_nowait()` for the non-bg thread case. \|_ TODO: still need to implement the bg-thread case using a bg `trio.Task`-in-thread with an `trio.Event` set by thread REPL exit.	2025-03-24 14:04:52 -04:00
Tyler Goodlet	4bc7569981	Woops, set `post_mortem=False` by default again!	2025-03-24 14:04:52 -04:00
Tyler Goodlet	15a47dc4f7	Finally, officially support shielded REPL-ing! It's been a long time prepped and now finally implemented! Offer a `shield: bool` argument from our async `._debug` APIs: - `await tractor.pause(shield=True)`, - `await tractor.post_mortem(shield=True)` ^-These-^ can now be used inside cancelled `trio.CancelScope`s, something very handy when introspecting complex (distributed) system tear/shut-downs particularly under remote error or (inter-peer) cancellation conditions B) Thanks to previous prepping in a prior attempt and various patches from the rigorous rework of `.devx._debug` internals around typed msg specs, there ain't much that was needed! Impl deats - obvi passthrough `shield` from the public API endpoints (was already done from a prior attempt). - put ad-hoc internal `with trio.CancelScope(shield=shield):` around all checkpoints inside `._pause()` for both the root-process and subactor case branches. Add a fairly rigorous example, `examples/debugging/shielded_pause.py` with a wrapping `pexpect` test, `test_debugger.test_shield_pause()` and ensure it covers as many cases as i can think of offhand: - multiple `.pause()` entries in a loop despite parent scope cancellation in a subactor RPC task which itself spawns a sub-task. - a `trio.Nursery.parent_task` which raises, is handled and tries to enter an unshielded `.post_mortem()`, which of course internally raises `Cancelled` in a `._pause()` checkpoint, so we catch the `Cancelled` again and then debug the debugger's internal cancellation with specific checks for the particular raising checkpoint-LOC. - do ^- the latter -^ for both subactor and root cases to ensure we can debug `._pause()` itself when it tries to REPL engage from a cancelled task scope Bo	2025-03-24 14:04:52 -04:00
Tyler Goodlet	5bab7648e2	Add a `tractor.post_mortem()` API test + example Since turns out we didn't have a single example using that API Bo The test granular-ly checks all use cases: - `.post_mortem()` manual calls in both subactor and root. - ensuring built-in RPC crash handling activates after each manual one from ^. - drafted some call-stack frame checking that i commented out for now since we need to first do ANSI escape code removal due to the colorization that `pdbp` does by default. \|_ added a TODO with SO link on `assert_before()`. Also todo-staged a shielded-pause test to match with the already existing-but-needs-refinement example B)	2025-03-24 14:04:52 -04:00
Tyler Goodlet	d099466d21	Change `reraise` to `post_mortem: bool` in `maybe_expect_raises()`	2025-03-24 14:04:52 -04:00
Tyler Goodlet	4b843d6219	Ensure only a boxed traceback for MTE on parent side	2025-03-24 14:04:51 -04:00
Tyler Goodlet	fa2893cc87	Ensure ctx error-state matches the MTE scenario Namely checking that `Context._remote_error` is set to the raised MTE in the invalid started and return value cases since prior to the recent underlying changes to the `Context.result()` impl, it would not match. Further, - do asserts for non-MTE raising cases in both the parent and child. - add todos for testing ctx-outcomes for per-side-validation policies i anticipate supporting and implied msg-dialog race cases therein.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	9dc7602f21	Fix `test_basic_payload_spec` bad msg matching Expecting `Started` or `Return` with respective bad `.pld` values depending on what type of failure is test parametrized. This makes the suite run green it seems B)	2025-03-24 14:04:51 -04:00
Tyler Goodlet	07ba69f697	Add basic payload-spec test suite Starts with some very basic cases: - verify both subactor-as-child-ctx-task send side validation (failures) as well as relay and raise on root-parent-side-task. - wrap failure expectation cases that bubble out of `@acm`s with a `maybe_expect_raises()` equiv wrapper with an embedded timeout. - add `Return` cases including invalid by `str` and valid by a `None`. Still ToDo: - commit impl changes to make the bulk of this suite pass. - adjust how `MsgTypeError`s format the local (`.started()`) send side `.tb_str` such that we don't do a "boxed" error prior to `pack_error()` being called normally prior to `Error` transit.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	92ac95ce24	Update debugger tests to expect new pformatting Mostly the result of the `RemoteActorError.pformat()` and our new `_pause/crash_msg: str`s which include the `trio.Task.__repr__()` in the `log.pdb()` message. Obvi use the `in_prompt_msg()` to accomplish where not used prior. ToDo later: -[ ] still some outstanding questions on how detailed inceptions should look, eg. in `test_multi_nested_subactors_error_through_nurseries()` \|_maybe we should be more pedantic at checking `.src_uid` vs. `.relay_uid` fields? -[ ] staged a placeholder test for verifying correct call-stack frame on crash handler REPL entry. -[ ] also need a test to verify that you can't pause from an already paused actor task such as can happen if you try to step through runtime code that has a recurrent entry to `._debug.pause()`.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	683288c8db	Update tests for `PldRx` and `Context` changes Mostly adjustments for the new pld-receiver semantics/shim-layer which results more often in the direct delivery of `RemoteActorError`s from IPC API primitives (like `Portal.result()`) instead of being embedded in an `ExceptionGroup` bundled from an embedded nursery. Tossed usage of the `debug_mode: bool` fixture to a couple problematic tests while i was working on them. Also includes detailed assertion updates to the inter-peer cancellation suite in terms of, - `Context.canceller` state correctly matching the true src actor when expecting a ctxc. - any rxed `ContextCancelled` should instance match the `Context._local/remote_error` as should the `.msgdata` and `._ipc_msg`.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	51de6bd1bc	Add a "current IPC `Context`" `ContextVar` Expose it from `._state.current_ipc_ctx()` and set it inside `._rpc._invoke()` for child and inside `Portal.open_context()` for parent. Still need to write a few more tests (particularly demonstrating usage throughout multiple nested nurseries on each side) but this suffices as a proto for testing with some debugger request-from-subactor stuff. Other, - use new `.devx.pformat.add_div()` for ctxc messages. - add a block to always traceback dump on corrupted cs stacks. - better handle non-RAEs exception output-formatting in context termination summary log message. - use a summary for `start_status` for msg logging in RPC loop.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	1c01608c72	More msg-spec tests tidying - Drop `test_msg_spec_xor_pld_spec()` since we no longer support `ipc_msg_spec` arg to `mk_codec()`. - Expect `MsgTypeError`s around `.open_context()` calls when `add_codec_hooks == False`. - toss in some `.pause()` points in the subactor ctx body whilst hacking out a `.pld` protocol for debug mode TTY locking.	2025-03-24 14:04:51 -04:00
Tyler Goodlet	dbebcc54cc	Unify `MsgTypeError` as a `RemoteActorError` subtype Since in the receive-side error case the source of the exception is the sender side (normally causing a local `TypeError` at decode time), might as well bundle the error in remote-capture-style using boxing semantics around the causing local type error raised from the `msgspec.msgpack.Decoder.decode()` and with a traceback packed from `msgspec`-specific knowledge of any field-type spec matching failure. Deats on new `MsgTypeError` interface: - includes a `.msg_dict` to get access to any `Decoder.type`-applied load of the original (underlying and offending) IPC msg into a `dict` form using a vanilla decoder which is normally packed into the instance as a `._msg_dict`. - a public getter to the "supposed offending msg" via `.payload_msg` which attempts to take the above `.msg_dict` and load it manually into the corresponding `.msg.types.MsgType` struct. - a constructor `.from_decode()` to make it simple to build out error instances from a failed decode scope where the aforementioned `msgdict: dict` from the vanilla decode can be provided directly. - ALSO, we now pack into `MsgTypeError` directly just like ctxc in `unpack_error()` This also completes the while-standing todo for `RemoteActorError` to contain a ref to the underlying `Error` msg as `._ipc_msg` with public `@property` access that `defstruct()`-creates a pretty struct version via `.ipc_msg`. Internal tweaks for this include: - `._ipc_msg` is the internal literal `Error`-msg instance if provided with `.ipc_msg` the dynamic wrapper as mentioned above. - `.__init__()` now can still take variable `**extra_msgdata` (similar to the `dict`-msgdata as before) to maintain support for subtypes which are constructed manually (not only by `pack_error()`) and insert their own attrs which get placed in a `._extra_msgdata: dict` if no `ipc_msg: Error` is provided as input. 
- the `.msgdata` is now a merge of any `._extra_msgdata` and a `dict`-casted form of any `._ipc_msg`. - adjust all previous `.msgdata` field lookups to try equivalent field reads on `._ipc_msg: Error`. - drop default single ws indent from `.tb_str` and do a failover lookup to `.msgdata` when `._ipc_msg is None` for the manually constructed subtype-instance case. - add a new class attr `.extra_body_fields: list[str]` to allow subtypes to declare attrs they want shown in the `.__repr__()` output, eg. `ContextCancelled.canceller`, `StreamOverrun.sender` and `MsgTypeError.payload_msg`. - ^-rework defaults pertaining to-^ with rename from `_msgdata_keys` -> `_ipcmsg_keys` with latter now just loading directly from the `Error` fields def and `_body_fields: list[str]` just taking that value and removing the not-so-useful-in-REPL or already shown (i.e. `.tb_str: str`) field names. - add a new mod level `.pack_from_raise()` helper for auto-boxing RAE subtypes constructed manually into `Error`s which is normally how `StreamOverrun` and `MsgTypeError` get created in the runtime. - in support of the above expose a `src_uid: tuple` override to `pack_error()` such that the runtime can provide any remote actor id when packing a locally-created yet remotely-caused RAE subtype. - adjust all typing to expect `Error`s over `dict`-msgs. Adjust some tests to match these changes: - context and inter-peer-cancel tests to make their `.msgdata` related checks against the new `.ipc_msg` as well and `.tb_str` directly. - toss in an extra sleep to `sleep_a_bit_then_cancel_peer()` to keep the 'canceller' ctx child task cancelled by it's parent in the 'root' for the rte-raised-during-ctxc-handling case (apparently now it's returning too fast, cool?).	2025-03-24 14:04:51 -04:00

657 Commits (4106ba73eacb772e5953a27bfb361a9f09767988)