Compare commits

...

250 Commits

Author SHA1 Message Date
Tyler Goodlet 23809b8468 Bump "task-manager(-nursery)" naming, add logging
Namely just renaming any `trio.Nursery` instances to `tn`, the primary
`@acm`-API to `.trionics.open_taskman()` and change out all `print()`s
for logger instances with 'info' level enabled by the mod-script usage.
2025-08-20 13:08:39 -04:00
Tyler Goodlet 60427329ee Add a new `.trionics._tn` for "task nursery stuff"
Since I'd like to decouple the new "task-manager-nursery" lowlevel
primitives/abstractions from the higher-level
`TaskManagerNursery`-supporting API(s) and default per-task
supervision-strat and because `._mngr` is already purposed for
higher-level "on-top-of-nursery" patterns as it is.

Deats,
- move `maybe_open_nursery()` into the new mod.
- adjust the pkg-mod's import to the new sub-mod.
- also draft up this idea for an API which stacks `._beg.collapse_eg()`
  onto a nursery with the WIP name `open_loose_tn()` but more than
  likely i'll just discard this idea bc i think the explicit `@acm`
  stacking is more explicit/pythonic/up-front-grokable despite the extra
  LoC.
2025-08-20 13:08:37 -04:00
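A quick sketch of the `open_loose_tn()` idea vs. the explicit `@acm` stacking preferred above; `collapse_eg()`'s import path is assumed and this is illustrative only, not the WIP impl:

```python
from contextlib import asynccontextmanager as acm

import trio
from tractor.trionics import collapse_eg  # assumed export path

@acm
async def open_loose_tn():
    # stack the beg-collapser "under" a plain (strict-eg) nursery so
    # callers get single-exc raising behaviour without the deprecated
    # `strict_exception_groups=False` flag.
    async with (
        collapse_eg(),
        trio.open_nursery() as tn,
    ):
        yield tn

# vs. the "explicit/pythonic" alternative leaned toward above:
#
# async with collapse_eg(), trio.open_nursery() as tn:
#     ...
```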
Tyler Goodlet f946041d44 Add `debug_mode: bool` control to task mngr
Allows dynamically importing `pdbp` when enabled and provides a way to
eventually link with `tractor`'s own debug mode flag.
2025-08-20 13:06:43 -04:00
Tyler Goodlet a4339d6ac6 Go all in on "task manager" naming 2025-08-20 13:06:43 -04:00
Tyler Goodlet 9123fbdbfa More refinements and proper typing
- drop unneeded (and commented) internal cs allocating bits.
- bypass all task manager stuff if no generator is provided by the
  caller; i.e. just call `.start_soon()` as normal.
- fix `Generator` typing.
- add some prints around task manager.
- wrap in `TaskOutcome.lowlevel_task: Task`.
2025-08-20 13:06:43 -04:00
Tyler Goodlet e7b3254b7b Ensure user-allocated cancel scope just works!
Turns out the nursery doesn't have to care about allocating a per task
`CancelScope` since the user can just do that in the
`@task_scope_manager` if desired B) So just mask all the nursery cs
allocating with the intention of removal.

Also add a test for per-task-cancellation by starting the crash task as
a `trio.sleep_forever()` but then cancel it via the user allocated cs
and ensure the crash propagates as expected 💥
2025-08-20 13:06:43 -04:00
Tyler Goodlet e468f62c26 Facepalm, don't pass in unnecessary cancel scope 2025-08-20 13:06:43 -04:00
Tyler Goodlet 6c65729c20 Do renaming, implement lowlevel `Outcome` sending
As was listed in the many todos, this changes the `.start_soon()` impl
to instead (manually) `.send()` into the user defined
`@task_scope_manager` an `Outcome` from the spawned task. In this case
the task manager wraps that in a user defined (and renamed)
`TaskOutcome` and delivers that + a containing `trio.CancelScope` to the
`.start_soon()` caller. Here the user defined `TaskOutcome` defines
a `.wait_for_result()` method that can be used to await the task's exit
and handle its underlying returned value or raised error; the
implementation could be different and subject to the user's own whims.

Note that if this were added to `trio`'s core, the `@task_scope_manager`
would by default either be implemented as a `None`-yielding
single-yield-generator or, more likely, be entirely ignored by the
runtime (as in no manual task outcome collecting, generator calling or
sending done at all) whenever the user does not provide
a `task_scope_manager` to the nursery at open time.
2025-08-20 13:06:43 -04:00
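A hedged, illustrative sketch of the single-yield generator protocol described above; the real glue lives in the task-manager nursery impl, and the assumption here is that the nursery drives the generator from inside the spawned task (first `next()` for whatever `.start_soon()` should hand back, then a `.send()` of the task's lowlevel `outcome.Outcome` once it exits):

```python
import outcome  # the lowlevel result/error capture lib `trio` uses
import trio

class TaskOutcome:
    # minimal stand-in for the user-defined wrapper described above
    def __init__(self):
        self._resolved = trio.Event()
        self._outcome: outcome.Outcome | None = None

    async def wait_for_result(self):
        # await the task's exit and return its value or re-raise its error
        await self._resolved.wait()
        return self._outcome.unwrap()

def task_scope_manager(task_outcome: TaskOutcome):
    # per-task resources (eg. a user-allocated `CancelScope`) can be
    # set up here before the single `yield`..
    with trio.CancelScope() as cs:
        # whatever is yielded is what `.start_soon()` hands back to
        # the caller; the nursery later `.send()`s in the `Outcome`.
        lowlevel_outcome: outcome.Outcome = yield (task_outcome, cs)

    # post-exit: stash the result/error so `.wait_for_result()` can
    # unwrap it later.
    task_outcome._outcome = lowlevel_outcome
    task_outcome._resolved.set()
```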
Tyler Goodlet 94fbbe0b05 Alias to `@acm` in broadcaster mod 2025-08-20 13:06:43 -04:00
Tyler Goodlet d5b54f3f5e Initial prototype for a one-cancels-one style supervisor, nursery thing.. 2025-08-20 13:06:43 -04:00
Tyler Goodlet fd314deecb Use shorthand nursery var-names per convention in codebase 2025-08-20 13:06:10 -04:00
Tyler Goodlet dd011c0b2f Better separate service tasks vs. ctxs via methods
Namely splitting the handles for each in 2 separate tables and adding
a `.cancel_service_task()`.

Also,
- move `_open_and_supervise_service_ctx()` to mod level.
- rename `target` -> `ctx_fn` params throughout.
- fill out method doc strings.
2025-08-20 13:06:10 -04:00
Tyler Goodlet 087aaa1c36 Mv over `ServiceMngr` from `piker` with mods
Namely distinguishing service "IPC contexts" (opened in a
subactor via a `Portal`) from just local `trio.Task`s started
and managed under the `.service_n` (more or less wrapped in the
interface of a "task-manager" style nursery - aka a one-cancels-one
supervision strat).

API changes from original (`piker`) impl,
- mk `.start_service_task()` do ONLY that, start a task with a wrapping
  cancel-scope and completion event.
  |_ ideally this gets factored-out/re-implemented using the
    task-manager/OCO-style-nursery from GH #363.
- change what was the impl of `.start_service_task()` to `.start_service_ctx()`
  since it more explicitly defines the functionality of entering
  `Portal.open_context()` with a wrapping cs and completion event inside
  a bg task (which syncs the ctx's lifetime with termination of the
  remote actor runtime).
- factor out what was a `.start_service_ctx()` closure to a new
  `_open_and_supervise_service_ctx()` mod-func holding the meat of
  the supervision logic.

`ServiceMngr` API brief,
- use `open_service_mngr()` and `get_service_mngr()` to acquire the
  actor-global singleton.
- `ServiceMngr.start_service()` and `.cancel_service()` which allow for
  straightforward mgmt of "service subactor daemons".
2025-08-20 13:06:10 -04:00
Tyler Goodlet 66b7410eab Initial idea-notes dump and @singleton factory idea from `trio`-gitter 2025-08-20 13:06:10 -04:00
Bd c9a55c2d46
Merge pull request #397 from goodboy/post_mortems
Fix root-actor crash handling despite runtime cancellation
2025-08-20 12:45:06 -04:00
Tyler Goodlet 548855b4f5 Comment/docs tweaks per copilot review
Add a micro glossary to clarify questioned terms and refine out some
patch specific comment regions.
2025-08-20 12:36:08 -04:00
Tyler Goodlet 5322861d6d Clean out old-commented tn-opens and ipc-server settings checks 2025-08-20 11:35:31 -04:00
Tyler Goodlet 46a2fa7074 Always pass a `tn` to `._server._serve_ipc_eps()`
Turns out we weren't despite the optional `stream_handler_nursery` input
to `Server.listen_on()`; fail over to the `Server._stream_handler_tn`
allocated during server setup in those cases.
2025-08-20 11:30:58 -04:00
Tyler Goodlet bfe5b2dde6 Hide `collapse_eg()` frame as used from `open_root_actor()` 2025-08-20 10:44:42 -04:00
Tyler Goodlet a9f06df3fb Heh, add back `Actor._root_tn`, it has purpose..
Turns out I didn't read my own internals docs/comments and despite it
not being used previously, this adds the real use case: a root,
per-actor, scope which ensures parent comms are the last conc-thing to
be cancelled.

Also, the impl changes here make the test from 6410e45 (or wtv
it's rebased to) pass, i.e. we can support crash handling in the root
actor despite the root-tn having been (self) cancelled.

Superficial adjustments,
- rename `Actor._service_n` -> `._service_tn` everywhere.
- add asserts to `._runtime.async_main()` which ensure that any
  `.trionics.maybe_open_nursery()` calls against optionally passed
  `._[root/service]_tn` are allocated-if-not-provided (the
  `._service_tn`-case being an i-guess-prep-for-the-future-anti-pattern
  Bp).
- obvi adjust all internal usage to match new naming.

Serious/real-use-case changes,
- add (back) an `Actor._root_tn` which sits a scope "above" the
  service-tn and is either,
  + assigned in `._runtime.async_main()` for sub-actors OR,
  + assigned in `._root.open_root_actor()` for the root actor.
  **THE primary reason** to keep this "upper" tn is that during
  a full-`Actor`-cancellation condition (more details below) we want to
  ensure that the IPC connection with a sub-actor's parent is **the last
  thing to be cancelled**; this is most simply implemented by ensuring
  that the `Actor._parent_chan: .ipc.Channel` is handled in an upper
  scope in `_rpc.process_messages()`-subtask-terms.
- for the root actor this `root_tn` is allocated in `.open_root_actor()`
  body and assigned as such.
- extend `Actor.cancel_soon()` to be cohesive with this entire teardown
  "policy" by scheduling a task in the `._root_tn` which,
  * waits for the `._service_tn` to complete and then,
  * cancels the `._root_tn.cancel_scope`,
  * includes "sclangy" console logging throughout.
2025-08-20 10:18:52 -04:00
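A tiny plain-`trio` sketch (not the actual `Actor` internals) of the scoping policy described above: the "root" scope sits above the "service" scope and is only cancelled once the latter has fully drained, so whatever lives in the root scope (standing in for parent-IPC handling here) is torn down last.

```python
import trio

async def main():
    async with trio.open_nursery() as root_tn:
        # stand-in for the parent-comms / `_rpc.process_messages()` task
        root_tn.start_soon(trio.sleep_forever)

        async with trio.open_nursery() as service_tn:
            # stand-in for normal service tasks
            service_tn.start_soon(trio.sleep, 0.1)

        # `.cancel_soon()`-style policy: only after the service scope
        # completes do we cancel the root scope.
        root_tn.cancel_scope.cancel()

trio.run(main)
```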
Tyler Goodlet ee32bc433c Add a root-already-cancelled crash handling test
Such that we audit the `shield=root_tn.cancel_scope.cancel_called,`
passed to `await debug._maybe_enter_pm()` in the `open_root_actor()`
exit handler block.
2025-08-20 10:18:52 -04:00
Tyler Goodlet 561954594e Add attempt at non-root-parent REPL guarding
I masked it bc it doesn't seem to actually work for the case I was
testing (`emsd` clobbering a `paperboi` in `piker`..) but figured I'd
leave it as a reminder for solving this problem more generally (#320)
since this is likely the place in the code for a soln.

When i tested it in my case it just resulted in a hang around the `with
debug.acquire_debug_lock()` for some reason? Can't remember if the child
ended up being able to REPL without issue though..
2025-08-19 14:15:14 -04:00
Tyler Goodlet 28a6354e81 Set `shield` when `.cancel_called` for root crashes
Such that we handle them despite a cancellation condition. This is
almost always the case, that `root_tn.cancel_scope.cancel_called` is
set, by the time the `debug._maybe_enter_pm()` hits. Previously I guess we
just weren't actually ever REPL-debugging such cases?

TODO, still needs a test obvi!
2025-08-19 14:14:38 -04:00
Tyler Goodlet d1599449e7 Mk `pause_from_sync()` raise `InternalError` on no `greenback` init 2025-08-19 14:14:27 -04:00
Tyler Goodlet 2d27c94dec Hide `_maybe_enter_pm()` frame (again?) 2025-08-19 14:14:27 -04:00
Tyler Goodlet 6e4c76245b Add LoC pattern matches for `test_post_mortem_api` 2025-08-19 14:14:27 -04:00
Bd a6f599901c
Merge pull request #395 from goodboy/to_asyncio_eoc_signal
`to_asyncio` eoc signal: use `trio.EndOfChannel` to indicate (maybe non-graceful) `asyncio.Task` termination
2025-08-19 12:45:23 -04:00
Tyler Goodlet 0fafd25f0d Comment tweaks per copilot review 2025-08-19 12:33:47 -04:00
Tyler Goodlet b74e93ee55 Change one infected-aio test to use `chan` in fn sig 2025-08-18 22:32:51 -04:00
Tyler Goodlet 961504b657 Support `chan.started_nowait()` in `.open_channel_from()` target
That is the `target` can declare a `chan: LinkedTaskChannel` instead of
`to_trio`/`from_aio`.

To support it,
- change `.started()` -> the more appropriate `.started_nowait()` which
  can be called sync from the aio child task.
- adjust the `provide_channels` assert to accept either fn sig
  declaration (for now).

Still needs test(s) obvi..
2025-08-18 22:32:51 -04:00
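A hedged usage sketch of the new `target` signature support described above; the `open_channel_from()` call shape follows existing `tractor.to_asyncio` examples but exact APIs may differ:

```python
import asyncio

import tractor.to_asyncio as to_asyncio

async def aio_child(chan) -> None:  # `chan: LinkedTaskChannel`
    # sync-callable "first value" hand-off from the asyncio side,
    # instead of the old `to_trio`/`from_aio` pair in the fn sig.
    chan.started_nowait('ready')
    await asyncio.sleep(0.1)

async def trio_parent():
    async with to_asyncio.open_channel_from(
        aio_child,
    ) as (first, chan):
        assert first == 'ready'
```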
Tyler Goodlet bd148300c5 Relay `asyncio` errors via EoC and raise from rent
Makes the newly added `test_aio_side_raises_before_started` test pass by
ensuring errors raised by any `.to_asyncio.open_channel_from()`-spawned
child-`asyncio.Task` are relayed on any caught `trio.EndOfChannel` by
checking for a new `LinkedTaskChannel._closed_by_aio_task: bool`.

Impl deats,
- obvi add `LinkedTaskChannel._closed_by_aio_task: bool = False`
- in `translate_aio_errors()` always check for the new flag on EOC
  conditions and in such cases set `chan._trio_to_raise = aio_err` such
  that the `trio`-parent-task always raises the child's exception
  directly, OW keep original EoC passthrough in place.
- include *very* detailed per-case comments around the extended handler.
- adjust re-raising logic with a new `raise_from` where we only give the
  `aio_err` priority if it's not already set as `trio_to_raise`.

Also,
- hide the `_run_asyncio_task()` frame by def.
2025-08-18 22:32:51 -04:00
Tyler Goodlet 4a7491bda4 Add "raises-pre-started" `open_channel_from()` test
Verifying that if any exc is raised pre `chan.send_nowait()` (our
currently shite version of a `chan.started()`) then that exc is indeed
raised through on the `trio`-parent task side. This case was reproduced
from a `piker.brokers.ib` issue with a similar embedded
`.trionics.maybe_open_context()` call.

Deats,
- call the suite `test_aio_side_raises_before_started`.
- mk the `@context` simply `maybe_open_context(acm_func=open_channel_from)`
  with a `target=raise_before_started` which,
- simply sleeps then immediately raises a RTE.
- expect the RTE from the aio-child-side to propagate all the way up to
  the root-actor's task right up through the `trio.run()`.
2025-08-18 22:32:51 -04:00
Bd 62415518fc
Merge pull request #394 from goodboy/nursery_cleaning
A bit of (actor) nursery cleaning
2025-08-18 22:32:19 -04:00
Tyler Goodlet 5c7d930a9a Drop unused `Actor._root_n`.. 2025-08-18 22:16:03 -04:00
Tyler Goodlet c46986504d Switch nursery to `CancelScope`-status properties
Been meaning to do this forever and a recent test hang finally drove me
to it Bp

Like it sounds, adopt on `ActorNursery` the "cancel-status" properties
already used on our `Context` and derived from `trio.CancelScope`:

- add new private `._cancel_called` (set in the head of `.cancel()`)
  & `._cancelled_caught` (set in the tail) instance vars with matching
  read-only `@properties`.

- drop the instance-var and instead delegate a `.cancelled: bool`
  property to `._cancel_called` and add a usage deprecation warning
  (since removing it breaks a buncha tests).
2025-08-18 22:16:03 -04:00
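The property pattern being adopted, sketched in isolation (illustrative, not the verbatim `ActorNursery` code):

```python
import warnings

class ActorNursery:
    _cancel_called: bool = False     # set at the head of `.cancel()`
    _cancelled_caught: bool = False  # set at the tail of `.cancel()`

    @property
    def cancel_called(self) -> bool:
        return self._cancel_called

    @property
    def cancelled_caught(self) -> bool:
        return self._cancelled_caught

    @property
    def cancelled(self) -> bool:
        # legacy accessor kept behind a deprecation warning since
        # outright removal breaks a buncha tests.
        warnings.warn(
            '`.cancelled` is deprecated, use `.cancel_called`',
            DeprecationWarning,
            stacklevel=2,
        )
        return self._cancel_called
```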
Tyler Goodlet e05a4d3cac Enforce named-args only to `.open_nursery()` 2025-08-18 22:16:03 -04:00
Bd a9aa5ec04e
Merge pull request #392 from goodboy/introspect_ipc
Introspect-ipc: some `.ipc` subpkg iface refinements for reading cancel statuses and `Address.__repr__()`
2025-08-18 22:15:40 -04:00
Tyler Goodlet 5021514a6a Disable shm resource tracker via flag on 3.13+
As per the newly added support,
https://docs.python.org/3/library/multiprocessing.shared_memory.html
2025-08-18 22:04:40 -04:00
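What the flag toggle amounts to on 3.13+ (per the linked stdlib docs); the version gate is shown since older interpreters don't accept `track`:

```python
import sys
from multiprocessing.shared_memory import SharedMemory

kwargs = {}
if sys.version_info >= (3, 13):
    # opt out of the resource tracker so it doesn't unlink segments
    # owned/shared by other processes on interpreter exit.
    kwargs['track'] = False

shm = SharedMemory(name='example_seg', create=True, size=4096, **kwargs)
try:
    shm.buf[:5] = b'hello'
finally:
    shm.close()
    shm.unlink()
```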
Tyler Goodlet 79f502034f Don't hard code runtime-dir, read it with `._state.get_rt_dir()` 2025-08-18 21:30:48 -04:00
Tyler Goodlet 331921f612 Hmm disable CRE case for now, causes test fails
So i need to either adjust the tests or figure out if/why this is needed
to avoid the crashing in `pikerd` i found when killin the chart during
a long backfill with `binance` backend..
2025-08-18 21:30:48 -04:00
Tyler Goodlet df0d00abf4 Translate CRE's due to socket-close to tpt-closed
Just like in the BRE case (for UDS) it seems when a peer closes the
(UDS?) socket `trio` instead raises a `ClosedResourceError` which we now
catch and re-raise as a `TransportClosed`. This again results in
`tpt.send()` calls from the rpc-runtime **not** raising when it's known
that the IPC channel is disconnected.
2025-08-18 21:30:48 -04:00
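Rough shape of the translation described above (illustrative; `TransportClosed`'s import path and constructor are assumed here):

```python
import trio
from tractor._exceptions import TransportClosed  # assumed path

async def tpt_send(stream: trio.abc.SendStream, data: bytes) -> None:
    try:
        await stream.send_all(data)
    except trio.ClosedResourceError as cre:
        # peer closed the (UDS) socket underneath us; re-raise as our
        # graceful "transport closed" signal so rpc-runtime senders
        # don't crash on a known-disconnected channel.
        raise TransportClosed(
            'Peer closed the transport',
        ) from cre
```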
Tyler Goodlet a72d1e6c48 Multi-line-style up the UDS fast-connect handler
Shift around comments and expressions for better reading, assign
`tpt_closed` for easier introspection from REPL during debug oh and fix
the `MsgpackTransport.pformat()` to render '|_peers: 1' .. XD
2025-08-18 21:30:48 -04:00
Tyler Goodlet 5931c59aef Log "out-of-layer" cancellation in `._rpc._invoke()`
Similar to what was just changed for `Context.repr_state`, when the
child task is cancelled but by a different "layer" of the runtime (i.e.
a `Portal.cancel_actor()` / `SIGINT`-to-process canceller) we don't
dump a traceback instead just `log.cancel()` emit.
2025-08-18 21:30:48 -04:00
Tyler Goodlet ba08052ddf Handle "out-of-layer" remote `Context` cancellation
Such that if the local task hasn't resolved but is `trio.Cancelled` and
a `.canceller` was set, we report a `'actor-cancelled'` from
`.repr_state: str`. Bit of formatting to avoid needless newlines too!
2025-08-18 21:30:48 -04:00
Tyler Goodlet 00112edd58 UDS: implicitly create `Address.bindspace: Path`
Since it's merely a local-file-sys subdirectory and there should be no
reason file creation conflicts with other bind spaces.

Also add 2 test suites to match,
- `tests/ipc/test_each_tpt::test_uds_bindspace_created_implicitly` to
  verify the dir creation when DNE.
- `..test_uds_double_listen_raises_connerr` to ensure a double bind
  raises a `ConnectionError` from the src `OSError`.
2025-08-18 21:30:48 -04:00
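A minimal sketch of the implicit-creation behaviour, under the assumption the bindspace is just a runtime-dir subdirectory holding the UDS file:

```python
from pathlib import Path

def ensure_bindspace(bindspace: Path) -> Path:
    # merely a local-filesystem subdir, so creating it when it DNE
    # can't conflict with other bind spaces.
    bindspace.mkdir(parents=True, exist_ok=True)
    return bindspace
```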
Tyler Goodlet 1d706bddda Rm `assert` from `Channel.from_addr()`, for UDS we re-created to extract the peer PID 2025-08-18 21:30:48 -04:00
Tyler Goodlet 3c30c559d5 `ipc._uds`: assign `.l/raddr` in `.connect_to()`
Using `.get_stream_addrs()` such that we always (*can*) assign the peer
end's PID in the `._raddr`.

Also factor common `ConnectionError` re-raising into
a `_reraise_as_connerr()`-@cm.
2025-08-18 21:30:48 -04:00
Tyler Goodlet 599020c2c5 Rename all lingering ctx-side bits
As before but more thoroughly in comments and var names finally changing
all,
- caller -> parent
- callee -> child
2025-08-18 21:30:48 -04:00
Tyler Goodlet 50f6543ee7 Add `Channel.closed/.cancel_called`
I.e. the public properties for the private instance var equivs; improves
expected introspection usage.
2025-08-18 21:30:48 -04:00
Tyler Goodlet c0854fd221 Set `Channel._cancel_called` via `chan` var
In `Portal.cancel_actor()` that is, at the least to make it easier to
ref search from an editor Bp
2025-08-18 21:30:48 -04:00
Tyler Goodlet e875b62869 Add `.ipc._shm` todo-idea for `@actor_fixture` API 2025-08-18 21:30:48 -04:00
Tyler Goodlet 3ab7498893 Add todo for py3.13+ `.shared_memory`'s new `track=False` support.. finally they added it XD 2025-08-18 21:30:48 -04:00
Bd dd041b0a01
Merge pull request #393 from goodboy/trionics_tweaks
Trionics tweaks: some `._mngrs` refinements and fix a `test_resource_cache` hang
2025-08-18 21:20:33 -04:00
Tyler Goodlet 4e252526b5 Accept `tn` to `gather_contexts()/maybe_open_context()`
Such that the caller can be responsible for their own (nursery) scoping
as needed and, for the latter fn's case with
a `trio.Nursery.CancelStatus.encloses()` check to ensure the `tn` is
a valid parent-ish.

Some deats,
- in `gather_contexts()`, mv the `try/finally` outside the nursery block
  to ensure we always do the `parent_exit`.
- for `maybe_open_context()` we do a naive task-tree hierarchy audit to
  ensure the provided scope is not *too* child-ish (with what APIs `trio`
  gives us, see above), OW go with the old approach of using the actor's
  private service nursery.
  Also,
  * better report `trio.Cancelled` around the cache-miss `yield`
    cases and ensure we **never** unmask triggering key-errors.
  * report on any stale-state with the mutex in the `finally` block.
2025-08-18 21:07:12 -04:00
Tyler Goodlet 4ba3590450 Add `.trionics.maybe_open_context()` locking test
Call it `test_lock_not_corrupted_on_fast_cancel()` and include
a detailed doc string to explain. Implemented it "cleverly" by having
the target `@acm` cancel its parent nursery after a peer, cache-hitting
task, is already waiting on the task mutex release.
2025-08-18 21:07:12 -04:00
Tyler Goodlet f1ff79a4e6 Always `finally` invoke cache-miss `lock.release()`s
Since the `await service_n.start()` on key-err can be cancel-masked
(checkpoint interrupted before `_Cache.run_ctx` completes), we need to
always `lock.release()` in a `finally` to avoid lock-owner-state corruption and/or
inf-hangs in peer cache-hitting tasks.

Deats,
- add a `try/except/finally` around the key-err triggered cache-miss
  `service_n.start(_Cache.run_ctx, ..)` call, reporting on any taskc
  and always `finally` unlocking.
- fill out some log msg content and use `.debug()` level.
2025-08-18 21:07:12 -04:00
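The gist of the fix, sketched with placeholder names (`run_ctx`, `key`): the cache-miss branch must release the mutex even when the `.start()` checkpoint gets cancelled, otherwise peer cache-hitting tasks hang on the lock.

```python
import trio

async def cache_miss_start(
    service_n: trio.Nursery,
    lock: trio.Lock,
    run_ctx,  # the `_Cache.run_ctx`-style task fn
    key,
) -> None:
    await lock.acquire()
    try:
        await service_n.start(run_ctx, key)
    except trio.Cancelled:
        # report the interrupted checkpoint (taskc) then re-raise
        raise
    finally:
        # ALWAYS unlock, even on cancel-masked starts
        lock.release()
```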
Tyler Goodlet 70664b98de Well then, I guess it just needed, a checkpoint XD
Here I was thinking the bcaster (usage) maybe required a rework but,
NOPE it's just bc a checkpoint was needed in the parent task owning the
`tn` which spawns `get_sub_and_pull()` tasks to ensure the bg allocated
`an`/portal is eventually cancel-called..

Ah well, at least i started a patch for `MsgStream.subscribe()` to make
it multicast revertible.. XD

Anyway, I tossed in some checks & notes related to all that unnecessary
effort since I do think i'll move forward implementing it:
- for the `cache_hit` case always verify that the `bcast` clone is
  unregistered from the common state subs after
  `.subscribe().__aexit__()`.
- do a light check that the implicit `MsgStream._broadcaster` is always
  the only bcrx instance left-leaked into that state.. that is until
  i get the proper de-allocation/reversion from multicast -> unicast
  working.
- put in mega detailed note about the required parent-task checkpoint.
2025-08-18 21:07:12 -04:00
Tyler Goodlet 1c425cbd22 Tool-up `test_resource_cache.test_open_local_sub_to_stream`
Since I recently discovered a very subtle race-case that can sometimes
cause the suite to hang, seemingly due to the `an: ActorNursery`
allocated *behind* the `.trionics.maybe_open_context()` usage; this can
result in never cancelling the 'streamer' subactor despite the `main()`
timeout-guard?

This led me to dig in and find that the underlying issue was 2-fold,

- our `BroadcastReceiver` termination-mgmt semantics in
  `MsgStream.subscribe()` can result in the first subscribing task to
  always keep the `MsgStream._broadcaster` instance allocated; it's
  never `.aclose()`ed, which makes it tough to determine (and thus
  trace) when all subscriber-tasks are actually complete and
  exited-from-`.subscribe()`..

- i was shield waiting `.ipc._server.Server.wait_for_no_more_peers()` in
  `._runtime.async_main()`'s shutdown sequence which would then compound
  the issue resulting in a SIGINT-shielded hang.. the worst kind XD

Actual changes here are just styling, printing, and some mucking with
passing the `an`-ref up to the parent task in the root-actor where i was
doing a conditional `ActorNursery.cancel()` to mk sure that was actually
the problem. Presuming this is fixed the `.pause()` i left unmasked
should never hit.
2025-08-18 21:07:06 -04:00
Tyler Goodlet edc2211444 Go multi-line-style tuples in `maybe_enter_context()`
Allows for an inline comment of the first "cache hit" bool element.
2025-08-18 20:55:18 -04:00
Bd b05abea51e
Merge pull request #390 from goodboy/strict_egs_everywhere
Strict egs everywhere: drop use of `strict_exception_groups=False` throughout!
2025-08-18 14:15:49 -04:00
Tyler Goodlet 88c1c083bd Add timeout to inf-streamer test 2025-08-18 13:31:15 -04:00
Tyler Goodlet b096867d40 Remove lingering seg=False-flags from tests 2025-08-18 12:03:32 -04:00
Tyler Goodlet a3c9822602 Remove lingering seg=False-flags from examples 2025-08-18 12:03:10 -04:00
Tyler Goodlet e3a542f2b5 Never shield-wait `ipc_server.wait_for_no_more_peers()`
As mentioned in prior testing commit, it can cause the worst kind of
hangs, the SIGINT ignoring kind.. Pretty sure there was never any reason
outside some esoteric multi-actor debugging case, and pretty sure that
already was solved?
2025-08-18 10:46:37 -04:00
Tyler Goodlet 0ffcea1033 Adjust `test_trio_prestarted_task_bubbles()` suite to expect non-eg raises 2025-08-18 10:46:37 -04:00
Tyler Goodlet a7bdf0486c Styling tweaks to quadruple streaming test fn 2025-08-18 10:46:37 -04:00
Tyler Goodlet d2ac9ecf95 Resolve `test_cancel_while_childs_child_in_sync_sleep`
Was failing due to the `.fail_after()` timeout being *too short* and
somehow the new interplay of that with strict-exception groups resulting
in the `TooSlowError` never raising but instead an eg with the embedded
`AssertionError`?? I still don't really get it honestly..

I've written up lengthy notes around the different `delay` settings that
can be used to see the diff outcomes, the failing case being the one
i still don't really grok and think is justification for `trio` to
bubble inner `Cancelled`s differently possibly?

For now i've included the original failing case as an `xfail`
parametrization which will hopefully drive a follow-up lowlevel
`trio` test in `test_trioisms`!
2025-08-18 10:46:37 -04:00
Tyler Goodlet dcb1062bb8 Fix cluster suite, chng to new `gather_contexts()`
Namely `test_empty_mngrs_input_raises()` was failing due to
lazy-iterator use as input to `mngrs` which i guess i added support for
a while back (by it doing a `list(mngrs)` internally)? So just change it
to `gather_contexts(mngrs=())` and also tweak the `trio.fail_after(3)`
since it appears that the prior 1sec was causing
too-fast-of-a-cancellation (before the cluster fully spawned) and thus
the expected `ValueError` never to show..

Also, mask the `tractor.trionics.collapse_eg()` usage (again?) in
`open_actor_cluster()` since it seems unnecessary.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 05d865c0f1 WIP tinkering with strict-eg-tns and cluster API
Seems that the way the actor-nursery interacts with the
`.trionics.gather_contexts()` API on cancellation makes our
`.trionics.collapse_eg()` not work as intended?

I need to dig into how `ActorNursery.cancel()` and `.__aexit__()` might
be causing this discrepancy..

Consider this a commit-of-my-index type save for rn.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 8218f0f51f Bit of multi-line styling / name tweaks in cancellation suites 2025-08-18 10:46:37 -04:00
Tyler Goodlet 8f19f5d3a8 Mk temp collapser bp work outside runtime as well.. 2025-08-18 10:46:37 -04:00
Tyler Goodlet 64c27a914b Add temp breakpoint support to `collapse_eg()` 2025-08-18 10:46:37 -04:00
Tyler Goodlet d9c8d543b3 Suppress beg tbs from `collapse_eg()`
It was originally this way; I forgot to flip it back when discarding the
`except*` handler impl..

Specially handle the `exc.__cause__` case where we raise from any
detected underlying cause and OW `from None` to suppress the eg's tb.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 048b154f00 Rework `collapse_eg()` to NOT use `except*`..
Since it turns out the semantics are basically inverse of normal
`except` (particularly for re-raising) which is hard to get right, and
bc it's a lot easier to just delegate to what `trio` already has behind
the `strict_exception_groups=False` setting, Bp

I added a rant here which will get removed shortly likely, but i think
going forward recommending against use of `except*` is prudent for
anything low level enough in the runtime (like trying to filter begs).

Dirty deats,
- copy `trio._core._run.collapse_exception_group()` to here with only
  a slight mod to remove the notes check and tb concatting for the
  collapse case.
- rename `maybe_collapse_eg()` -> `get_collapsed_eg()` and delegate it
  directly to the former `trio` fn; return `None` when it returns the
  same beg without collapse.
- simplify our own `collapse_eg()` to either raise the collapsed `exc`
  or original `beg`.
2025-08-18 10:46:37 -04:00
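A hedged sketch of the collapse behaviour described above (the real impl borrows `trio._core._run.collapse_exception_group()`; names here mirror the commit but the code is illustrative only):

```python
from contextlib import asynccontextmanager as acm

def get_collapsed_eg(beg: BaseExceptionGroup) -> BaseException | None:
    # a (possibly nested) single-member group collapses down to that
    # member; return `None` when no collapse applies.
    if len(beg.exceptions) != 1:
        return None
    exc = beg.exceptions[0]
    if isinstance(exc, BaseExceptionGroup):
        return get_collapsed_eg(exc) or exc
    return exc

@acm
async def collapse_eg():
    try:
        yield
    except BaseExceptionGroup as beg:
        if (exc := get_collapsed_eg(beg)) is not None:
            # raise from any underlying cause, OW `from None` to
            # suppress the group's tb.
            raise exc from (exc.__cause__ or None)
        raise beg
```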
Tyler Goodlet 88828e9f99 Couple more `._root` logging tweaks.. 2025-08-18 10:46:37 -04:00
Tyler Goodlet 25ff195c17 Use collapser around `root_tn` in `async_main()`
Replacing yet another loose-eg-flag. Also toss in a todo to maybe use
the unmasker around the `open_root_actor()` body.
2025-08-18 10:46:37 -04:00
Tyler Goodlet f60cc646ff Facepalm, fix `raise from` in `collapse_eg()`
I dunno what exactly I was thinking but we definitely don't want to
**ever** raise from the original exc-group, instead always raise from
any original `.__cause__` to be consistent with the embedded src-error's
context.

Also, adjust `maybe_collapse_eg()` to return `False` in the non-single
`.exceptions` case, again don't know what I was trying to do but this
simplifies caller logic and the prior return-semantic had no real
value..

This fixes some final usage in the runtime (namely top level nursery
usage in `._root`/`._runtime`) which was previously causing test suite
failures prior to this fix.
2025-08-18 10:46:37 -04:00
Tyler Goodlet a2b754b5f5 Just import `._runtime` ns in `._root`; be a bit more explicit 2025-08-18 10:46:37 -04:00
Tyler Goodlet 5e13588aed Use collapse in `._root.open_root_actor()` too
Seems to add one more cancellation suite failure as well as now cause
the discovery test to error instead of fail?
2025-08-18 10:46:37 -04:00
Tyler Goodlet 0a56f40bab Use collapser around root tn in `.async_main()`
Seems to cause the following test suites to fail however..

- 'test_advanced_faults.py::test_ipc_channel_break_during_stream'
- 'test_advanced_faults.py::test_ipc_channel_break_during_stream'
- 'test_clustering.py::test_empty_mngrs_input_raises'

Also tweak some ctxc request logging content.
2025-08-18 10:46:37 -04:00
Tyler Goodlet f776c47cb4 Drop msging-err patt from `subactor_breakpoint` ex
Since the `bdb` module was added to the namespace lookup set in
`._exceptions.get_err_type()` we can now relay a RAE-boxed
`bdb.BdbQuit`.
2025-08-18 10:46:37 -04:00
Tyler Goodlet 7f584d4f54 Switch to strict-eg nurseries almost everywhere
That is just throughout the core library, not the tests yet. Again, we
simply change over to using our (nearly equivalent?)
`.trionics.collapse_eg()` in place of the already deprecated
`strict_exception_groups=False` flag in the following internals,
- the conc-fan-out tn use in `._discovery.find_actor()`.
- `._portal.open_portal()`'s internal tn used to spawn a bg rpc-msg-loop
  task.
- the daemon and "run-in-actor" layered tn pair allocated in
  `._supervise._open_and_supervise_one_cancels_all_nursery()`.

The remaining loose-eg usage in `._root` and `._runtime` seem to be
necessary to keep the test suite green?? For the moment these are left
out.
2025-08-18 10:46:37 -04:00
Tyler Goodlet d650dda0fa Use collapser in rent side of `Context` 2025-08-18 10:46:37 -04:00
Tyler Goodlet f6598e8400 Add some tooling params to `collapse_eg()` 2025-08-18 10:46:37 -04:00
Bd 59822ff093
Merge pull request #389 from goodboy/better_reprs
Better `repr()`s: more console friendly reprentations of internal primitives
2025-08-16 17:20:02 -04:00
Tyler Goodlet ca427aec7e More prep-to-reduce the `Actor` method-iface
- drop the (never/un)used `.get_chans()`.
- add #TODO for factoring many methods into a new `.rpc`-subsys/pkg
  primitive, like an `RPCMngr/Server` type eventually.
- add todo to maybe mv `.get_parent()` elsewhere?
- move masked `._hard_mofo_kill()` to bottom.
2025-08-16 17:06:23 -04:00
Tyler Goodlet f53aa992af .log: expose `at_least_level()` as `StackLevelAdapter` meth 2025-08-15 17:29:22 -04:00
Tyler Goodlet 69e0afccf0 Use `Address` where possible in (root) actor boot
Namely inside various bootup-sequences in `._root` and `._runtime`
particularly in the root actor to support both better tpt-address
denoting in our logging and as part of clarifying logic around setting
the root's registry addresses which is soon to be much better factored
out of the core and into an explicit subsystem + API.

Some `_root.open_root_actor()` deats,
- set `registry_addrs` to a new `uw_reg_addrs` (uw: unwrapped) to be
  more explicit about wrapped addr types throughout.
- instead ensure `registry_addrs` are the wrapped types and pass down
  into the root `Actor` singleton-instance.
- factor the root-actor check + rt-vars update (updating the `'_root_addrs'`)
  out of `._runtime.async_main()` into this fn.
- as previously, set `trans_bind_addrs = uw_reg_addrs` in unwrapped form since it will
  be passed down both through rt-vars as `'_root_addrs'` and to
  `._runtime.async_main()` as `accept_addrs` (which is then passed to the
  IPC server).
- adjust/simplify much logging.
- shield the `await actor.cancel(None)  # self cancel` to avoid any
  finally-footguns.
- as mentioned convert the

For `_runtime.async_main()` tweaks,
- expect `registry_addrs: list[Address]|None = None` with appropriate
  unwrapping prior to setting both `.reg_addrs` and the equiv rt-var.
- add a new `.registry_addrs` prop for the wrapped form.
- convert a final loose-eg for the `service_nursery` to use
  `collapse_eg()`.
- simplify teardown report logging.
2025-08-15 17:29:10 -04:00
Tyler Goodlet e275c49b23 Stackscope import fail msg dun need braces.. 2025-08-15 16:34:03 -04:00
Tyler Goodlet 48fbf38c1d Drop duplicated (masked) debugging-`terminate_after`, prolly a rebase slip.. 2025-08-15 16:33:31 -04:00
Tyler Goodlet defd6e28d2 Facepalm, actually use `.log.cancel()`-level to report parent-side taskc.. 2025-08-15 16:31:52 -04:00
Tyler Goodlet 414b0e2bae Update buncha log msg fmting in `.msg._ops`
Mostly just multi-line code styling again: always putting standalone
`'f\n'` on separate LOC so it reads like it renders to console. Oh, and
a level drop to `.runtime()` for rx-msg reports.
2025-08-15 16:30:10 -04:00
Tyler Goodlet d34fb54f7c Update buncha log msg fmting in `._spawn`
Again using `Channel.aid.reprol()`, `.devx.pformat.nest_from_op()` and
converting to multi-line code style for str-report-contents. Tweak
some imports to sub-mod level as well.
2025-08-15 16:29:17 -04:00
Tyler Goodlet 5d87f63377 Update buncha log msg fmting in `._portal`
Namely to use `Channel.aid.reprol()` and converting to our newer style
multi-line code style for str-reports.
2025-08-15 16:29:11 -04:00
Tyler Goodlet 0ca3d50602 Use `._supervise._shutdown_msg` in tooling test 2025-08-15 16:29:05 -04:00
Tyler Goodlet 8880a80e3e Use `nest_from_op()`/`pretty_struct` in `._rpc`
Again for nicer console logging. Also fix a double `req_chan` arg bug
when passed to `_invoke` in the `self.cancel()` rt-ep; don't update the
`kwargs: dict` just merge in `req_chan` input at call time.
2025-08-15 16:28:46 -04:00
Tyler Goodlet 7be713ee1e Use `nest_from_op()` in actor-nursery shutdown
Including a new one-line `_shutdown_msg: str` which we mod-var-set for
testing usage and some denoising at `.info()` level. Adjust `Actor()`
instantiating input to the new `.registry_addrs` wrapped addrs property.
2025-08-15 16:28:30 -04:00
Tyler Goodlet 4bd8211abb Add #TODO for `._context` to use `.msg.Aid` 2025-08-15 16:24:35 -04:00
Tyler Goodlet a23a98886c Even more `.ipc.*` repr refinements
Mostly adjusting indentation, noise level, and clarity via `.pformat()`
tweaks and more general use of `.devx.pformat.nest_from_op()`.

Specific impl deats,
- use `pformat.ppfmt()`/`nest_from_op()` more seriously throughout
  `._server`.
- add a `._server.Endpoint.pformat()`.
- add `._server.Server.len_peers()` and `.repr_state()`.
- polish `Server.pformat()`.
- drop some redundant `log.runtime()`s from `._serve_ipc_eps()` instead
  leaving-them-only/putting-them in the caller pub meth.
- `._tcp.start_listener()` log the bound addr, not the input (which may
  be the 0-port).
2025-08-15 16:24:27 -04:00
Tyler Goodlet 31544c862c More `.ipc.Channel`-repr related tweaks
- only generate a repr in `.from_addr()` when log level is >= 'runtime'.
 |_ add a todo about supporting this optimization more generally on our
   adapter.
- fix `Channel.pformat()` to show unknown peer field line fmt correctly.
- add a `Channel.maddr: str` which just delegates directly to the
  `._transport` like other pass-thru property fields.
2025-08-15 16:24:22 -04:00
Tyler Goodlet 7d320c4e1e Mk `Aid` hashable, use pretty-`.__repr__()`
Hash on the `.uuid: str` and delegate verbatim to
`msg.pretty_struct.Struct`'s equiv method.
2025-08-15 16:24:15 -04:00
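Gist of the change above in isolation (the real `Aid` is a `msg.pretty_struct.Struct` subclass; this stand-in just shows the uuid-based hashing):

```python
class Aid:
    def __init__(self, name: str, uuid: str):
        self.name = name
        self.uuid = uuid

    def __hash__(self) -> int:
        # the uuid is unique per actor instance, so hash purely on it
        # allowing `Aid`s to be used as set members / dict keys.
        return hash(self.uuid)

    def __repr__(self) -> str:
        # delegated verbatim to the pretty-struct repr in the real impl
        return f'Aid(name={self.name!r}, uuid={self.uuid!r})'
```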
Tyler Goodlet 38944ad1d2 Drop `actor_info: str` from `._entry` logs 2025-08-15 16:24:06 -04:00
Tyler Goodlet 9260909fe1 Try `nest_from_op()` in some `._rpc` spots
To start trying out,
- using in the `Start`-msg handler-block to repr the msg coming
  *from* a `repr(Channel)` using '<=)` sclang op.
- for a completed RPC task in `_invoke_non_context()`.
- for the msg loop task's termination report.
2025-08-15 16:23:59 -04:00
Tyler Goodlet c00b3c86ea Hide more `Channel._transport` privates for repr
Such as the `MsgTransport.stream` and `.drain` attrs since they're
rarely that important at the chan level. Also start adopting
a `.<attr>=` style for actual attrs of the type versus a `<name>: `
style for meta-field info lines.
2025-08-15 16:23:54 -04:00
Tyler Goodlet 808a336508 Refine `Actor` status iface, use `Aid` throughout
To simplify `.pformat()` output when the new `privates: bool` is unset
(the default) this adds new public attrs to wrap an actor's
cancellation status as well as provide a `.repr_state: str` (similar to
our equiv on `Context`). Rework `.pformat()` to render a much simplified
repr using all these new refinements.

Further, port the `.cancel()` method to use `.msg.types.Aid` for all
internal `requesting_uid` refs (now renamed with `_aid`) and in all
called downstream methods.

New cancel-state iface deats,
- rename `._cancel_called_by_remote` -> `._cancel_called_by` and expect
  it to be set as an `Aid`.
- add `.cancel_complete: bool` which flags whether `.cancel()` ran to
  completion.
- add `.cancel_called: bool` which just wraps `._cancel_called` (and
  which likely will just be dropped since we already have
  `._cancel_called_by`).
- add `.cancel_caller: Aid|None` which wraps `._cancel_called_by`.

In terms of using `Aid` in cancel methods,
- rename vars with `_aid` suffix in `.cancel()` (and wherever else).
- change `.cancel_rpc_tasks()` input param to `req_aid: msgtypes.Aid`.
- do the same for `._cancel_task()` and (for now until we adjust its
  internals as well) use the `Aid.uid` remap property when assigning
  `Context._canceller`.
- adjust all log msg refs to match obvi.
2025-08-15 16:08:53 -04:00
Tyler Goodlet 679d999185 Add flag to toggle private vars in `Channel.pformat()`
Call it `privates: bool` and only show certain internal instance vars
when set in the `repr()` output.
2025-08-15 16:07:39 -04:00
Tyler Goodlet a8428d7de3 Extend `.msg.types.Aid` method interface
Providing the legacy `.uid -> tuple` style id (since still used for the
`Actor._contexts` table) and a `repr-one-line` method `.reprol() -> str`
for rendering a compact unique actor ID summary (useful in
logging/.pformat()s at the least).
2025-08-15 16:07:39 -04:00
Tyler Goodlet e9f2fecd66 Fix `nest_from_op()` call sigs, already changed upstream
In `._runtime/_root`, since the latest fn-signature changes already
landed onto the main branch via the 65b7956: #384-patch.
2025-07-18 00:35:35 -04:00
Tyler Goodlet 547cf5a210 Drop stale comment from inter-peer suite 2025-07-18 00:35:35 -04:00
Tyler Goodlet b5e3fa7370 Use `nest_from_op()` in some runtime logs for actor-state-repring 2025-07-18 00:35:35 -04:00
Bd cd16748598
Merge pull request #387 from goodboy/the_finally_footgun
Coping with "`finally` footguns": avoiding `trio.Cancelled` exc masking as best we can..
2025-07-17 22:33:33 -04:00
Tyler Goodlet 1af35f8170 Add back loose-tn in `gather_contexts()`, mk tests green 2025-07-16 18:18:34 -04:00
Tyler Goodlet 4569d11052 Move `.is_multi_cancelled()` to `.trioniics._beg`
Since it's for beg filtering, the current impl should be renamed anyway;
it's not just for filtering cancelled excs.

Deats,
- added a real doc string, links to official eg docs and fixed the
  return typing.
- adjust all internal imports to match.
2025-07-16 15:49:18 -04:00
Tyler Goodlet 6ba76ab700 .trionics: link in `finally`-footgun `trio` GH ish 2025-07-15 07:23:21 -04:00
Tyler Goodlet 734dda35e9 Hide `._rpc._errors_relayed_via_ipc()` frame by def 2025-07-15 07:23:21 -04:00
Tyler Goodlet b7e04525cc Always `Cancelled`-unmask ctx endpoint excs
To resolve the recently added and failing
`test_remote_exc_relay::test_unmasked_remote_exc`: never allow
`trio.Cancelled` to mask an underlying user-code exception, ever.

Our first real-world (runtime internal) use case for the new
`.trionics.maybe_raise_from_masking_exc()` such that the failing
test now passes with a properly relayed remote RTE unmasking B)

Details,
- flip the `Context._scope_nursery` to the default strict-eg behaviour
  and instead stack its outer scope with a `.trionics.collapse_eg()`.
- wrap the inner-most scope (after `msgops.maybe_limit_plds()`) with
  a `maybe_raise_from_masking_exc()` to ensure user-code errors are
  never masked by `trio.Cancelled`s.

Some err-reporting refinement,
- always capture any `scope_err` from the entire block for debug
  purposes; report it in the `finally` block's log.
- always capture any suppressed `maybe_re`, output from
  `ctx.maybe_raise()`, and `log.cancel()` report it.
2025-07-15 07:23:21 -04:00
Tyler Goodlet 35977dcebb Adjust ep-masking-suite for the real-use-case
Namely that the more common-and-pertinent case is when
a `@context`-ep-fn contains the `finally`-footgun but without
a surrounding embedded `tn` (which currently still requires its own
scope embedded `trionics.maybe_raise_from_masking_exc()`) which can't
be compensated-for by `._rpc._invoke()` easily. Instead the test is
composed where the `._invoke()`-internal `tn` is the machinery being
addressed in terms of masking user-code excs with `trio.Cancelled`.

Deats,
- rename the test -> `test_unmasked_remote_exc` to reflect what the
  runtime should actually be addressing/solving.
- drop the embedded `tn` from `sleep_n_chkpt_in_finally()` (for now)
  since that case can't currently easily be addressed without the user
  code using its own `trionics.maybe_raise_from_masking_exc()` inside
  the nursery scope.
- as such drop all `tn` related params/logic/usage from the ep.
- add in a `Cancelled` handler block which checks for RTE masking and
  always prints the occurrence loudly.

Follow up,
- obvi this suite will currently fail until the appropriate adjustment
  is made to `._rpc._invoke()` to do the unmasking; coming next.
- we probably still need a case with an embedded user `tn` where if
  the default strict-eg mode is used then a ctxc from the parent might
  cause a non-graceful `Context.cancel()` outcome?
 |_since the embedded user-`tn` will raise
   `ExceptionGroup[trio.Cancelled]` upward despite the parent nursery's
   scope being the canceller, or will a `collapse_eg()` inside the
   `._invoke()` scope handle this as well?
2025-07-15 07:23:21 -04:00
Tyler Goodlet e1f26f9611 Extend `._taskc.maybe_raise_from_masking_exc()`
To handle captured non-egs (when the now optional `tn` isn't provided)
as well as yield up a `BoxedMaybeException` which contains any detected
and un-masked `exc_ctx` as its `.value`.

Also add some additional tooling,
- a `raise_unmasked: bool` toggle for when the caller just wants to
  report the masked exc and not raise-it-in-place of the masker.
- `extra_note: str` which by default is tuned to the default
  `unmask_from = (trio.Cancelled,)` but which can be used to deliver
  custom exception msg content.
- `always_warn_on: tuple[BaseException]` which will always emit
  a warning log of what would have been the raised-in-place-of
  `ctx_exc`'s msg for special cases where you want to report
  a masking case that might not be otherwise noticed by the runtime
  (cough, like a `Cancelled` masking another `Cancelled`) but which
  you'd still like to warn the caller about.
- factor out the masked-`exc_ctx` predicate logic into
  a `find_masked_excs()` and also use it for non-eg cases.

Still maybe todo?
- rewrapping multiple masked sub-excs in an eg back into an eg? left in
  #TODOs and a pause-point where applicable.
2025-07-15 07:23:21 -04:00
Tyler Goodlet 63c5b7696a Mv `maybe_raise_from_masking_exc()` to `.trionics`
Factor the `@acm`-closure it out of the
`test_trioisms::test_acm_embedded_nursery_propagates_enter_err` suite
for real use internally.
2025-07-15 07:23:21 -04:00
Tyler Goodlet 5f94f52226 Add ctx-ep suite for `trio`'s *finally-footgun*
Deats are documented within, but basically a subtlety we already track
with `trio`, its masking of excs by a checkpoint-in-`finally`, can cause
compounded issues with our `@context` endpoints, mostly in terms of
remote error and cancel-ack relay semantics.
2025-07-15 07:23:21 -04:00
Bd 6bf571a124
Merge pull request #385 from goodboy/repl_fixture
Add a `repl_fixture` for debugger-interaction-hooks and a newly re-factored subsys pkg: `tractor.devx.debug`
2025-07-15 07:17:49 -04:00
Tyler Goodlet f5056cdd02 Mk `test_crash_handler_cms` suite go green
Turns out there were some subtle internal bugs discovered by the just
added `tests/devx/test_tooling::test_crash_handler_cms` suite.

So this fixes,

- a mis-ordering around `rt_repl_fixture :=` in the logic of
  `DebugStatus.maybe_enter_repl_fixture()`.

- `.devx.debug._post_mortem._post_mortem()` ensuring we now **always**
  call `DebugStatus.release()`, and thus unwind the exit-stack managing
  the `repl_fixture` exit/teardown, **even in the case** where
  `yield False` is delivered from the user-fixture-fn (meaning
  `enter_repl=False`) thus triggering an early `return` (as is done in
  the new test suite).
2025-07-14 18:07:50 -04:00
Tyler Goodlet 9ff448faa3 Add `open_crash_handler()` / `repl_fixture` suite
Nicely nailing 2 birds by leveraging the new `repl_fixture` support to
actually avoid use of a `pexpect`-style test B)

Functionality audit summary,
- ensures `open_crash_handler() as bxerr:` adheres to,
  - `raise_on_exit` semantics including only raising from a list of exc-types,
  - `repl_fixture` setup/teardown invocation and that `yield False` blocks REPL
    interaction,
  - delivering a `BoxedMaybeException` with the correct state set post
    crash.
  - all the above without the actor-runtime existing.

Also luckily enough, this seems to have found a bug for which a fix is
coming right up!
2025-07-14 17:55:18 -04:00
Tyler Goodlet 760b9890c4 Add `debugging/subactor_bp_in_ctx.py` test set
It's been in the debug scripts quite a while without a wrapping test and
will be,
- only the 2nd such REPL test which uses a lower-level `@context` ep-API
- the first official and explicit use of `enable_transports=['uds']`
  in a suite.

Deats,
- flip to 'uds' tpt and 'devx' level logging in the script.
- add a new 2-case suite `test_ctxep_pauses_n_maybe_ipc_breaks` which
  validates both the quit-early (via `BdbQuit`) and
  channel-dropped-need-to-ctlc cases from a single test fn.
2025-07-14 13:15:07 -04:00
Tyler Goodlet d000642462 Report `enable_stack_on_sig` on `stackscope` import failure 2025-07-14 13:15:07 -04:00
Tyler Goodlet dd69948744 Reapply `.devx.debug` mod-name change to ipc-server lost during rebase.. 2025-07-14 00:00:13 -04:00
Tyler Goodlet 5b69975f81 Drop "\n" from tail of `BoxedMaybeException.pformat()` 2025-07-14 00:00:13 -04:00
Tyler Goodlet 6b474743f9 Drop `.to_asyncio`s usage-of-`greenback`-reporting to `log.devx()` 2025-07-14 00:00:13 -04:00
Tyler Goodlet 5ac229244a Disable `greenback` sync fn breakpointing by def
Opting for performance over broad multi-actor "debug-ability" from
sync-function-contexts when `debug_mode=True` is set;
IOW prefer no behind-the-scenes `greenlet` perf impact over being
able to use an actor-safe `breakpoint()` wherever as per,
https://greenback.readthedocs.io/en/latest/principle.html#performance

Adjust the breakpoint restore ex script to match.
2025-07-14 00:00:13 -04:00
Tyler Goodlet bbd2ea3e4f Prevent `test_breakpoint_hook_restored` subproc hangs
If the underlying example script fails (say due to a console output
pattern-mismatch, `AssertionError`) the `pexpect` managed subproc with
a `debug_mode=True` crash-handling-REPL engaged will ofc *not terminate*
due to any SIGINT sent by the test harness (since we shield from it as
part of normal sub-actor debugger operation). So instead always send
a 'continue' cmd to the active `PdbREPL`'s stdin so it deactivates and
allows the py-script-process to raise and terminate, unblocking the
`pexpect.spawn`'s internal subproc joiner (which would otherwise hang
without manual intervention, blocking downstream tests..).

Also, use the new `PexpectSpawner` type alias after actually importing
future annots.. XD
2025-07-14 00:00:13 -04:00
Tyler Goodlet 6b903f7746 Type alias our `pexpect.spawn()` closure fixture
Such that we can more easily annotate any consumer tests of our
`.tests.devx.conftest.spawn()` fixture which delivers a closure which, when
called in a test fn body, transitively sub-invokes:
`pytest.Pytester.spawn()` -> `pexpect.spawn()`

IMO Expecting `Callable[[str], pexpect.pty_spawn.spawn]]` to be used all
over is a bit too.. verbose?
2025-07-14 00:00:13 -04:00
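What the alias boils down to (per the `PexpectSpawner` name mentioned in the neighbouring commit; the exact definition in `tests/devx/conftest.py` is assumed):

```python
from typing import Callable, TypeAlias

import pexpect

# a fixture-provided closure which, given a test script name/str,
# transitively invokes `pytest.Pytester.spawn()` -> `pexpect.spawn()`
PexpectSpawner: TypeAlias = Callable[[str], pexpect.pty_spawn.spawn]

def test_something(spawn: PexpectSpawner) -> None:
    child = spawn('example_script')
    child.expect('some pattern')
```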
Tyler Goodlet 2280bad135 Type annot the `testdir` fixture 2025-07-14 00:00:13 -04:00
Tyler Goodlet 8d506796ec Re-impl as `DebugStatus.maybe_enter_repl_fixture()`
Dropping the `_maybe_open_repl_fixture()` approach and instead using
a `DebugStatus._fixture_stack = ExitStack()` which provides for much
simpler support around both sync and async pausing APIs thanks to only
invoking `repl_fixture.__exit__()` on actual `PdbREPL` interaction being
complete!

Deats,
- all `repl_fixture` detection logic still happens in one place (the new
  method) but we aren't limited to closing it via an immediate post REPL
  `.__exit__()` call which instead is triggered by,
- `DebugStatus.release()` which now calls `._fixture_stack.close()` and
  thus only invokes `repl_fixture.__exit__()` when user REPL-ing is
  **actually complete** an arbitrary amount of debugging time later.
- include the notes for `@acm` support above the new method, though not
  sure if they're as relevant any more?

Benefits,
- we can drop the previously added indent levels from
  `_enter_repl_sync()` and `_post_mortem()`.
- now we automatically have support for the `.pause_from_sync()` API
  since `_enter_repl_sync()` doesn't close the prior
  `_maybe_open_repl_fixture()` immediately when `debug_func=None`; the
  user's `__exit__()` is only ever called once `.release()` is.

Other,
- add big 'CASE' comments around the various blocks in
  `.pause_from_sync()`, i was having trouble figuring out which i was
  using from a `breakpoint()` in a dependent app..
2025-07-14 00:00:12 -04:00
Tyler Goodlet 02d03ce700 Always pass `repl: PdbREPL` as first param to fixture 2025-07-14 00:00:12 -04:00
Tyler Goodlet 9786e2c404 Adjust restore-bp-ex import path to `.devx.debug`
Reversion of original cherry-pick fix from downstream history;
`.devx.debug` is now legit here.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 116137d066 Reorg `.devx.debug` into sub-mods!
Which cleans out the pkg-mod to just the expected exports with (its
longstanding todo comment list) and thus a separation-of-concerns
and smaller mod-file sizes via the following new sub-mods:
- `._trace` for the `.pause()`/`breakpoint()`/`pdb.set_trace()`-style
  APIs including all sync-caller variants.
- `._post_mortem` to contain our async `.post_mortem()` and all other
  public crash handling APIs for use from sync callers.
- `._sync` for the high-level syncing helper-routines used throughout the
  runtime to avoid multi-proc TTY use collisions.

And also,
- remove `hide_runtime_frames()` since moved to `.devx._frame_stack`.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 7f87b4e717 Mv `.hide_runtime_frames()` -> `.devx._frame_stack`
A much more relevant module for a call-stack-frame hider ;)
2025-07-14 00:00:12 -04:00
Tyler Goodlet bb17d39c4e Cherry-pick conflict resolution
Orig commit was,
"9c0de24 Be explicit with `SpawnSpec` processing in subs"

The commit was picked onto an upstream branch but at that time there was
no `.devx.debug` subpkg yet, hence this revert to the original patch's
module path.
2025-07-14 00:00:12 -04:00
Tyler Goodlet fba6edfe9a Enable new `tractor.devx.debug._tty_lock` in the root 2025-07-14 00:00:12 -04:00
Tyler Goodlet e4758550f7 Start splitting into `devx.debug.` sub-mods
From what was originally the `.devx._debug` monolith module, since that
file was way out of ctl in terms of LoC!

New modules so far include,
- ._repl: our `pdb[p]` ext type/lowlevel-APIs and `mk_pdb()` factory.
- ._sigint: just our REPL-interaction shield-handler.
- ._tty_lock: containing all the root-actor TTY mutex machinery
  including the `Lock`/`DebugStatus` primitives/APIs as well as the
  inter-tree IPC context eps:
  * the server-side `lock_stdio_for_peer()` which pairs with the,
  * client-(subactor)-side `request_root_stdio_lock()` via the,
  * pld-msg-spec of `LockStatus/LockRelease`.
  AND the `any_connected_locker_child()` predicate.
2025-07-14 00:00:12 -04:00
Tyler Goodlet a7efbfdbc2 Add `_maybe_open_repl_fixture()`
Factoring the (basically duplicate) content from both use spots into
a common `@cm` which delivers a `bool` signalling whether the REPL
should be engaged. Fixes a lingering bug with `nullcontext()` calling
btw..
2025-07-14 00:00:12 -04:00
Tyler Goodlet 1c6660c497 Mk `.devx._debug` a sub-pkg `.devx.debug`
With plans for much factoring of the original module into sub-mods!
Adjust all imports and refs throughout to match.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 202befa360 Add exc suppression to `open_crash_handler()`
By supporting a new optional param to `open_crash_handler()`,
`raise_on_exit: bool|Sequence[Type[BaseException]] = True` which
determines whether, after the REPL interaction completes, the handled
exception is raised upward. This is **very** handy for writing bits of
"debug-able but resilient code" as is the case in (many) dependent
projects/apps.

Impl,
- `raise_on_exit` can be a `bool` or (set) sequence of types which will
  always be raised.
- also add a `BoxedMaybeException.raise_on_exit` equiv which (for now)
  we check matches (in case down the road we want to offer dynamic ctls).
- rename both crash-handler cm's `tb_hide` -> `hide_tb`.
2025-07-14 00:00:12 -04:00
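A hedged usage sketch of the new param (import path assumed; note the REPL will engage at the crash before the cm exits):

```python
from tractor.devx.debug import open_crash_handler  # assumed path

with open_crash_handler(
    # suppress re-raise after the REPL session; alternatively pass
    # a sequence of exc-types which should still be raised on exit.
    raise_on_exit=False,
) as bxerr:  # a `BoxedMaybeException`
    1 / 0

# post-hoc introspection of what was crash-handled
assert isinstance(bxerr.value, ZeroDivisionError)
```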
Tyler Goodlet c24708b273 Add initial `repl_fixture` support B)
It turns out to be fairly useful to allow hooking into a given actor's
entry-and-exit around `.devx._debug._pause/._post_mortem()` calls which
engage the `pdbp.Pdb` REPL (really our `._debug.PdbREPL` but yeah).

Some very handy use cases include,
- swapping out-of-band (config) state that may otherwise halt the
  user's app since the actor normally handles kb&mouse input, in thread,
  which means that the handler will be blocked while the REPL is in use.
- (remotely) reporting actor-runtime state for monitoring purposes
  around crashes or pauses in normal operation.
- allowing for crash-handling to be hard-disabled via
  `._state._runtime_vars` say for when you never want a debugger to be
  entered in a production instance where you're not-sure-if/don't-want
  per-actor `debug_mode: bool` settings to always be unset, say bc
  you're still debugging some edge cases that ow you'd normally want to
  REPL up.

Impl details,
- add a new optional `._state._runtime_vars['repl_fixture']` field which
  for now can be manually set; i saw no reason for a formal API yet
  since we want to convert the `dict` to a struct anyway (first).
- augment both `.devx._debug._pause()/._post_mortem()` with a new
  optional `repl_fixture: AbstractContextManager[bool]` kwarg which
  when provided is `with repl_fixture()` opened around the lowlevel
  REPL interaction calls; if the enter-result, an expected `bool`, is
  `False` then the interaction is hard-bypassed.
  * for the `._pause()` case the `@cm` is opened around the entire body
    of the embedded `_enter_repl_sync()` closure (for now) though
    ideally longer term this entire routine is factored to be a lot less
    "nested" Bp
  * in `_post_mortem()` the entire previous body is wrapped similarly
    and also now accepts an optional `boxed_maybe_exc: BoxedMaybeException`
    only passed in the `open_crash_handler()` caller case.
- when the new runtime-var is overridden, (only manually atm) it is used
  instead but only whenever the above `repl_fixture` kwarg is left null.
- add a `BoxedMaybeException.pformat() = __repr__()` which when
  a `.value: Exception` is set renders a more "objecty" repr of the exc.

Obviously tests for all this should be coming soon!
2025-07-14 00:00:12 -04:00
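A hedged sketch of what a user-provided `repl_fixture` can look like given the description above (the exact call signature, in particular the later-added leading `repl` param, is assumed):

```python
from contextlib import contextmanager

@contextmanager
def my_repl_fixture(repl, boxed_maybe_exc=None):
    # setup: eg. swap out-of-band (config) state, pause an app-side
    # input handler, emit a monitoring event, etc.
    print(f'entering REPL: {repl!r}')
    try:
        # the enter-result gates REPL use; `False` hard-bypasses it
        yield True
    finally:
        # teardown only runs once the user is done REPL-ing
        print('REPL session done')

# registration (for now): either pass `repl_fixture=my_repl_fixture` to
# the pause/post-mortem APIs or set the runtime-var manually, eg.
#
#   tractor._state._runtime_vars['repl_fixture'] = my_repl_fixture
```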
Tyler Goodlet 3aee702733 Add a `debug_mode`-state reversion test 2025-07-14 00:00:12 -04:00
Tyler Goodlet a573c3c9a8 Unset debug-mode on root actor exit
Discovered this bug while testing `modden`'s daemon under various
cancelled-while-booting race conditions where sequential tests would
fail a lingering `assert 0` inside `.to_asyncio.run_as_asyncio_guest()`
to (oddly) catch redundant greenback-re-inits..

XD

Needs a test likely ;P
2025-07-14 00:00:12 -04:00
Tyler Goodlet 6a352fee87 Expose `.trionics.maybe_collapse_eg` 2025-07-14 00:00:12 -04:00
Tyler Goodlet 6cb361352c Use `.is_debug_mode()` for maybe-crash-handling
Such that the default is `None` and in the case where the caller *does
not* set the `pdb` arg to an explicit `bool` we instead determine it via
the output from `._state.is_debug_mode()` allowing for more "nonchalant"
usage throughout a (test) code base which passes the `debug_mode: bool`
as runtime config; allows delegation to the per-actor proc-global state.
2025-07-14 00:00:12 -04:00
Tyler Goodlet 7807ffaabe Add todo for `dulwhich` as dep 2025-07-14 00:00:12 -04:00
Bd 65b795612c
Merge pull request #384 from goodboy/sclang_pformating
A "sclang" pformat-ing, `.msg.pretty_struct.pformat` and `modden.repr` utils bonanza
2025-07-13 23:59:44 -04:00
Tyler Goodlet a42c1761a8 Refactor `pretty_struct.pformat()` rendering
Adding an underlying `iter_struct_ppfmt_lines()` (which can also be
called directly) to iter-render field-lines-pairs; it's now called from
the top level `.pformat()`. The use case is to allow more granular
control for rendering runtime primitive (like `Actor.pformat()`) reprs
depending on log-level/config, oh and using `textwrap` for indenting.
2025-07-13 23:33:47 -04:00
Tyler Goodlet 359d732633 Fix ref-err on `logger` input to `get_console_log()`
Particularly on a get-attr of `StackLevelAdapter.handlers` which, when
a `logger: StackLevelAdapter` is passed, we need to *not call* our own
`get_logger()` and just set it as the `log`. Fix the typing to match.
2025-07-13 23:33:47 -04:00
Tyler Goodlet b09e35f3dc Mv in `modden.repr` content: some `reprlib`-utils
Since I'd like to use some `reprlib` formatting which `modden` already
implemented (and it's a main dependee project), figured I'd just bring
it all into `.devx.pformat` for now.
2025-07-13 23:33:47 -04:00
Tyler Goodlet 6618b004f4 Adjust `nest_from_op()` usage to match new fn-sig 2025-07-13 23:33:47 -04:00
Tyler Goodlet fc57a4d639 Formally add a `nest_from_op()` for "sclang"-fmting
Moving it from where i (oddly) first wrote it up in `._entry` to a more
proper place with its pals in `.devx.pformat` ;p

Iface summary, what caller provides:
- `input_op`: a "sclang" chars-symbol to represent the conc "operation",
- `text`: the "entity" being *operated on*,
- `nest_prefix/indent`: what the ^ will be prefixed with.
- `prefix_op: bool` so that, when unset, the `input_op` is instead
   used as a suffix to the first line of `text`.
- `next_indent: int|None = None` such that when set (and not null) we
   use that exact ws-indent instead of calculating it from the
   `len(nest_prefix)` allowing for specifying a `0`-indent easily.
  - includes logic where we either get a non-zero value and
    apply it strictly to both the `nest_prefix` and `text`, OR we
    auto-calc it from any `nest_prefix`, NOT a conflation of both..
- `op_suffix: str = '\n'` for instead of assuming `f'{input_op}\n'`.
- `rm_from_first_ln: str` which allows removing chars from the
  first line of `text` after a `str.strip()` (handy for removing
  the '<Channel' first chevron from type-reprs).

There's also a huge comment-doc for "sclang" into the fn body which is
the terrible "primer" on this whole idea for the moment XD
2025-07-13 23:27:03 -04:00
Bd 2248ffb74f
Merge pull request #380 from goodboy/multi_ipc_testing
Multi ipc testing
2025-07-13 19:19:50 -04:00
Tyler Goodlet 1eb0d785a8 Try out separate readme section for infra badges
Bc why clutter the intro like everyone else.. instead put them just
above the install section.
2025-07-13 15:48:26 -04:00
Tyler Goodlet 98d0ca88e5 Flip a couple more debug scripts to UDS tpt
For now just as a sanity check that we're not breaking anything on that
transport backend (since just a little while back there were issues with
crash handling in subs..) when it comes to crash-REPLing.
2025-07-13 15:26:37 -04:00
Tyler Goodlet 37f843a128 Add an `enable_transports` test-suite
Like it sounds, verifying that when that param is passed to the runtime
startup eps (`.open_root_actor()/.open_nursery()`), the appropriate
tpt-protocol is deployed for IPC (both the server and bound endpoints)
in both the root and any sub-actors (as passed down from rent to child
via the `.msg.types.SpawnSpec`).
2025-07-13 15:26:37 -04:00
Tyler Goodlet 29cd2ddbac Drop 'IPC' prefix from `._server` types
We already have the `.ipc` sub-pkg name so it seems a bit
redundant/noisy for a namespace path Bp

Leave an alias for the `Server` rn since it's already used in a few
other internal mods.. will likely rename later if everyone is cool with
it..
2025-07-13 15:26:37 -04:00
Tyler Goodlet 295b06511b Plugin-ize some re-usable `conftest` parts
Namely any CLI driven runtime-config fixtures such as,

- `--spawn-backend` and `start_method`,
- `--tpdb` and `debug_mode`,
- `--tpt-proto` and `tpt_protos`/`tpt_proto`,
- `reg_addr` as driven by the above.

This moves all fixtures and necessary hook funcs (CLI parsing,
configuring and test-gen) to the `._testing.pytest` module and thus
allows any dependent project to leverage these fixtures in their own
test suites after pointing to that plugin mod using,

```python
    # conftest.py
    pytest_plugins: tuple[str] = (
        "tractor._testing.pytest",
    )
```

Also, add a new `._testing.addr` helper mod which now contains
a factored `get_rando_addr()` helper for creating test-sesh unique
tpt-specific registry (or other) IPC endpoint addrs.
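
and then a dependent-project test just requests the shared fixtures as
usual, something like (illustrative only, fixture names per the list
above):

```python
# some_dependent_project/tests/test_foo.py
def test_foo(
    start_method: str,   # from `--spawn-backend`
    debug_mode: bool,    # from `--tpdb`
    tpt_proto: str,      # from `--tpt-proto`
    reg_addr: tuple,     # tpt-specific registry addr
):
    ...
```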
2025-07-13 15:26:37 -04:00
Tyler Goodlet 1e6b5b3f0a Start a very basic ipc-server unit test suite
For now it just boots a server, parametrized over all tpt-protos, sin
any actor runtime bootup. Obvi the future todo is ensuring it all works
with a client connecting via the equivalent lowlevel
`.ipc._chan._connect_chan()` API(s).
2025-07-13 15:26:37 -04:00
Tyler Goodlet 36ddb85197 Fix assert on `.devx.maybe_open_crash_handler()` delivered `bxerr` 2025-07-13 15:26:37 -04:00
Tyler Goodlet d6b0ddecd7 Improve bit of tooling for `test_resource_cache.py`
Namely, while what I was actually trying to solve was why
`TransportClosed` was getting raised from `Portal.cancel_actor()`, it's
still useful edge-case auditing either way. Also opts into the
`debug_mode` fixture with appropriate timeout adjustment B)
2025-07-13 15:26:37 -04:00
Tyler Goodlet 9e5475391c Set `_state._def_tpt_proto` in `tpt_proto` fixture
Such that the global test-session always (and only) runs against the CLI
specified `--tpt-proto=` transport protocol.
2025-07-13 15:26:37 -04:00
Tyler Goodlet ef7ed7ac6f Handle unconsidered fault-edge cases for UDS
In `tests/test_advanced_faults.py` that is.
Since, instead of the zero-responses we'd expect from a network-socket,
we can actually get a few different behaviours from the OS when
"everything IPC is known"

XD

Namely it's about underlying `trio` exceptions versus how we wrap them
and how we expect to box them. A `TransportClosed` boxing improvement
is coming in follow up btw to make this all work!

B)
2025-07-13 15:26:37 -04:00
Tyler Goodlet d8094f4420 Woops, ensure we use `global` before setting `daemon()` fixture spawn delay.. 2025-07-13 15:26:37 -04:00
Tyler Goodlet d7b12735a8 Support multiple IPC transports in test harness!
Via a new accumulative `--tpt-proto` arg you can select which
`tpt_protos: list[str]`-fixture protocol keys will be delivered to
opting in tests!

B)

Also includes,
- CLI quote handling/stripping.
- default of 'tcp'.
- only support one selection per session at the moment (until we figure
  out how we want to support multiples, either simultaneously or
  sequentially).
- draft a (masked) dynamic-`metafunc` parametrization in the
  `pytest_generate_tests()` hook.
- first proven and working use in the `test_advanced_faults`-suite (and
  thus its underlying
  `examples/advanced_faults/ipc_failure_during_stream.py` script)!
 |_ actually needed this to prove that the suite only has 2 failures on
    'uds' seemingly due to low-level `trio` error semantics translation
    differences to do with calling `socket.close()`..

On a very nearly related topic,
- draft an (also commented out) `set_script_runtime_args()` fixture idea
  for a std way of `partial`-ling in runtime args to `examples/`
  scripts-as-modules defining a `main()` which would proxy to
  `tractor.open_nursery()`.
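
The CLI-accumulation bit boils down to something like the following
(a trimmed sketch, the `dest`/fixture wiring here is assumed, not the
exact plugin code):

```python
import pytest

def pytest_addoption(parser: pytest.Parser):
    parser.addoption(
        '--tpt-proto',
        action='append',   # accumulative; may be passed multiple times
        dest='tpt_protos',
        default=[],
        help='transport protocol key(s) to run the suite against',
    )

@pytest.fixture(scope='session')
def tpt_protos(request) -> list[str]:
    # strip any CLI quoting and fall back to the 'tcp' default
    return [
        proto.strip('"\'')
        for proto in request.config.option.tpt_protos
    ] or ['tcp']
```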
2025-07-13 15:26:37 -04:00
Tyler Goodlet 47107e44ed Start prototyping multi-transport testing
Such that we can run (opting-in) tests on both TCP and UDS backends and
ensure the `reg_addr` fixture and various timeouts are adjusted
accordingly.

Impl deats,
- add a new `tpt_proto` CLI option and fixture to allow choosing which
  "transport protocol" will be used in the test suites (either globally
  or contextually).
- rm `_reg_addr` instead opting for a `_rando_port` which will only be
  used for `reg_addr`s which are net-tpt-protos.
- rejig `reg_addr` fixture to set an ideally session-unique `testrun_reg_addr`
  based on the `tpt_proto` setting making appropriate calls to `._addr`
  APIs as needed.
- refine `daemon` fixture a bit with typing, `tpt_proto` timings, and
  stderr capture.
- in `test_discovery` do a ton of type-annots, add `debug_mode` fixture
  opt ins, augment `spawn_and_check_registry()` with `psutil.Process`
  passing for introspection (when things go wrong..).
2025-07-13 15:26:37 -04:00
Bd ba384ca83d
Merge pull request #375 from goodboy/structural_dynamics_of_flow
The Structural Dynamics of Flow: finally a formalized modular transport layer -> `tractor.ipc` 😎
2025-07-13 15:11:00 -04:00
Tyler Goodlet ad9833a73a Update actions badge links in readme 2025-07-13 14:56:57 -04:00
Tyler Goodlet 161884fbf1 Adjust back `.devx._debug` import
Bc this history is pre `.devx.debug` subpkg creation..
2025-07-13 13:56:37 -04:00
Tyler Goodlet c2e7dc7407 Avoid silent `stackscope`-test fail due to dep
Oddly my env was borked bc a missing sub-dep (`typing-extensions`
apparently not added by `uv` for `stackscope`?) and then `stackscope`
was silently failing import and caused the shield-pause test to also
fail (since it couldn't match the expected `log.devx()` on console). The
import failure is not very explanatory due to the `log.warning()`;
change it to `.error()` level.

Also, explicitly import `_sync_pause_from_builtin` in
`examples/debugging/restore_builtin_breakpoint.py` to ensure the ref is
exported properly from `.devx.debug` (which it wasn't during dev of the
prior commit Bp).
2025-07-13 13:45:15 -04:00
Tyler Goodlet 309360daa2 Add latest `typing-extension`, needed by `stackscope` 2025-07-13 13:43:48 -04:00
Tyler Goodlet cbfb0d0144 Don't use `uv sync --locked` for now 2025-07-13 13:26:22 -04:00
Tyler Goodlet c0eef3bac3 Bump GH CI to use `uv` throughout!
See docs: https://docs.astral.sh/uv/guides/integration/github/

Summary,
- drop `mypy` job for now since I'd like to move to trying `ty`.
- convert sdist built to `uv build`
- just run test suite on py3.13 for now, not sure if 3.12 will break due
  to the eg stuff or not?
2025-07-13 13:23:37 -04:00
Tyler Goodlet 27e6ad18ee Mk `.ipc._tcp.TCPAddress` validate with `ipaddress`
Both via a post-init method to validate the original input `._host: str`
and in `.is_valid` to ensure the host-part isn't something esoteric..
2025-07-10 18:12:50 -04:00
Tyler Goodlet 28e32b8f85 Use `enable_transports: list[str]` parameter
Actually apply the input in the root as well as all sub-actors by
passing it down to sub-actors through runtime-vars as delivered by the
initial `SpawnSpec` msg during child runtime init.

Impl deats,
- add a new `_state._runtime_vars['_enable_tpts']: list[str]` field set
  by the input param (if provided) to `.open_root_actor()`.
- mk `current_ipc_protos()` return the runtime-var entry, with the
  default in the `_runtime_vars: dict` now set to `[_def_tpt_proto]`.
- in `.open_root_actor()`, still error on this being a >1 `list[str]`
  until we have more testing infra/suites to audit multi-protos per
  actor.
- return the new value (as 3rd element) from `Actor._from_parent()` as
  per the todo note; means `_runtime.async_main()` will allocate
  `accept_addrs` as tpt-specific `Address` entries and pass them to
  `IPCServer.listen_on()`.

Also,
- also add a new `_state._runtime_vars['_root_addrs']: list = []` field
  with the intent of fully replacing the `'_root_mailbox'` field since,
  * it will need to be a collection to support multi-tpt,
  * it's a more cohesive field name alongside `_registry_addrs`,
  * the root actor of every tree needs to have a dedicated addr set
    (separate from any host-singleton registry actor) so that all its
    subs can contact it for capabilities mgmt including debugger
    access/locking.
- in the root, populate the field in `._runtime.async_main()` and for
  now just set '_root_mailbox' to the first entry in that list in
  anticipation of future multi-homing/transport support.
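
Usage-wise that means the whole tree can be flipped to a single
tpt-proto at startup, roughly (a rough usage sketch; remember >1 entry
still errors for now):

```python
import trio
import tractor

async def main():
    async with tractor.open_nursery(
        # flips `_runtime_vars['_enable_tpts']` for the root and
        # (via the `SpawnSpec`) every sub-actor.
        enable_transports=['uds'],
    ) as an:
        portal = await an.start_actor(
            'sub',
            enable_modules=[__name__],
        )
        await portal.cancel_actor()

if __name__ == '__main__':
    trio.run(main)
```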
2025-07-10 18:12:28 -04:00
Tyler Goodlet 05df634d62 Use `Channel.aid: Aid` throughout `.ipc._server` 2025-07-10 17:48:13 -04:00
Tyler Goodlet 6d2f4d108d Detail the docs on `Context._maybe_raise_remote_err()` 2025-07-10 17:48:02 -04:00
Tyler Goodlet ae2687b381 Bump lock file for new 3.13 wheels/schema
Buncha new AOTc lib whls, plus they added an `upload-time` field.
2025-07-10 17:47:53 -04:00
Tyler Goodlet a331f6dab3 Return `Path` from `_get_mod_abspath()` helper fn 2025-07-10 17:47:42 -04:00
Tyler Goodlet 9c0de24899 Be explicit with `SpawnSpec` processing in subs
As per the outstanding TODO just above the redic `setattr()` loop in
`Actor._from_parent()`!!

Instead of all that risk-ay monkeying, add detailed comment-sections
around each explicit assignment of each `SpawnSpec` field, including
those that were already being explicitly set.

Those and other deats,
- ONLY enable the `.devx._debug` (CHERRY-CONFLICT: later changed to
  `.debug._tty_lock`) module from `Actor.__init__()` in the root actor.
- ONLY enable the `.devx.debug._tty_lock` module from `Actor.__init__()`
  in the root actor.
- add a new `get_mod_nsps2fps()` to replace the loop in init and assign
  the initial `.enable_modules: dict[str, str]` from it.
- do `self.enable_modules.update(spawnspec.enable_modules)` instead of
  an overwrite and assert the table is by default empty in all
  subs.
2025-07-10 17:42:36 -04:00
Tyler Goodlet 1f3cef5ed6 Fix now invalid `Actor._peers` ref.. 2025-07-09 23:09:41 -04:00
Tyler Goodlet 8538a9c591 Decouple actor-state from low-level ipc-server
As much as is possible given we currently do some graceful
cancellation join-waiting on any connected sub-actors whenever an active
`local_nursery: ActorNursery` in the post-rpc teardown sequence of
`handle_stream_from_peer()` is detected. In such cases we try to allow
the higher level inter-actor (task) context(s) to fully cancelled-ack
before conducting IPC machinery shutdown.

The main immediate motivation for all this is to support unit testing
the `.ipc._server` APIs but in the future may be useful for anyone
wanting to use our modular IPC transport layer sin-"actors".

Impl deats,
- drop passing an `actor: Actor` ref from as many routines in
  `.ipc._server` as possible instead opting to use
  `._state.current_actor()` where abs needed; thus the fns dropping an
  `actor` input param are:
  - `open_ipc_server()`
  - `IPCServer.listen_on()`
  - `._serve_ipc_eps()`
  - `.handle_stream_from_peer()`
- factor the above mentioned graceful remote-cancel-ack waiting into
  a new `maybe_wait_on_canced_subs()` which is called from
  `handle_stream_from_peer()` and delivers a
  maybe-`local_nursery: ActorNursery` for downstream logic; it's this
  new fn which primarily still needs to call `current_actor()`.
- in `handle_stream_from_peer()` also use `current_actor()` to check if
  a handshake is needed (or if it was called as part of some
  actor-runtime-less operation like our unit test suite!).
- also don't pass an `actor` to `._rpc.process_messages()` see how-n-why
  below..

Surrounding ipc-server client/caller adjustments,
- `._rpc.process_messages()` no longer takes an `actor` input and
  now calls `current_actor()` instead.
- `._portal.open_portal()` is adjusted to ^.
- `._runtime.async_main()` is adjusted to the `.ipc._server`'s removal
  of `actor` ref passing.

Also,
- drop some server `log.info()`s to `.runtime()`
2025-07-08 18:05:05 -04:00
Tyler Goodlet 7533e93b0f Log listener bind status for TCP as for UDS 2025-07-08 18:05:05 -04:00
Tyler Goodlet f67b0639b8 Move peer-tracking attrs from `Actor` -> `IPCServer`
Namely transferring the `Actor` peer-`Channel` tracking attrs,
- `._peers` which maps the uids to client channels (with duplicates
  apparently..)
- the `._peer_connected: dict[tuple[str, str], trio.Event]` child-peer
  syncing table mostly used by parent actors to wait on sub's to connect
  back during spawn.
- the `._no_more_peers = trio.Event()` level triggered state signal.

Further we move over with some minor reworks,
- `.wait_for_peer()` verbatim (adjusting all dependants).
- factor the no-more-peers shielded wait branch-block out of
  the end of `async_main()` into 2 new server meths,
  * `.has_peers()` with optional chan-connected checking flag.
  * `.wait_for_no_more_peers()` which *just* does the
    maybe-shielded `._no_more_peers.wait()`
2025-07-08 18:05:05 -04:00
Tyler Goodlet 26fedec6a1 Mv `Actor._stream_handler()` to `.ipc._server` func
Call it `handle_stream_from_peer()` and bind in the `actor: Actor` via
a `handler=partial()` to `trio.serve_listeners()`.

With this (minus the `Actor._peers/._peer_connected/._no_more_peers`
attrs ofc) we get nearly full separation of IPC-connection-processing
(concerns) from `Actor` state. Thus it's a first look at modularizing
the low-level runtime into isolated subsystems which will hopefully
improve the entire code base's grok-ability and ease any new feature
design discussions especially pertaining to introducing and/or
composing-together any new transport protocols.
2025-07-08 18:05:05 -04:00
Tyler Goodlet 0711576678 Passthrough `_pause()` kwargs from `_maybe_enter_pm()` 2025-07-08 18:05:05 -04:00
Tyler Goodlet 0477a62ac3 Never hide non-[msgtype/tpt-closed] error tbs in `Channel.send()` 2025-07-08 18:05:05 -04:00
Tyler Goodlet 01d6f111f6 Use `current_ipc_protos()` as the `enable_transports`-default-when-`None`
Also ensure we assertion-error whenever the list is > 1 entry for now!
2025-07-08 18:05:05 -04:00
Tyler Goodlet 56ef4cba23 Add `_state.current_ipc_protos()`
For now just wrapping wtv the `._def_tpt_proto` per-actor setting is.
2025-07-08 18:05:05 -04:00
Tyler Goodlet 52b5efd78d Another `tn` eg-loosify inside `ActorNursery.cancel()`.. 2025-07-08 18:05:05 -04:00
Tyler Goodlet a7d4bcdfb9 Absorb `TransportClosed` in `Portal.cancel_actor()`
Just like we *were* doing for the `trio`-resource-errors it normally wraps
since we now also do the same wrapping in `MsgpackTransport.send()`
and we don't normally care to raise tpt-closure-errors on graceful actor
cancel requests.

Also, warn-report any non-tpt-closed low-level `trio` errors we haven't
yet re-wrapped (likely bc they haven't shown up).
2025-07-08 18:05:05 -04:00
Tyler Goodlet 79d0c17f6b Add `TransportClosed.from_src_exc()`
Such that re-wrapping/raising from a low-level `trio` resource error is
simpler and includes the `.src_exc` in the `__repr__()` and
`.message/.args` rendered at higher layers (like from `Channel` and
`._rpc` machinery).

Impl deats,
- mainly leverages packing in a new cls-method `.repr_src_exc() -> str:`
  repr of the underlying error before an optional `body: str` all as
  handled by the previously augmented `.pformat()`'s delegation to
  `pformat_exc()`.
- change `.src_exc` to be a property around a renamed `._src_exc`.

But wait, why?
- use it inside `MsgpackTransport.send()` to rewrap any
  `trio.BrokenResourceError`s so we always see the underlying
  `trio`-src-exc just like in the `.recv()._iter_packets()` handlers.
2025-07-08 18:05:05 -04:00
Tyler Goodlet 98c4614a36 Factor actor-embedded IPC-tpt-server to `ipc` subsys
Primarily moving the `Actor._serve_forever()`-task-as-method and
supporting actor-instance attributes to a new `.ipc._server` sub-mod
which now encapsulates,
- the coupling of various `trio.Nursery`s (and their independent lifetime mgmt)
  to different `trio.serve_listeners()` tasks and `SocketStream`
  handler scopes.
- `Address` and `SocketListener` mgmt and tracking through the idea of
  an "IPC endpoint": each "bound-and-active instance" of a served-listener
  for some (varied transport protocol's socket) address.
- start and shutdown of the entire server's lifetime via an `@acm`.
- delegation of starting/stopping tpt-protocol-specific `trio.abc.Listener`s
  to the corresponding `.ipc._<proto_key>` sub-module (newly defined
  mod-top-level instead of `Address` method) `start/close_listener()`
  funcs.

Impl details of the `.ipc._server` sub-sys,
- add new `IPCServer`, allocated with `open_ipc_server()`, and which
  encapsulates starting multiple-transport-proto-`trio.abc.Listener`s
  from an input set of `._addr.Address`s using,
  |_`IPCServer.listen_on()` which internally spawns tasks that delegate to a new
    `_serve_ipc_eps()`, a rework of what was (effectively)
    `Actor._serve_forever()` and which now,
    * allocates a new `IPCEndpoint`-struct (see below) for each
      address-listener pair alongside the specified
      listener-serving/stream-handling `trio.Nursery`s provided by the
      caller.
    * starts and stops each transport (socket's) listener by calling
      `IPCEndpoint.start/close_listener()` which in turn delegates to
      the underlying `inspect.getmodule(IPCEndpoint.addr)` backend tpt
      module's equivalent impl.
    * tracks all created endpoints in a `._endpoints: list[IPCEndpoint]`
      which is further exposed through public properties for
      introspection of served transport-protocols and their addresses.
  |_`IPCServer._[parent/stream_handler]_tn: Nursery`s which are either
     allocated (in which case, as the same instance) or provided by the
     caller of `open_ipc_server()` such that the same nursery-cancel-scope
     controls offered by `trio.serve_listeners(handler_nursery=)` are
     offered where the `._parent_tn` is used to spawn `_serve_ipc_eps()`
     tasks, and `._stream_handler_tn` is passed verbatim as `handler_nursery`.
- a new `IPCEndpoint`-struct (as mentioned) which wraps each
  transport-proto's address + listener + allocated-supervising-nursery
  to encapsulate the "lifetime of a server IPC endpoint" such that
  eventually we can track and managed per-protocol/address/`.listen_on()`-call
  scoped starts/stops/restarts for the purposes of filtering/banning
  peer traffic.
  |_ also included is an unused `.peer_tpts` table which we can
    hopefully use to replace `Actor._peers` in a `Channel`-tracking
    transport-proto-aware way!

Surrounding changes to `.ipc.*` primitives to match,
- make `[TCP|UDS]Address` types `msgspec.Struct(frozen=True)` and thus
  drop any-and-all `addr._host =` style mutation throughout.
  |_ as such also drop their `.__init__()` and `.__eq__()` meths.
  |_ UDS tweaks to field names and thus `.__repr__()`.
- move `[TCP|UDS]Address.[start/close]_listener()` meths to be mod-level
  equiv `start|close_listener()` funcs.
- just hard code the `.ipc._types._key_to_transport/._addr_to_transport`
  table entries instead of all the prior fancy dynamic class property
  reading stuff (remember, "explicit is better than implicit").

Modified in `._runtime.Actor` internals,
- drop the `._serve_forever()` and `.cancel_server()`, methods and
  `._server_down` waiting logic from `.cancel_soon()`
- add `.[_]ipc_server` which is opened just after the `._service_n` and
  delegates to it for any equivalent publicly exposed instance
  attributes/properties.
2025-07-08 18:05:05 -04:00
Tyler Goodlet 61df10b333 Move concrete `Address`es to each tpt module
That is moving from `._addr`,
- `TCPAddress` to `.ipc._tcp`
- `UDSAddress` to `.ipc._uds`

Obviously this requires adjusting a buncha stuff in `._addr` to avoid
import cycles (the original reason the module was not also included in
the new `.ipc` subpkg) including,

- avoiding "unnecessary" imports of `[Unwrapped]Address` in various modules.
  * since `Address` is a protocol and the main point is that it **does
    not need to be inherited** per
    (https://typing.python.org/en/latest/spec/protocol.html#terminology)
    thus I removed the need for it in both transport submods.
  * and `UnwrappedAddress` is a type alias for tuples.. so we don't
    really always need to be importing it since it also kinda obfuscates
    what the underlying pairs are.
- not exporting everything in submods at the `.ipc` top level and
  importing from specific submods by default.
- only importing various types under a `if typing.TYPE_CHECKING:` guard
  as needed.
2025-07-08 18:05:05 -04:00
Tyler Goodlet 094447787e Add API-modernize-todo on `experimental._pubsub.fan_out_to_ctxs` 2025-07-08 18:05:05 -04:00
Tyler Goodlet ba45c03e14 Skip the ringbuf test mod for now since data-gen is a bit "heavy/laggy" atm 2025-07-08 18:05:05 -04:00
Tyler Goodlet 00d8a2a099 Improve `TransportClosed.__repr__()`, add `src_exc`
By borrowing from the implementation of `RemoteActorError.pformat()`
which is now factored into a new `.devx.pformat_exc()` and re-used for
both error types while maintaining the same func-sig. Obviously delegate
`RemoteActorError.pformat()` to the new helper accordingly and keeping
the prior `body` generation from `.devx.pformat_boxed_tb()` as before.

The new helper allows for,
- passing any of a `header|message|body: str` which are all combined in
  that order in the final output.
- getting the `exc.message` as the default `message` part.
- generating an objecty-looking "type-name" header to be rendered by
  default when `header` is not overridden.
- "first-line-of `message`" processing which we split-off and then
  re-inject as a `f'<{type(exc).__name__}( {first} )>'` top line header.
- an optional `tail: str = '>'` to "close the object"-look only added
  when `with_type_header: bool = True`.

Adjustments to `TransportClosed` around this include,
- replacing the init `cause` arg for a `src_exc` which is now always
  assigned to a same named instance var.
- displaying that new `.src_exc` in the `body: str` arg to the
  `.devx.pformat.pformat_exc()` call so you can always see the
  underlying (normally `trio`) source error.
- just make it inherit from `Exception` not `trio.BrokenResourceError`
  to avoid handlers catching `TransportClosed` as the former
  particularly in testing when we sometimes want to distinguish them.
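
In pseudo-ish form the new helper amounts to (a paraphrase of the
behaviour listed above, not the actual `.devx.pformat` code):

```python
def pformat_exc(
    exc: Exception,
    header: str = '',
    message: str = '',
    body: str = '',
    with_type_header: bool = True,
) -> str:
    # default `message` part from the exc itself
    message = message or getattr(exc, 'message', None) or str(exc)

    if with_type_header and not header:
        # split off the first line of `message` and re-inject it
        # into an "objecty"-looking type-name header
        first, _, rest = message.partition('\n')
        header = f'<{type(exc).__name__}( {first} )>\n'
        message = rest

    # optional "close the object"-look tail
    tail = '>' if with_type_header else ''
    return header + message + body + tail
```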
2025-07-08 18:05:05 -04:00
Tyler Goodlet bedde076d9 Unwrap `UDSAddress` as `tuple[str, str]`, i.e. sin pid
Since in hindsight the real analog of a net-proto's "bindspace"
(normally its routing layer's addresses-port-set) is the "location in
the file-system" for a UDS socket file: the file's parent directory
determines whether or not the "port" (aka its file-name) collides with
any other.

So the `._filedir: Path` is like the allocated "address" and,
the `._filename: Path|str` is basically the "port",

at least in my mind.. Bp

Thinking about fs dirs like a "host address" means you can get
essentially the same benefits/behaviour of say an (ip)
addresses-port-space but using the (current process-namespace's)
filesys-tree. Note that for UDS sockets in particular the
network-namespace is what would normally isolate so called "abstract
sockets" (i.e. UDS sockets that do NOT use file-paths but instead set
`struct sockaddr_un.sun_path` to an abstract name starting with a null
byte, see `man 7 unix`); using directories is
even easier and definitely more explicit/readable/immediately-obvious as
a human-user.

As such this reworks all the necessary `UDSAddress` meths,
- `.unwrap()` now returns a `tuple(str(._filedir), str(._filename))`,
- `wrap_address()` now matches UDS on a 2nd tuple `str()` element,
- `.get_root()` no longer passes `maybe_pid`.

AND adjusts `MsgpackUDSStream` to,
- use the new `unwrap_sockpath()` on the `socket.get[sock/peer]name()`
  output before passing directly as `UDSAddress.__init__(filedir, filename)`
  instead of via `.from_addr()`.
- also pass `maybe_pid`s to init since no longer included in the
  unwrapped-type form.
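
i.e. roughly this shape (field names per the summary above,
frozen-struct and validation details elided):

```python
from pathlib import Path

class UDSAddress:
    def __init__(
        self,
        filedir: Path|str,
        filename: Path|str,
        maybe_pid: int|None = None,
    ):
        self._filedir = Path(filedir)      # the "bindspace"/"host" part
        self._filename = Path(filename)    # the "port" part
        self.maybe_pid = maybe_pid         # no longer on the wire

    def unwrap(self) -> tuple[str, str]:
        # the new sin-pid wire form
        return (str(self._filedir), str(self._filename))
```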
2025-07-08 18:05:05 -04:00
Tyler Goodlet be1d8bf6fa s/`._addr.preferred_transport`/`_state._def_tpt_proto`
Such that the "global-ish" setting (actor-local) is managed with the
others per actor-process and type it as a `Literal['tcp', 'uds']` of the
currently supported protocol keys.

Here obvi `_tpt` is some kinda shorthand for "transport" and `_proto` is
for "protocol" Bp

Change imports and refs in all dependent modules.

Oh right, and disable UDS in `wrap_address()` for the moment while
i figure out how to avoid the unwrapped type collision..
2025-07-08 18:05:04 -04:00
Tyler Goodlet d9aee98db2 Add `Arbiter.is_registry()` in prep for proper `.discovery._registry` 2025-07-08 18:05:04 -04:00
Tyler Goodlet 708ce4a051 Repair weird spawn test, start `test_root_runtime`
There was a very strange legacy test
`test_spawning.test_local_arbiter_subactor_global_state` which was
causing unforeseen hangs/errors on the UDS tpt and looking deeper this
test was already doing root-actor things that should never have been
valid XD

So rework that test to properly demonstrate something of value
(i guess..) and add a new suite which start more rigorously auditing our
`open_root_actor()` permitted usage.

For the old test,
- since the main point of this test seemed to be the ability to invoke
  the same function in both the parent and child actor (using the very
  legacy `ActorNursery.run_in_actor()`.. due to be deprecated) rename it
  to `test_run_in_actor_same_func_in_child`,
- don't re-enter `.open_root_actor()` since that's invalid usage (tested
  in new suite see below),
- adjust some `spawn()` arg/var naming and ensure we only return in the
  child.

For the new suite add tests for,
- ensuring the implicit `open_root_actor()` call under `open_nursery()`.
- double open of `open_root_actor()` from within the same process tree
  both from a root and sub.

Intro some new `_exceptions` used in the new suite,
- a top level `RuntimeFailure` for generically expressing faults not of
  our own doing that prevent successful operation; this is what we now
  (changed in this commit) raise on attempts to open a 2nd root.
- mk `ActorFailure` derive from the former; it's already used from
  `._spawn` when subprocs fail to boot.
2025-07-08 18:05:04 -04:00
Tyler Goodlet d6d0112d95 Some more log message tweaks
- aggregate the `MsgStream.aclose()` "reader tasks" stats content into a
  common `message: str` before emit.
- tweak an `_rpc.process_messages()` emit per new `Channel.__repr__()`.
2025-07-08 18:05:04 -04:00
Tyler Goodlet 0fcbedd2be Change some low-hanging `.uid`s to `.aid`
Throughout `_context` and `_spawn` where it causes no big disruption.
Still lots to work out for things like how to pass `--uid
<tuple-as-str>` to spawned subactors and whether we want a diff name for
the minimum `tuple` required to distinguish a subactor pre-process-ID
allocation by the OS.
2025-07-08 18:05:04 -04:00
Tyler Goodlet 412c66d000 Mv to `Channel._do_handshake()` in `open_portal()`
As per the method migration in the last commit. Also adjust all `.uid`
usage to the new `.aid`.
2025-07-08 18:05:04 -04:00
Tyler Goodlet 3cc835c215 Mv `Actor._do_handshake()` to `Channel`, add `.aid`
Finally.. i've been meaning to do this for ages since the
actor-id-swap-as-handshake is better layered as part of the IPC msg-ing
machinery and then lets us encapsulate the connection-time-assignment
of a remote peer's `Aid` as a new `Channel.aid: Aid`. For now we
continue to offer the `.uid: tuple[str, str]` attr (by delegating to the
`.uid` field) since there's still a few things relying on it in the
runtime and ctx layers

Nice bonuses from this,
- it's very easy to get the peer's `Aid.pid: int` from anywhere in an
  IPC ctx by just reading it from the chan.
- we aren't saving more then the wire struct-msg received.

Also add deprecation warnings around usage to get us moving on porting
the rest of consuming runtime code to the new attr!
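
The legacy-attr delegation is basically of this shape (a simplified
sketch; `Aid` here is a stand-in for the real wire-struct and the real
`Channel` obviously does much more):

```python
import warnings
from dataclasses import dataclass

@dataclass
class Aid:
    # stand-in for `tractor`'s `Aid` msg-struct
    name: str
    uuid: str
    pid: int|None = None

class Channel:
    def __init__(self, aid: Aid):
        # assigned from the peer's handshake msg at connect time
        self.aid: Aid = aid

    @property
    def uid(self) -> tuple[str, str]:
        warnings.warn(
            '`Channel.uid` is deprecated, use `Channel.aid` instead!',
            DeprecationWarning,
            stacklevel=2,
        )
        # delegate to the handshake-delivered `Aid` fields
        return (self.aid.name, self.aid.uuid)
```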
2025-07-08 18:05:04 -04:00
Tyler Goodlet f15bbb30cc UDS: translate file dne to connection-error
For the case where there's clearly no socket file created/bound
obviously the `trio.socket.connect()` call will raise
`FileNotFoundError`, so just translate this to
a builtin-`ConnectionError` at the transport layer so we can report the
guilty `UDSAddress`.
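
The gist of the translation (sketch only; the real code lives in the
UDS transport's connect path and reports the full `UDSAddress`):

```python
import trio

async def connect_uds(sockpath: str) -> trio.SocketStream:
    try:
        return await trio.open_unix_socket(sockpath)
    except FileNotFoundError as fdne:
        # no socket file was ever created/bound at that path, so
        # report it as a connection problem incl. the guilty addr.
        raise ConnectionError(
            f'No UDS socket file bound @ {sockpath!r} ?'
        ) from fdne
```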
2025-07-08 18:05:04 -04:00
Tyler Goodlet ad211f8c2c More `._addr` boxing refinements
The more I think about it, it seems @guille's orig approach of
unwrapping UDS socket-file addresses to strings (or `Path`) is making
the most sense. I had originally thought that pairing it with the
listening side's pid would add clarity (and it definitely does for
introspection/debug/logging) but since we don't end up passing that pid
to the eventual `.connect()` call on the client side, it doesn't make
much sense to wrap it for the wire just to discard.. Further, the
`tuple[str, int]` makes `wrap_address()` break for TCP since it will
always match on uds first.

So, on that note this patch refines a few things in prep for going back
to that original `UnwrappedAddress` as `str` type though longer run
i think the more "builtin approach" would be to add `msgspec` codec
hooks for these types to avoid all the `.wrap()`/`.unwrap()` calls
throughout the runtime.

Down-low deats,
- add `wrap_address()` doc string, detailed (todo) comments and handle
  the `[None, None]` case that can come directly from
  `._state._runtime_vars['_root_mailbox']`.
- buncha adjustments to `UDSAddress`,
  - add a `filedir`, chng `filepath` -> `filename` and mk `maybe_pid` optional.
  - the intent is for `filedir` to act as the equivalent of the host part in a network
    proto's socket address; when it's null we use the `.def_bindspace = get_rt_dir()`.
  - always ensure the `filedir / filename` is an absolute path and
    expose it as a new `.sockpath: Path` property.
  - mk `.is_valid` actually verify the `.sockpath` is in the valid
    `.bindspace`: namely just checking it's in the expected dir.
  - add pedantic `match:`ing to `.from_addr()` such that we error on
    unexpected `type(addr)` inputs and otherwise parse any `sockpath:
    Path` inputs using a new `unwrap_sockpath()` which simply splits an
    abs file path to dir, file-name parts.
  - `.unwrap()` now just `str`-ifies the `.sockpath: Path`
  - adjust `.open/close_listener()` to use `.sockpath`.
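
e.g. the path-splitting helper is about as simple as it sounds (a
sketch, assuming an absolute input path):

```python
from pathlib import Path

def unwrap_sockpath(sockpath: Path) -> tuple[Path, Path]:
    # split an abs socket-file path into its (dir, file-name) parts,
    # i.e. the (`filedir`, `filename`) pair fed to `UDSAddress`.
    return (sockpath.parent, Path(sockpath.name))
```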
2025-07-08 18:05:04 -04:00
Tyler Goodlet acac605c37 Move `DebugRequestError` to `._exceptions` 2025-07-08 18:05:04 -04:00
Tyler Goodlet 078e507774 Add `psutil` to `--dev` / testing deps 2025-07-08 12:57:29 -04:00
Tyler Goodlet 81bf810fbb Factor `breakpoint()` blocking into `@acm`
Call it `maybe_block_bp()` and wrap the `open_root_actor()` body with
it. Main reason is to guarantee we can bp inside actor runtime bootup as
needed when debugging internals! Prolly should factor this to another
module tho?

ALSO, ensure we RTE on recurrent entries to `open_root_actor()` from
within an existing tree! There was actually a `test_spawning` test somehow
getting away with this!? Should never be possible or allowed!
2025-07-08 12:57:29 -04:00
Tyler Goodlet 7d1512e03a Add an `Actor.pformat()`
And map `.__repr__/__str__` to it and add various new fields to fill it
out,
- drop `self.uid` as var and instead add `Actor._aid: Aid` and proxy to
  it for the various `.name/.uid/.pid` properties as well as a new
  `.aid` field.
 |_ the `Aid.pid` addition is also included.

Other improvements,
- flip to a sync call to `Address.close_listener()`.
- track the `async_main()` parent task as `Actor._task`.
- add exception logging around failure to bind due to already-in-use
  when calling `addr.open_listener()` in `._serve_forever()`; sometimes
  the error might be overridden by something else during the
  runtime-failure unwind..
2025-07-08 12:57:29 -04:00
Tyler Goodlet 1c85338ff8 Add a `MsgpackTransport.pformat()`
And map `.__repr__/__str__` to it. Also adjust to new
`Address.proto_key` and add a #TODO for a `.get_peers()`.
2025-07-08 12:57:29 -04:00
Tyler Goodlet 7a3c9d0458 Even more `tractor._addr.Address` simplifying
Namely reducing the duplication of class-fields and `TypeVar`s used
for parametrizing the `Address` protocol type,
- drop all of the `TypeVar` types and just stick with all concrete addrs
  types inheriting from `Address` only.
- rename `Address.name_key` -> `.proto_key`.
- rename `Address.address_type` -> `.unwrapped_type`
- rename `.namespace` -> `.bindspace` to better reflect that this "part"
  of the address represents the possible "space for binding endpoints".
 |_ also linux already uses "namespace" to mean the `netns` and i'd
   prefer to stick with their semantics for that.
- add `TCPAddress/UDSAddress.def_bindspace` values.
- drop commented `.open_stream()` method; never used.
- simplify `UnwrappedAddress` to just a `tuple` of union types.
- add logging to `UDSAddress.open_listener()` for now.
- adjust `tractor.ipc/_uds/tcp` transport to use new addr field names.
2025-07-08 12:57:29 -04:00
Tyler Goodlet 31196b9cb4 Handle broken-pipes from `MsgpackTransport.send()`
Much like we already do in the `._iter_packets()` async-generator which
delivers to `.recv()` and `async for`, handle the `'[Errno 32] Broken
pipe'` case that can show up with unix-domain-socket usage.

Seems like the cause is due to how fast the socket can be torn down
during a registry addr channel ping where,
- the sending side can break the connection faster than the pong side
  can prep its handshake msg,
- the pong side tries to send its handshake pkt via
  `.SocketStream.send_all()` after the breakage and then raises
  `trio.BrokenResourceError`.
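
Sketch of the send-side handling (the exc type below is a local
stand-in for `tractor`'s transport-closed error; the real method does
more):

```python
import trio

class TransportClosed(Exception):
    # stand-in for `tractor._exceptions.TransportClosed`
    ...

async def send_msg(stream: trio.SocketStream, data: bytes):
    try:
        await stream.send_all(data)
    except trio.BrokenResourceError as bre:
        if '[Errno 32] Broken pipe' in str(bre):
            # peer tore the socket down before our write landed
            raise TransportClosed(
                'IPC transport already closed by peer\n'
            ) from bre
        raise
```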
2025-07-08 12:57:28 -04:00
Tyler Goodlet 44c9da1c91 Emphasize internal error block header-comment a bit 2025-07-08 12:57:28 -04:00
Tyler Goodlet b4ce618e33 Bit of multi-line styling for `LocalPortal` 2025-07-08 12:57:28 -04:00
Tyler Goodlet a504d92536 Adjust `._child` instantiation of `Actor` to use newly named `uuid` arg 2025-07-08 12:57:28 -04:00
Tyler Goodlet 8c0d9614bc Add `bidict` pkg as dep since used in `._addr` for now 2025-07-08 12:57:28 -04:00
Tyler Goodlet a6fefcc2a8 Adjust lowlevel-tb hiding logic for `MsgStream`
Such that whenev the `self._ctx.chan._exc is trans_err` we suppress.
I.e. when the `Channel._exc: Exception|None` error **is the same as**
set by the `._rpc.process_messages()` loop (that is, set to the
underlying transport layer error), we suppress the lowlevel tb,
otherwise we deliver the full tb since likely something at the lowlevel
that we aren't detecting changed/signalled/is-relevant!
2025-07-08 12:57:28 -04:00
Tyler Goodlet abdaf7bf1f Slight typing and multi-line styling tweaks in `.ipc` subpkg 2025-07-08 12:57:28 -04:00
Tyler Goodlet 7b3324b240 Add a big boi `Channel.pformat()/__repr__()`
Much like how `Context` has been implemented, try to give tons of high
level details on all the lower level encapsulated primitives, namely the
`.msgstream/.transport` and any useful runtime state.

B)

Impl deats,
- adjust `.from_addr()` to only call `._addr.wrap_address()` when we
  detect `addr` is unwrapped.
- add another `log.runtime()` using the new `.__repr__()` in
  `Channel.from_addr()`.
- change to `UnwrappedAddress` as in prior commits.
2025-07-08 12:57:28 -04:00
Tyler Goodlet bbae2c91fd Allocate bind-addrs in subactors
Previously whenever an `ActorNursery.start_actor()` call did not receive
a `bind_addrs` arg we would allocate the default `(localhost, 0)` pairs
in the parent; for UDS this obviously won't work, nor is it ideal bc it's
nicer to have the actor that will be the socket server (the one calling
`Address.open_listener()`) define the socket-file-name containing its
unique ID info such as pid, actor-uuid etc.

As such this moves "random" generation of server addresses to the
child-side of a subactor's spawn-sequence when it's sin-`bind_addrs`;
i.e. we do the allocation of the `Address.get_random()` addrs inside
`._runtime.async_main()` instead of `Portal.start_actor()` and **only
when** `accept_addrs`/`bind_addrs` was **not provided by the spawning
parent**.

Further this patch gets way more rigorous about the `SpawnSpec`
processing in the child inside `Actor._from_parent()` such that we
handle any invalid msgs **very loudly and pedantically!**

Impl deats,
- do the "random addr generation" in an explicit `for` loop (instead of
  prior comprehension) to allow for more detailed typing of the layered
  calls to the new `._addr` mod.
- use a `match:/case:` to process any invalid `SpawnSpec` payload case
  where we can instead receive a `MsgTypeError` from the `chan.recv()`
  call in `Actor._from_parent()` to raise it immediately instead of
  triggering downstream type-errors XD
  |_ as per the big `#TODO` we prolly want to take from other callers
     of `Channel.recv()` (like in the `._rpc.process_messages()` loop).
  |_ always raise `InternalError` on non-match/fall-through case!
  |_ add a note about not being able to use `breakpoint()` in this
     section due to causality of `SpawnSpec._runtime_vars` not having
     been processed yet..
  |_ always return a third element from `._from_parent()`, eventually to be
     the `preferred_transports: list[str]` from the spawning rent.
- use new `._addr.mk_uuid()` and pass to new `Actor.__init__(uuid: str)`
  for all actor creation (including in all the mods tweaked here).
- Move to new type-alias-name `UnwrappedAddress` throughout.
2025-07-08 12:57:28 -04:00
Tyler Goodlet 2540d1f9e0 Adjust imports to use new `UnwrappedAddress`
For those mods where it's just a type-alias (name) import change.
2025-07-08 12:57:28 -04:00
Tyler Goodlet 63fac5a809 Implement peer-info tracking for UDS streams
Such that any UDS socket pair is represented (with the recent updates)
by a `UDSAddress` via a similar pair-`tuple[str, int]` as TCP
sockets: a pair of the `.filepath: Path` & the peer proc's `.pid: int`
which we read from the underlying `socket.socket` using
`.set/getsockopt()` calls

Impl deats,
- using the Linux specific APIs, we add a `get_peer_info()` which reads
  the `(pid, uid, gid)` using the `SOL_SOCKET` level's `SO_PEERCRED` opt to
  `sock.getsockopt()`.
  |_ this presumes the client has been correspondingly configured to
     deliver the creds via a `sock.setsockopt(SOL_SOCKET, SO_PASSCRED,
     1)` call - this required us to override `trio.open_unix_socket()`.
- override `trio.open_unix_socket()` as per the above bullet to ensure
  connecting peers always transmit "credentials" options info to the
  listener.
- update `.get_stream_addrs()` to always call `get_peer_info()` and
  extract the peer's pid for the `raddr` and use `os.getpid()` for
  `laddr` (obvi).
  |_ as part of the new impl also `log.info()` the creds-info deats and
    socket-file path.
  |_ handle the oddity where it depends which of `.getpeername()` or
    `.getsockname()` will return the file-path; i think it's to do with
    who is client vs. server?

Related refinements,
- set `.layer_key: int = 4` for the "transport layer" ;)
- tweak some typing and multi-line unpacking in `.ipc/_tcp`.
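
For reference, the cred-reading core is just this (the `SO_PEERCRED`
getsockopt is the real Linux API; the fn name here is only illustrative
of the new `get_peer_info()`):

```python
import socket
import struct

def get_peer_info(sock: socket.socket) -> tuple[int, int, int]:
    # read the connected UDS peer's `struct ucred` -> (pid, uid, gid)
    creds: bytes = sock.getsockopt(
        socket.SOL_SOCKET,
        socket.SO_PEERCRED,
        struct.calcsize('3i'),
    )
    pid, uid, gid = struct.unpack('3i', creds)
    return pid, uid, gid
```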
2025-07-08 12:57:28 -04:00
Tyler Goodlet 568fb18d01 Rework/simplify transport addressing
A few things that can fundamentally change,

- UDS addresses now always encapsulate the local and remote pid such
  that it denotes each side's process much like a TCP *port*.
  |_ `.__init__()` takes a new `maybe_pid: int`.
  |_ this required changes to the `.ipc._uds` backend which will come in
     an subsequent commit!
  |_ `UDSAddress.address_type` becomes a `tuple[str, int]` just like the
      TCP case.
  |_ adjust `wrap_address()` to match.
- use a new `_state.get_rt_dir() -> Path` as the default location for
  UDS socket file: now under `XDG_RUNTIME_DIR'/tractor/` subdir by
  default.
- re-implement `UDSAddress.get_random()` to use both the local
  `Actor.uid` (if available) and at least the pid for its socket file
  name.

Removals,
- drop the loop generated `_default_addrs`, simplify to just
  `_default_lo_addrs` for per-transport default registry addresses.
  |_ change to `_address_types: dict[str, Type[Address]]` instead of
     separate types `list`.
  |_ adjust `is_wrapped_addr()` to just check `in _addr_types.values()`.
- comment out `Address.open_stream()` it's unused and i think the wrong
  place for this API.

Renames,
- from `AddressTypes` -> `UnwrappedAddress`, since it's a simple type
  union and all this type set is, is the simple python data-structures
  we encode to for the wire.
  |_ see note about possibly implementing the `.[un]wrap()` stuff as
     `msgspec` codec `enc/dec_hook()`s instead!

Additions,
- add a `mk_uuid()` to be used throughout the runtime including for
  generating the `Aid.uuid` part.
- tons of notes around follow up refinements!
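
Both helpers are tiny; roughly (the `/tmp` fallback and the
dir-creation behaviour here are assumptions, not the exact impl):

```python
import os
import uuid
from pathlib import Path

def mk_uuid() -> str:
    # used for the `Aid.uuid` part (and thus UDS file-names)
    return str(uuid.uuid4())

def get_rt_dir(appname: str = 'tractor') -> Path:
    # default UDS-socket-file "bindspace": `$XDG_RUNTIME_DIR/tractor/`
    base = Path(os.environ.get('XDG_RUNTIME_DIR', '/tmp'))
    rtdir: Path = base / appname
    rtdir.mkdir(parents=True, exist_ok=True)
    return rtdir
```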
2025-07-08 12:57:28 -04:00
Guillermo Rodriguez f67e19a852 Trying to make full suite pass with uds 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 0be9f5f907 Finally switch to using address protocol in all runtime 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 5e2d456029 Add root and random addr getters on MsgTransport type 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez c7d5b021db Starting to make `.ipc.Channel` work with multiple MsgTransports 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 6f1f198fb1 Break out transport protocol and tcp specifics into their own submodules under tractor.ipc 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 26fef82d33 Add buf_size to RBToken and add sender cancel test, move disable_mantracker to its own _mp_bs module 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 84d25b5727 Make ring buf api use pickle-able RBToken 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 1ed0c861b5 Address some of fomo's comments 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 2dd3a682c8 Handle cancelation on EventFD.read 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 881813e61e Add module headers and fix spacing on tractor._ipc._linux 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 566a11c00d Move RingBuffSender|Receiver to its own tractor.ipc._ringbuf module 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez af69272d16 Move linux specifics from tractor.ipc._shm into tractor.ipc._linux 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 8e3f581d3f Move tractor._shm to tractor.ipc._shm 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez eceb292415 move tractor._ipc.py into tractor.ipc._chan.py 2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 9921ea3cae General improvements
EventFD class now expects the fd to already be initialized with open_eventfd
RingBuff Sender and Receiver fully manage SharedMemory and EventFD lifecycles, no additional ctx mngrs needed
Separate ring buf tests into its own test bed
Add parametrization to test and cancellation
Add docstrings
Add simple testing data gen module .samples
2025-07-08 12:57:28 -04:00
Guillermo Rodriguez 414a8c5b75 IPC ring buf impl with async read 2025-07-08 12:57:28 -04:00
Tyler Goodlet eeb0516017 Merge branch 'gitea/main' into 'github/main'
Synchronizing the "main", previously (and less woke-ly/succinctly)
called "master", branch between our `gitea` remote and the current
`github` tip.

* main:
  Only set shield flag when trio nursery mode is used
  Disable parent channel append on get_peer_by_name to_scan
2025-06-19 19:51:03 -04:00
goodboy d6eeddef4e
Merge pull request #338 from goodboy/shm_apis
Shared memory array API and optional tight integration with `numpy`

Landing this as so many downstream major feature branches depend on it, namely,
- https://github.com/goodboy/tractor/pull/375
- https://github.com/goodboy/tractor/pull/376
- and the eventual #10
2025-04-25 23:20:46 -04:00
guille d478dbfcfe Merge pull request 'Fix to trionics helper `maybe_open_nursery`' (#26) from maybe_open_nursery_fix into main
Reviewed-on: #26
2025-04-13 20:58:47 +00:00
Guillermo Rodriguez ef6094a650
Only set shield flag when trio nursery mode is used 2025-04-13 17:57:37 -03:00
guille 4e8404bb09 Merge pull request 'Duplicated channel on `Actor._peers` causes hang on `portal.cancel_actor()`' (#25) from discovery_dedup into main
Reviewed-on: #25
2025-04-13 20:53:23 +00:00
Guillermo Rodriguez bbb3484ae9
Disable parent channel append on get_peer_by_name to_scan 2025-04-13 14:06:03 -03:00
104 changed files with 14971 additions and 6853 deletions

View File

@ -8,46 +8,70 @@ on:
workflow_dispatch:
jobs:
mypy:
name: 'MyPy'
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Setup python
uses: actions/setup-python@v2
with:
python-version: '3.11'
- name: Install dependencies
run: pip install -U . --upgrade-strategy eager -r requirements-test.txt
- name: Run MyPy check
run: mypy tractor/ --ignore-missing-imports --show-traceback
# ------ sdist ------
# test that we can generate a software distribution and install it
# thus avoid missing file issues after packaging.
#
# -[x] produce sdist with uv
# ------ - ------
sdist-linux:
name: 'sdist'
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
uses: actions/checkout@v4
- name: Setup python
uses: actions/setup-python@v2
with:
python-version: '3.11'
- name: Install latest uv
uses: astral-sh/setup-uv@v6
- name: Build sdist
run: python setup.py sdist --formats=zip
- name: Build sdist as tar.gz
run: uv build --sdist --python=3.13
- name: Install sdist from .zips
run: python -m pip install dist/*.zip
- name: Install sdist from .tar.gz
run: python -m pip install dist/*.tar.gz
# ------ type-check ------
# mypy:
# name: 'MyPy'
# runs-on: ubuntu-latest
# steps:
# - name: Checkout
# uses: actions/checkout@v4
# - name: Install latest uv
# uses: astral-sh/setup-uv@v6
# # faster due to server caching?
# # https://docs.astral.sh/uv/guides/integration/github/#setting-up-python
# - name: "Set up Python"
# uses: actions/setup-python@v6
# with:
# python-version-file: "pyproject.toml"
# # w uv
# # - name: Set up Python
# # run: uv python install
# - name: Setup uv venv
# run: uv venv .venv --python=3.13
# - name: Install
# run: uv sync --dev
# # TODO, ty cmd over repo
# # - name: type check with ty
# # run: ty ./tractor/
# # - uses: actions/cache@v3
# # name: Cache uv virtenv as default .venv
# # with:
# # path: ./.venv
# # key: venv-${{ hashFiles('uv.lock') }}
# - name: Run MyPy check
# run: mypy tractor/ --ignore-missing-imports --show-traceback
testing-linux:
@ -59,32 +83,45 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
python: ['3.11']
python-version: ['3.13']
spawn_backend: [
'trio',
'mp_spawn',
'mp_forkserver',
# 'mp_spawn',
# 'mp_forkserver',
]
steps:
- name: Checkout
uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Setup python
uses: actions/setup-python@v2
- name: 'Install uv + py-${{ matrix.python-version }}'
uses: astral-sh/setup-uv@v6
with:
python-version: '${{ matrix.python }}'
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
# GH way.. faster?
# - name: setup-python@v6
# uses: actions/setup-python@v6
# with:
# python-version: '${{ matrix.python-version }}'
- name: List dependencies
run: pip list
# consider caching for speedups?
# https://docs.astral.sh/uv/guides/integration/github/#caching
- name: Install the project w uv
run: uv sync --all-extras --dev
# - name: Install dependencies
# run: pip install -U . -r requirements-test.txt -r requirements-docs.txt --upgrade-strategy eager
- name: List deps tree
run: uv tree
- name: Run tests
run: pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rsx
run: uv run pytest tests/ --spawn-backend=${{ matrix.spawn_backend }} -rsx
# XXX legacy NOTE XXX
#
# We skip 3.10 on windows for now due to not having any collabs to
# debug the CI failures. Anyone wanting to hack and solve them is very
# welcome, but our primary user base is not using that OS.

19
default.nix 100644
View File

@ -0,0 +1,19 @@
{ pkgs ? import <nixpkgs> {} }:
let
nativeBuildInputs = with pkgs; [
stdenv.cc.cc.lib
uv
];
in
pkgs.mkShell {
inherit nativeBuildInputs;
LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath nativeBuildInputs;
TMPDIR = "/tmp";
shellHook = ''
set -e
uv venv .venv --python=3.12
'';
}

View File

@ -1,8 +1,5 @@
|logo| ``tractor``: distributed structurred concurrency
|gh_actions|
|docs|
``tractor`` is a `structured concurrency`_ (SC), multi-processing_ runtime built on trio_.
Fundamentally, ``tractor`` provides parallelism via
@ -66,6 +63,13 @@ Features
- (WIP) a ``TaskMngr``: one-cancels-one style nursery supervisor.
Status of `main` / infra
------------------------
- |gh_actions|
- |docs|
Install
-------
``tractor`` is still in a *alpha-near-beta-stage* for many
@ -689,9 +693,11 @@ channel`_!
.. _msgspec: https://jcristharif.com/msgspec/
.. _guest: https://trio.readthedocs.io/en/stable/reference-lowlevel.html?highlight=guest%20mode#using-guest-mode-to-run-trio-on-top-of-other-event-loops
.. |gh_actions| image:: https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fgoodboy%2Ftractor%2Fbadge&style=popout-square
:target: https://actions-badge.atrox.dev/goodboy/tractor/goto
..
NOTE, on generating badge links from the UI
https://docs.github.com/en/actions/how-tos/monitoring-and-troubleshooting-workflows/monitoring-workflows/adding-a-workflow-status-badge?ref=gitguardian-blog-automated-secrets-detection#using-the-ui
.. |gh_actions| image:: https://github.com/goodboy/tractor/actions/workflows/ci.yml/badge.svg?branch=main
:target: https://github.com/goodboy/tractor/actions/workflows/ci.yml
.. |docs| image:: https://readthedocs.org/projects/tractor/badge/?version=latest
:target: https://tractor.readthedocs.io/en/latest/?badge=latest

View File

@ -16,6 +16,7 @@ from tractor import (
ContextCancelled,
MsgStream,
_testing,
trionics,
)
import trio
import pytest
@ -62,9 +63,8 @@ async def recv_and_spawn_net_killers(
await ctx.started()
async with (
ctx.open_stream() as stream,
trio.open_nursery(
strict_exception_groups=False,
) as tn,
trionics.collapse_eg(),
trio.open_nursery() as tn,
):
async for i in stream:
print(f'child echoing {i}')
@ -120,6 +120,7 @@ async def main(
break_parent_ipc_after: int|bool = False,
break_child_ipc_after: int|bool = False,
pre_close: bool = False,
tpt_proto: str = 'tcp',
) -> None:
@ -131,6 +132,7 @@ async def main(
# a hang since it never engages due to broken IPC
debug_mode=debug_mode,
loglevel=loglevel,
enable_transports=[tpt_proto],
) as an,
):
@ -145,7 +147,8 @@ async def main(
_testing.expect_ctxc(
yay=(
break_parent_ipc_after
or break_child_ipc_after
or
break_child_ipc_after
),
# TODO: we CAN'T remove this right?
# since we need the ctxc to bubble up from either

View File

@ -29,7 +29,7 @@ async def bp_then_error(
to_trio.send_nowait('start')
# NOTE: what happens here inside the hook needs some refinement..
# => seems like it's still `._debug._set_trace()` but
# => seems like it's still `.debug._set_trace()` but
# we set `Lock.local_task_in_debug = 'sync'`, we probably want
# some further, at least, meta-data about the task/actor in debug
# in terms of making it clear it's `asyncio` mucking about.

View File

@ -4,6 +4,11 @@ import sys
import trio
import tractor
# ensure mod-path is correct!
from tractor.devx.debug import (
_sync_pause_from_builtin as _sync_pause_from_builtin,
)
async def main() -> None:
@ -13,19 +18,23 @@ async def main() -> None:
async with tractor.open_nursery(
debug_mode=True,
) as an:
assert an
loglevel='devx',
maybe_enable_greenback=True,
# ^XXX REQUIRED to enable `breakpoint()` support (from sync
# fns) and thus required here to avoid an assertion err
# on the next line
):
assert (
(pybp_var := os.environ['PYTHONBREAKPOINT'])
==
'tractor.devx._debug._sync_pause_from_builtin'
'tractor.devx.debug._sync_pause_from_builtin'
)
# TODO: an assert that verifies the hook has indeed been, hooked
# XD
assert (
(pybp_hook := sys.breakpointhook)
is not tractor.devx._debug._set_trace
is not tractor.devx.debug._set_trace
)
print(

View File

@ -24,10 +24,9 @@ async def spawn_until(depth=0):
async def main():
"""The main ``tractor`` routine.
The process tree should look as approximately as follows when the debugger
first engages:
'''
The process tree should look as approximately as follows when the
debugger first engages:
python examples/debugging/multi_nested_subactors_bp_forever.py
python -m tractor._child --uid ('spawner1', '7eab8462 ...)
@ -37,10 +36,11 @@ async def main():
python -m tractor._child --uid ('spawner0', '1d42012b ...)
python -m tractor._child --uid ('name_error', '6c2733b8 ...)
"""
'''
async with tractor.open_nursery(
debug_mode=True,
loglevel='warning'
loglevel='devx',
enable_transports=['uds'],
) as n:
# spawn both actors

View File

@ -0,0 +1,35 @@
import trio
import tractor
async def main():
async with tractor.open_root_actor(
debug_mode=True,
loglevel='cancel',
) as _root:
# manually trigger self-cancellation and wait
# for it to fully trigger.
_root.cancel_soon()
await _root._cancel_complete.wait()
print('root cancelled')
# now ensure we can still use the REPL
try:
await tractor.pause()
except trio.Cancelled as _taskc:
assert (root_cs := _root._root_tn.cancel_scope).cancel_called
# NOTE^^ above logic but inside `open_root_actor()` and
# passed to the `shield=` expression is effectively what
# we're testing here!
await tractor.pause(shield=root_cs.cancel_called)
# XXX, if shield logic *is wrong* inside `open_root_actor()`'s
# crash-handler block this should never be interacted,
# instead `trio.Cancelled` would be bubbled up: the original
# BUG.
assert 0
if __name__ == '__main__':
trio.run(main)

View File

@ -37,6 +37,7 @@ async def main(
enable_stack_on_sig=True,
# maybe_enable_greenback=False,
loglevel='devx',
enable_transports=['uds'],
) as an,
):
ptl: tractor.Portal = await an.start_actor(

View File

@ -33,8 +33,11 @@ async def just_bp(
async def main():
async with tractor.open_nursery(
debug_mode=True,
enable_transports=['uds'],
loglevel='devx',
) as n:
p = await n.start_actor(
'bp_boi',

View File

@ -6,7 +6,7 @@ import tractor
# TODO: only import these when not running from test harness?
# can we detect `pexpect` usage maybe?
# from tractor.devx._debug import (
# from tractor.devx.debug import (
# get_lock,
# get_debug_req,
# )

View File

@ -23,9 +23,8 @@ async def main():
modules=[__name__]
) as portal_map,
trio.open_nursery(
strict_exception_groups=False,
) as tn,
tractor.trionics.collapse_eg(),
trio.open_nursery() as tn,
):
for (name, portal) in portal_map.items():
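A recurring refactor throughout this changeset swaps `trio.open_nursery(strict_exception_groups=False)` for an explicitly stacked `tractor.trionics.collapse_eg()` plus a plain nursery; a minimal sketch of the pattern (illustrative only, semantics assumed from the call-sites in this diff):

    import tractor
    import trio

    async def main():
        async with (
            # presumably collapses the nursery's ExceptionGroup down to
            # the lone underlying error (the old
            # `strict_exception_groups=False` behavior) when only one
            # exception propagates
            tractor.trionics.collapse_eg(),
            trio.open_nursery() as tn,
        ):
            tn.start_soon(trio.sleep, 0.1)

    trio.run(main)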

View File

@ -9,7 +9,7 @@ async def main(service_name):
async with tractor.open_nursery() as an:
await an.start_actor(service_name)
async with tractor.get_registry('127.0.0.1', 1616) as portal:
async with tractor.get_registry() as portal:
print(f"Arbiter is listening on {portal.channel}")
async with tractor.wait_for_actor(service_name) as sockaddr:

View File

@ -45,6 +45,8 @@ dependencies = [
"pdbp>=1.6,<2", # windows only (from `pdbp`)
# typed IPC msging
"msgspec>=0.19.0",
"cffi>=1.17.1",
"bidict>=0.23.1",
]
# ------ project ------
@ -59,9 +61,13 @@ dev = [
# `tractor.devx` tooling
"greenback>=1.2.1,<2",
"stackscope>=0.2.2,<0.3",
# ^ requires this?
"typing-extensions>=4.14.1",
"pyperclip>=1.9.0",
"prompt-toolkit>=3.0.50",
"xonsh>=0.19.2",
"psutil>=7.0.0",
]
# TODO, add these with sane versions; were originally in
# `requirements-docs.txt`..

View File

@ -1,24 +1,27 @@
"""
``tractor`` testing!!
Top level of the testing suites!
"""
from __future__ import annotations
import sys
import subprocess
import os
import random
import signal
import platform
import time
import pytest
import tractor
from tractor._testing import (
examples_dir as examples_dir,
tractor_test as tractor_test,
expect_ctxc as expect_ctxc,
)
# TODO: include wtv plugin(s) we build in `._testing.pytest`?
pytest_plugins = ['pytester']
pytest_plugins: list[str] = [
'pytester',
'tractor._testing.pytest',
]
# Sending signal.SIGINT on subprocess fails on windows. Use CTRL_* alternatives
if platform.system() == 'Windows':
@ -30,7 +33,11 @@ else:
_KILL_SIGNAL = signal.SIGKILL
_INT_SIGNAL = signal.SIGINT
_INT_RETURN_CODE = 1 if sys.version_info < (3, 8) else -signal.SIGINT.value
_PROC_SPAWN_WAIT = 0.6 if sys.version_info < (3, 7) else 0.4
_PROC_SPAWN_WAIT = (
0.6
if sys.version_info < (3, 7)
else 0.4
)
no_windows = pytest.mark.skipif(
@ -39,7 +46,12 @@ no_windows = pytest.mark.skipif(
)
def pytest_addoption(parser):
def pytest_addoption(
parser: pytest.Parser,
):
# ?TODO? should this be exposed from our `._testing.pytest`
# plugin or should we make it more explicit with `--tl` for
# tractor logging like we do in other client projects?
parser.addoption(
"--ll",
action="store",
@ -47,42 +59,10 @@ def pytest_addoption(parser):
default='ERROR', help="logging level to set when testing"
)
parser.addoption(
"--spawn-backend",
action="store",
dest='spawn_backend',
default='trio',
help="Processing spawning backend to use for test run",
)
parser.addoption(
"--tpdb", "--debug-mode",
action="store_true",
dest='tractor_debug_mode',
# default=False,
help=(
'Enable a flag that can be used by tests to to set the '
'`debug_mode: bool` for engaging the internal '
'multi-proc debugger sys.'
),
)
def pytest_configure(config):
backend = config.option.spawn_backend
tractor._spawn.try_set_start_method(backend)
@pytest.fixture(scope='session')
def debug_mode(request):
debug_mode: bool = request.config.option.tractor_debug_mode
# if debug_mode:
# breakpoint()
return debug_mode
@pytest.fixture(scope='session', autouse=True)
def loglevel(request):
import tractor
orig = tractor.log._default_loglevel
level = tractor.log._default_loglevel = request.config.option.loglevel
tractor.log.get_console_log(level)
@ -90,106 +70,44 @@ def loglevel(request):
tractor.log._default_loglevel = orig
@pytest.fixture(scope='session')
def spawn_backend(request) -> str:
return request.config.option.spawn_backend
# @pytest.fixture(scope='function', autouse=True)
# def debug_enabled(request) -> str:
# from tractor import _state
# if _state._runtime_vars['_debug_mode']:
# breakpoint()
_ci_env: bool = os.environ.get('CI', False)
@pytest.fixture(scope='session')
def ci_env() -> bool:
'''
Detect CI envoirment.
Detect CI environment.
'''
return _ci_env
# TODO: also move this to `._testing` for now?
# -[ ] possibly generalize and re-use for multi-tree spawning
# along with the new stuff for multi-addrs in distribute_dis
# branch?
#
# choose randomly at import time
_reg_addr: tuple[str, int] = (
'127.0.0.1',
random.randint(1000, 9999),
)
@pytest.fixture(scope='session')
def reg_addr() -> tuple[str, int]:
# globally override the runtime to the per-test-session-dynamic
# addr so that all tests never conflict with any other actor
# tree using the default.
from tractor import _root
_root._default_lo_addrs = [_reg_addr]
return _reg_addr
def pytest_generate_tests(metafunc):
spawn_backend = metafunc.config.option.spawn_backend
if not spawn_backend:
# XXX some weird windows bug with `pytest`?
spawn_backend = 'trio'
# TODO: maybe just use the literal `._spawn.SpawnMethodKey`?
assert spawn_backend in (
'mp_spawn',
'mp_forkserver',
'trio',
)
# NOTE: used to be used to dyanmically parametrize tests for when
# you just passed --spawn-backend=`mp` on the cli, but now we expect
# that cli input to be manually specified, BUT, maybe we'll do
# something like this again in the future?
if 'start_method' in metafunc.fixturenames:
metafunc.parametrize("start_method", [spawn_backend], scope='module')
# TODO: a way to let test scripts (like from `examples/`)
# guarantee they won't registry addr collide!
# @pytest.fixture
# def open_test_runtime(
# reg_addr: tuple,
# ) -> AsyncContextManager:
# return partial(
# tractor.open_nursery,
# registry_addrs=[reg_addr],
# )
def sig_prog(proc, sig):
def sig_prog(
proc: subprocess.Popen,
sig: int,
canc_timeout: float = 0.1,
) -> int:
"Kill the actor-process with ``sig``."
proc.send_signal(sig)
time.sleep(0.1)
time.sleep(canc_timeout)
if not proc.poll():
# TODO: why sometimes does SIGINT not work on teardown?
# seems to happen only when trace logging enabled?
proc.send_signal(_KILL_SIGNAL)
ret = proc.wait()
ret: int = proc.wait()
assert ret
# TODO: factor into @cm and move to `._testing`?
@pytest.fixture
def daemon(
debug_mode: bool,
loglevel: str,
testdir,
testdir: pytest.Pytester,
reg_addr: tuple[str, int],
):
tpt_proto: str,
) -> subprocess.Popen:
'''
Run a daemon root actor as a separate actor-process tree and
"remote registrar" for discovery-protocol related tests.
@ -200,28 +118,100 @@ def daemon(
loglevel: str = 'info'
code: str = (
"import tractor; "
"tractor.run_daemon([], registry_addrs={reg_addrs}, loglevel={ll})"
"import tractor; "
"tractor.run_daemon([], "
"registry_addrs={reg_addrs}, "
"debug_mode={debug_mode}, "
"loglevel={ll})"
).format(
reg_addrs=str([reg_addr]),
ll="'{}'".format(loglevel) if loglevel else None,
debug_mode=debug_mode,
)
cmd: list[str] = [
sys.executable,
'-c', code,
]
# breakpoint()
kwargs = {}
if platform.system() == 'Windows':
# without this, tests hang on windows forever
kwargs['creationflags'] = subprocess.CREATE_NEW_PROCESS_GROUP
proc = testdir.popen(
proc: subprocess.Popen = testdir.popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
**kwargs,
)
assert not proc.returncode
# UDS sockets are **really** fast to bind()/listen()/connect()
# so it's often required that we delay a bit more starting
# the first actor-tree..
if tpt_proto == 'uds':
global _PROC_SPAWN_WAIT
_PROC_SPAWN_WAIT = 0.6
time.sleep(_PROC_SPAWN_WAIT)
assert not proc.returncode
yield proc
sig_prog(proc, _INT_SIGNAL)
# XXX! yeah.. just be reaaal careful with this bc sometimes it
# can lock up on the `_io.BufferedReader` and hang..
stderr: str = proc.stderr.read().decode()
if stderr:
print(
f'Daemon actor tree produced STDERR:\n'
f'{proc.args}\n'
f'\n'
f'{stderr}\n'
)
if proc.returncode != -2:
raise RuntimeError(
'Daemon actor tree failed !?\n'
f'{proc.args}\n'
)
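A hypothetical usage sketch (test name and body are illustrative, not part of this changeset) of how a suite might consume the `daemon` fixture together with `reg_addr` to reach the externally spawned registrar tree:

    import subprocess

    import tractor
    import trio

    def test_query_remote_registrar(
        daemon: subprocess.Popen,
        reg_addr: tuple[str, int],
    ):
        async def main():
            # open a local root actor pointed at the daemon's registry addr
            async with tractor.open_root_actor(
                registry_addrs=[reg_addr],
            ):
                # the registrar is remote, so we get a full IPC portal
                async with tractor.get_registry(reg_addr) as portal:
                    assert portal.channel

        trio.run(main)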
# @pytest.fixture(autouse=True)
# def shared_last_failed(pytestconfig):
# val = pytestconfig.cache.get("example/value", None)
# breakpoint()
# if val is None:
# pytestconfig.cache.set("example/value", val)
# return val
# TODO: a way to let test scripts (like from `examples/`)
# guarantee they won't `registry_addrs` collide!
# -[ ] maybe use some kinda standard `def main()` arg-spec that
# we can introspect from a fixture that is called from the test
# body?
# -[ ] test and figure out typing for below prototype! Bp
#
# @pytest.fixture
# def set_script_runtime_args(
# reg_addr: tuple,
# ) -> Callable[[...], None]:
# def import_n_partial_in_args_n_triorun(
# script: Path, # under examples?
# **runtime_args,
# ) -> Callable[[], Any]: # a `partial`-ed equiv of `trio.run()`
# # NOTE, below is taken from
# # `.test_advanced_faults.test_ipc_channel_break_during_stream`
# mod: ModuleType = import_path(
# examples_dir() / 'advanced_faults'
# / 'ipc_failure_during_stream.py',
# root=examples_dir(),
# consider_namespace_packages=False,
# )
# return partial(
# trio.run,
# partial(
# mod.main,
# **runtime_args,
# )
# )
# return import_n_partial_in_args_n_triorun

View File

@ -2,9 +2,11 @@
`tractor.devx.*` tooling sub-pkg test space.
'''
from __future__ import annotations
import time
from typing import (
Callable,
TYPE_CHECKING,
)
import pytest
@ -16,7 +18,7 @@ from pexpect.spawnbase import SpawnBase
from tractor._testing import (
mk_cmd,
)
from tractor.devx._debug import (
from tractor.devx.debug import (
_pause_msg as _pause_msg,
_crash_msg as _crash_msg,
_repl_fail_msg as _repl_fail_msg,
@ -26,14 +28,22 @@ from ..conftest import (
_ci_env,
)
if TYPE_CHECKING:
from pexpect import pty_spawn
# a fn that sub-instantiates a `pexpect.spawn()`
# and returns it.
type PexpectSpawner = Callable[[str], pty_spawn.spawn]
@pytest.fixture
def spawn(
start_method,
start_method: str,
testdir: pytest.Pytester,
reg_addr: tuple[str, int],
) -> Callable[[str], None]:
) -> PexpectSpawner:
'''
Use the `pexpect` module shipped via `testdir.spawn()` to
run an `./examples/..` script by name.
@ -59,7 +69,7 @@ def spawn(
def _spawn(
cmd: str,
**mkcmd_kwargs,
):
) -> pty_spawn.spawn:
unset_colors()
return testdir.spawn(
cmd=mk_cmd(
@ -73,7 +83,7 @@ def spawn(
)
# such that test-dep can pass input script name.
return _spawn
return _spawn # the `PexpectSpawner`, type alias.
@pytest.fixture(
@ -111,7 +121,7 @@ def ctlc(
# XXX: disable pygments highlighting for auto-tests
# since some envs (like actions CI) will struggle
# the added color-char encoding..
from tractor.devx._debug import TractorConfig
from tractor.devx.debug import TractorConfig
TractorConfig.use_pygements = False
yield use_ctlc

View File

@ -1,19 +1,23 @@
"""
That "native" debug mode better work!
All these tests can be understood (somewhat) by running the equivalent
`examples/debugging/` scripts manually.
All these tests can be understood (somewhat) by running the
equivalent `examples/debugging/` scripts manually.
TODO:
- none of these tests have been run successfully on windows yet but
there's been manual testing that verified it works.
- wonder if any of it'll work on OS X?
- none of these tests have been run successfully on windows yet but
there's been manual testing that verified it works.
- wonder if any of it'll work on OS X?
"""
from __future__ import annotations
from functools import partial
import itertools
import platform
import time
from typing import (
TYPE_CHECKING,
)
import pytest
from pexpect.exceptions import (
@ -34,6 +38,9 @@ from .conftest import (
assert_before,
)
if TYPE_CHECKING:
from ..conftest import PexpectSpawner
# TODO: The next great debugger audit could be done by you!
# - recurrent entry to breakpoint() from single actor *after* and an
# error in another task?
@ -310,7 +317,6 @@ def test_subactor_breakpoint(
assert in_prompt_msg(
child, [
'MessagingError:',
'RemoteActorError:',
"('breakpoint_forever'",
'bdb.BdbQuit',
@ -528,7 +534,7 @@ def test_multi_daemon_subactors(
# now the root actor won't clobber the bp_forever child
# during it's first access to the debug lock, but will instead
# wait for the lock to release, by the edge triggered
# ``devx._debug.Lock.no_remote_has_tty`` event before sending cancel messages
# ``devx.debug.Lock.no_remote_has_tty`` event before sending cancel messages
# (via portals) to its underlings B)
# at some point here there should have been some warning msg from
@ -919,6 +925,7 @@ def test_post_mortem_api(
"<Task 'name_error'",
"NameError",
"('child'",
'getattr(doggypants)', # exc-LoC
]
)
if ctlc:
@ -935,8 +942,8 @@ def test_post_mortem_api(
"<Task '__main__.main'",
"('root'",
"NameError",
"tractor.post_mortem()",
"src_uid=('child'",
"tractor.post_mortem()", # in `main()`-LoC
]
)
if ctlc:
@ -954,6 +961,10 @@ def test_post_mortem_api(
"('root'",
"NameError",
"src_uid=('child'",
# raising line in `main()` but from crash-handling
# in `tractor.open_nursery()`.
'async with p.open_context(name_error) as (ctx, first):',
]
)
if ctlc:
@ -1063,6 +1074,136 @@ def test_shield_pause(
child.expect(EOF)
@pytest.mark.parametrize(
'quit_early', [False, True]
)
def test_ctxep_pauses_n_maybe_ipc_breaks(
spawn: PexpectSpawner,
quit_early: bool,
):
'''
Audit generator embedded `.pause()`es from within a `@context`
endpoint with a chan close at the end, requiring that ctl-c is
mashed and the zombie-reaper kills the sub with no hangs.
'''
child = spawn('subactor_bp_in_ctx')
child.expect(PROMPT)
# 3 iters for the `gen()` pause-points
for i in range(3):
assert_before(
child,
[
_pause_msg,
"('bp_boi'", # actor name
"<Task 'just_bp'", # task name
]
)
if (
i == 1
and
quit_early
):
child.sendline('q')
child.expect(PROMPT)
assert_before(
child,
["tractor._exceptions.RemoteActorError: remote task raised a 'BdbQuit'",
"bdb.BdbQuit",
"('bp_boi'",
]
)
child.sendline('c')
child.expect(EOF)
assert_before(
child,
["tractor._exceptions.RemoteActorError: remote task raised a 'BdbQuit'",
"bdb.BdbQuit",
"('bp_boi'",
]
)
break # end-of-test
child.sendline('c')
try:
child.expect(PROMPT)
except TIMEOUT:
# no prompt since we hang due to IPC chan purposely
# closed so verify we see error reporting as well as
# a failed crash-REPL request msg and can CTL-c our way
# out.
assert_before(
child,
['peer IPC channel closed abruptly?',
'another task closed this fd',
'Debug lock request was CANCELLED?',
"TransportClosed: 'MsgpackUDSStream' was already closed locally ?",]
# XXX races on whether these show/hit?
# 'Failed to REPl via `_pause()` You called `tractor.pause()` from an already cancelled scope!',
# 'AssertionError',
)
# OSc(ancel) the hanging tree
do_ctlc(
child=child,
expect_prompt=False,
)
child.expect(EOF)
assert_before(
child,
['KeyboardInterrupt'],
)
def test_crash_handling_within_cancelled_root_actor(
spawn: PexpectSpawner,
):
'''
Ensure that when only a root-actor is started via `open_root_actor()`
we can crash-handle in debug-mode despite self-cancellation.
More-or-less ensures we conditionally shield the pause in
`._root.open_root_actor()`'s `await debug._maybe_enter_pm()`
call.
'''
child = spawn('root_self_cancelled_w_error')
child.expect(PROMPT)
assert_before(
child,
[
"Actor.cancel_soon()` was called!",
"root cancelled",
_pause_msg,
"('root'", # actor name
]
)
child.sendline('c')
child.expect(PROMPT)
assert_before(
child,
[
_crash_msg,
"('root'", # actor name
"AssertionError",
"assert 0",
]
)
child.sendline('c')
child.expect(EOF)
assert_before(
child,
[
"AssertionError",
"assert 0",
]
)
# TODO: better error for "non-ideal" usage from the root actor.
# -[ ] if called from an async scope emit a message that suggests
# using `await tractor.pause()` instead since it's less overhead

View File

@ -13,9 +13,16 @@ TODO:
when debugging a problem inside the stack vs. in their app.
'''
from __future__ import annotations
from contextlib import (
contextmanager as cm,
)
import os
import signal
import time
from typing import (
TYPE_CHECKING,
)
from .conftest import (
expect,
@ -24,14 +31,19 @@ from .conftest import (
PROMPT,
_pause_msg,
)
import pytest
from pexpect.exceptions import (
# TIMEOUT,
EOF,
)
if TYPE_CHECKING:
from ..conftest import PexpectSpawner
def test_shield_pause(
spawn,
spawn: PexpectSpawner,
):
'''
Verify the `tractor.pause()/.post_mortem()` API works inside an
@ -109,9 +121,11 @@ def test_shield_pause(
child.pid,
signal.SIGINT,
)
from tractor._supervise import _shutdown_msg
expect(
child,
'Shutting down actor runtime',
# 'Shutting down actor runtime',
_shutdown_msg,
timeout=6,
)
assert_before(
@ -126,7 +140,7 @@ def test_shield_pause(
def test_breakpoint_hook_restored(
spawn,
spawn: PexpectSpawner,
):
'''
Ensures our actor runtime sets a custom `breakpoint()` hook
@ -140,16 +154,22 @@ def test_breakpoint_hook_restored(
child = spawn('restore_builtin_breakpoint')
child.expect(PROMPT)
assert_before(
child,
[
_pause_msg,
"<Task '__main__.main'",
"('root'",
"first bp, tractor hook set",
]
)
child.sendline('c')
try:
assert_before(
child,
[
_pause_msg,
"<Task '__main__.main'",
"('root'",
"first bp, tractor hook set",
]
)
# XXX if the above raises `AssertionError`, without sending
# the final 'continue' cmd to the REPL-active sub-process,
# we'll hang waiting for that pexpect instance to terminate..
finally:
child.sendline('c')
child.expect(PROMPT)
assert_before(
child,
@ -170,3 +190,117 @@ def test_breakpoint_hook_restored(
)
child.sendline('c')
child.expect(EOF)
_to_raise = Exception('Triggering a crash')
@pytest.mark.parametrize(
'to_raise',
[
None,
_to_raise,
RuntimeError('Never crash handle this!'),
],
)
@pytest.mark.parametrize(
'raise_on_exit',
[
True,
[type(_to_raise)],
False,
]
)
def test_crash_handler_cms(
debug_mode: bool,
to_raise: Exception,
raise_on_exit: bool|list[Exception],
):
'''
Verify the `.devx.open_crash_handler()` API(s) by also
(conveniently enough) testing its `repl_fixture: ContextManager`
param support which for this suite allows us to avoid use of
a `pexpect`-style-test since we use the fixture to avoid actually
entering `PdbREPL.interact()` :smirk:
'''
import tractor
# import trio
# state flags
repl_acquired: bool = False
repl_released: bool = False
@cm
def block_repl_ux(
repl: tractor.devx.debug.PdbREPL,
maybe_bxerr: (
tractor.devx._debug.BoxedMaybeException
|None
) = None,
enter_repl: bool = True,
) -> bool:
'''
Set pre/post-REPL state vars and bypass actual console
interaction.
'''
nonlocal repl_acquired, repl_released
# task: trio.Task = trio.lowlevel.current_task()
# print(f'pre-REPL active_task={task.name}')
print('pre-REPL')
repl_acquired = True
yield False # never actually .interact()
print('post-REPL')
repl_released = True
try:
# TODO, with runtime's `debug_mode` setting
# -[ ] need to open runtime tho obvi..
#
# with tractor.devx.maybe_open_crash_handler(
# pdb=True,
with tractor.devx.open_crash_handler(
raise_on_exit=raise_on_exit,
repl_fixture=block_repl_ux
) as bxerr:
if to_raise is not None:
raise to_raise
except Exception as _exc:
exc = _exc
if (
raise_on_exit is True
or
type(to_raise) in raise_on_exit
):
assert (
exc
is
to_raise
is
bxerr.value
)
else:
raise
else:
assert (
to_raise is None
or
not raise_on_exit
or
type(to_raise) not in raise_on_exit
)
assert bxerr.value is to_raise
assert bxerr.raise_on_exit == raise_on_exit
if to_raise is not None:
assert repl_acquired
assert repl_released

View File

@ -0,0 +1,4 @@
'''
`tractor.ipc` subsystem(s)/unit testing suites.
'''

View File

@ -0,0 +1,114 @@
'''
Unit-ish tests for specific IPC transport protocol backends.
'''
from __future__ import annotations
from pathlib import Path
import pytest
import trio
import tractor
from tractor import (
Actor,
_state,
_addr,
)
@pytest.fixture
def bindspace_dir_str() -> str:
rt_dir: Path = tractor._state.get_rt_dir()
bs_dir: Path = rt_dir / 'doggy'
bs_dir_str: str = str(bs_dir)
assert not bs_dir.is_dir()
yield bs_dir_str
# delete it on suite teardown.
# ?TODO? should we support this internally
# or is leaking it ok?
if bs_dir.is_dir():
bs_dir.rmdir()
def test_uds_bindspace_created_implicitly(
debug_mode: bool,
bindspace_dir_str: str,
):
registry_addr: tuple = (
f'{bindspace_dir_str}',
'registry@doggy.sock',
)
bs_dir_str: str = registry_addr[0]
# XXX, ensure bindspace-dir DNE beforehand!
assert not Path(bs_dir_str).is_dir()
async def main():
async with tractor.open_nursery(
enable_transports=['uds'],
registry_addrs=[registry_addr],
debug_mode=debug_mode,
) as _an:
# XXX MUST be created implicitly by
# `.ipc._uds.start_listener()`!
assert Path(bs_dir_str).is_dir()
root: Actor = tractor.current_actor()
assert root.is_registrar
assert registry_addr in root.reg_addrs
assert (
registry_addr
in
_state._runtime_vars['_registry_addrs']
)
assert (
_addr.wrap_address(registry_addr)
in
root.registry_addrs
)
trio.run(main)
def test_uds_double_listen_raises_connerr(
debug_mode: bool,
bindspace_dir_str: str,
):
registry_addr: tuple = (
f'{bindspace_dir_str}',
'registry@doggy.sock',
)
async def main():
async with tractor.open_nursery(
enable_transports=['uds'],
registry_addrs=[registry_addr],
debug_mode=debug_mode,
) as _an:
# runtime up
root: Actor = tractor.current_actor()
from tractor.ipc._uds import (
start_listener,
UDSAddress,
)
ya_bound_addr: UDSAddress = root.registry_addrs[0]
try:
await start_listener(
addr=ya_bound_addr,
)
except ConnectionError as connerr:
assert type(src_exc := connerr.__context__) is OSError
assert 'Address already in use' in src_exc.args
# complete, exit test.
else:
pytest.fail('It dint raise a connerr !?')
trio.run(main)

View File

@ -0,0 +1,95 @@
'''
Verify the `enable_transports` param drives various
per-root/sub-actor IPC endpoint/server settings.
'''
from __future__ import annotations
import pytest
import trio
import tractor
from tractor import (
Actor,
Portal,
ipc,
msg,
_state,
_addr,
)
@tractor.context
async def chk_tpts(
ctx: tractor.Context,
tpt_proto_key: str,
):
rtvars = _state._runtime_vars
assert (
tpt_proto_key
in
rtvars['_enable_tpts']
)
actor: Actor = tractor.current_actor()
spec: msg.types.SpawnSpec = actor._spawn_spec
assert spec._runtime_vars == rtvars
# ensure individual IPC ep-addr types
serv: ipc._server.Server = actor.ipc_server
addr: ipc._types.Address
for addr in serv.addrs:
assert addr.proto_key == tpt_proto_key
# Actor delegate-props enforcement
assert (
actor.accept_addrs
==
serv.accept_addrs
)
await ctx.started(serv.accept_addrs)
# TODO, parametrize over mis-matched-proto-typed `registry_addrs`
# since it seems to work in `piker` but not exactly sure if both tcp
# & uds are being deployed then?
#
@pytest.mark.parametrize(
'tpt_proto_key',
['tcp', 'uds'],
ids=lambda item: f'ipc_tpt={item!r}'
)
def test_root_passes_tpt_to_sub(
tpt_proto_key: str,
reg_addr: tuple,
debug_mode: bool,
):
async def main():
async with tractor.open_nursery(
enable_transports=[tpt_proto_key],
registry_addrs=[reg_addr],
debug_mode=debug_mode,
) as an:
assert (
tpt_proto_key
in
_state._runtime_vars['_enable_tpts']
)
ptl: Portal = await an.start_actor(
name='sub',
enable_modules=[__name__],
)
async with ptl.open_context(
chk_tpts,
tpt_proto_key=tpt_proto_key,
) as (ctx, accept_addrs):
uw_addr: tuple
for uw_addr in accept_addrs:
addr = _addr.wrap_address(uw_addr)
assert addr.is_valid
# shutdown sub-actor(s)
await an.cancel()
trio.run(main)

View File

@ -0,0 +1,72 @@
'''
High-level `.ipc._server` unit tests.
'''
from __future__ import annotations
import pytest
import trio
from tractor import (
devx,
ipc,
log,
)
from tractor._testing.addr import (
get_rando_addr,
)
# TODO, use/check-roundtripping with some of these wrapper types?
#
# from .._addr import Address
# from ._chan import Channel
# from ._transport import MsgTransport
# from ._uds import UDSAddress
# from ._tcp import TCPAddress
@pytest.mark.parametrize(
'_tpt_proto',
['uds', 'tcp']
)
def test_basic_ipc_server(
_tpt_proto: str,
debug_mode: bool,
loglevel: str,
):
# so we see the socket-listener reporting on console
log.get_console_log("INFO")
rando_addr: tuple = get_rando_addr(
tpt_proto=_tpt_proto,
)
async def main():
async with ipc._server.open_ipc_server() as server:
assert (
server._parent_tn
and
server._parent_tn is server._stream_handler_tn
)
assert server._no_more_peers.is_set()
eps: list[ipc._server.Endpoint] = await server.listen_on(
accept_addrs=[rando_addr],
stream_handler_nursery=None,
)
assert (
len(eps) == 1
and
(ep := eps[0])._listener
and
not ep.peer_tpts
)
server._parent_tn.cancel_scope.cancel()
# !TODO! actually make a bg-task connection from a client
# using `ipc._chan._connect_chan()`
with devx.maybe_open_crash_handler(
pdb=debug_mode,
):
trio.run(main)

View File

@ -10,6 +10,9 @@ import pytest
from _pytest.pathlib import import_path
import trio
import tractor
from tractor import (
TransportClosed,
)
from tractor._testing import (
examples_dir,
break_ipc,
@ -74,6 +77,7 @@ def test_ipc_channel_break_during_stream(
spawn_backend: str,
ipc_break: dict|None,
pre_aclose_msgstream: bool,
tpt_proto: str,
):
'''
Ensure we can have an IPC channel break its connection during
@ -91,7 +95,7 @@ def test_ipc_channel_break_during_stream(
# non-`trio` spawners should never hit the hang condition that
# requires the user to do ctl-c to cancel the actor tree.
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
expect_final_exc = TransportClosed
mod: ModuleType = import_path(
examples_dir() / 'advanced_faults'
@ -104,6 +108,8 @@ def test_ipc_channel_break_during_stream(
# period" wherein the user eventually hits ctl-c to kill the
# root-actor tree.
expect_final_exc: BaseException = KeyboardInterrupt
expect_final_cause: BaseException|None = None
if (
# only expect EoC if trans is broken on the child side,
ipc_break['break_child_ipc_after'] is not False
@ -138,6 +144,9 @@ def test_ipc_channel_break_during_stream(
# a user sending ctl-c by raising a KBI.
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
if tpt_proto == 'uds':
expect_final_exc = TransportClosed
expect_final_cause = trio.BrokenResourceError
# XXX OLD XXX
# if child calls `MsgStream.aclose()` then expect EoC.
@ -157,6 +166,10 @@ def test_ipc_channel_break_during_stream(
if pre_aclose_msgstream:
expect_final_exc = KeyboardInterrupt
if tpt_proto == 'uds':
expect_final_exc = TransportClosed
expect_final_cause = trio.BrokenResourceError
# NOTE when the parent IPC side dies (even if the child does as well
# but the child fails BEFORE the parent) we always expect the
# IPC layer to raise a closed-resource, NEVER do we expect
@ -169,8 +182,8 @@ def test_ipc_channel_break_during_stream(
and
ipc_break['break_child_ipc_after'] is False
):
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
expect_final_cause = trio.ClosedResourceError
# BOTH but, PARENT breaks FIRST
elif (
@ -181,8 +194,8 @@ def test_ipc_channel_break_during_stream(
ipc_break['break_parent_ipc_after']
)
):
# expect_final_exc = trio.ClosedResourceError
expect_final_exc = tractor.TransportClosed
expect_final_cause = trio.ClosedResourceError
with pytest.raises(
expected_exception=(
@ -198,6 +211,7 @@ def test_ipc_channel_break_during_stream(
start_method=spawn_backend,
loglevel=loglevel,
pre_close=pre_aclose_msgstream,
tpt_proto=tpt_proto,
**ipc_break,
)
)
@ -220,10 +234,15 @@ def test_ipc_channel_break_during_stream(
)
cause: Exception = tc.__cause__
assert (
type(cause) is trio.ClosedResourceError
and
cause.args[0] == 'another task closed this fd'
# type(cause) is trio.ClosedResourceError
type(cause) is expect_final_cause
# TODO, should we expect a certain exc-message (per
# tpt) as well??
# and
# cause.args[0] == 'another task closed this fd'
)
raise
# get raw instance from pytest wrapper

View File

@ -313,9 +313,8 @@ async def inf_streamer(
# `trio.EndOfChannel` doesn't propagate directly to the above
# .open_stream() parent, resulting in it also raising instead
# of gracefully absorbing as normal.. so how to handle?
trio.open_nursery(
strict_exception_groups=False,
) as tn,
tractor.trionics.collapse_eg(),
trio.open_nursery() as tn,
):
async def close_stream_on_sentinel():
async for msg in stream:

View File

@ -236,7 +236,10 @@ async def stream_forever():
async def test_cancel_infinite_streamer(start_method):
# stream for at most 1 seconds
with trio.move_on_after(1) as cancel_scope:
with (
trio.fail_after(4),
trio.move_on_after(1) as cancel_scope
):
async with tractor.open_nursery() as n:
portal = await n.start_actor(
'donny',
@ -284,20 +287,32 @@ async def test_cancel_infinite_streamer(start_method):
],
)
@tractor_test
async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
"""Verify a subset of failed subactors causes all others in
async def test_some_cancels_all(
num_actors_and_errs: tuple,
start_method: str,
loglevel: str,
):
'''
Verify a subset of failed subactors causes all others in
the nursery to be cancelled just like the strategy in trio.
This is the first and only supervisory strategy at the moment.
"""
num_actors, first_err, err_type, ria_func, da_func = num_actors_and_errs
'''
(
num_actors,
first_err,
err_type,
ria_func,
da_func,
) = num_actors_and_errs
try:
async with tractor.open_nursery() as n:
async with tractor.open_nursery() as an:
# spawn the same number of deamon actors which should be cancelled
dactor_portals = []
for i in range(num_actors):
dactor_portals.append(await n.start_actor(
dactor_portals.append(await an.start_actor(
f'deamon_{i}',
enable_modules=[__name__],
))
@ -307,7 +322,7 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
for i in range(num_actors):
# start actor(s) that will fail immediately
riactor_portals.append(
await n.run_in_actor(
await an.run_in_actor(
func,
name=f'actor_{i}',
**kwargs
@ -337,7 +352,8 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
# should error here with a ``RemoteActorError`` or ``MultiError``
except first_err as err:
except first_err as _err:
err = _err
if isinstance(err, BaseExceptionGroup):
assert len(err.exceptions) == num_actors
for exc in err.exceptions:
@ -348,8 +364,8 @@ async def test_some_cancels_all(num_actors_and_errs, start_method, loglevel):
elif isinstance(err, tractor.RemoteActorError):
assert err.boxed_type == err_type
assert n.cancelled is True
assert not n._children
assert an.cancelled is True
assert not an._children
else:
pytest.fail("Should have gotten a remote assertion error?")
@ -519,10 +535,15 @@ def test_cancel_via_SIGINT_other_task(
async def main():
# should never timeout since SIGINT should cancel the current program
with trio.fail_after(timeout):
async with trio.open_nursery(
strict_exception_groups=False,
) as n:
await n.start(spawn_and_sleep_forever)
async with (
# XXX ?TODO? why no work!?
# tractor.trionics.collapse_eg(),
trio.open_nursery(
strict_exception_groups=False,
) as tn,
):
await tn.start(spawn_and_sleep_forever)
if 'mp' in spawn_backend:
time.sleep(0.1)
os.kill(pid, signal.SIGINT)
@ -533,38 +554,123 @@ def test_cancel_via_SIGINT_other_task(
async def spin_for(period=3):
"Sync sleep."
print(f'sync sleeping in sub-sub for {period}\n')
time.sleep(period)
async def spawn():
async with tractor.open_nursery() as tn:
await tn.run_in_actor(
async def spawn_sub_with_sync_blocking_task():
async with tractor.open_nursery() as an:
print('starting sync blocking subactor..\n')
await an.run_in_actor(
spin_for,
name='sleeper',
)
print('exiting first subactor layer..\n')
@pytest.mark.parametrize(
'man_cancel_outer',
[
False, # passes if delay != 2
# always causes an unexpected eg-w-embedded-assert-err?
pytest.param(True,
marks=pytest.mark.xfail(
reason=(
'always causes an unexpected eg-w-embedded-assert-err?'
)
),
),
],
)
@no_windows
def test_cancel_while_childs_child_in_sync_sleep(
loglevel,
start_method,
spawn_backend,
loglevel: str,
start_method: str,
spawn_backend: str,
debug_mode: bool,
reg_addr: tuple,
man_cancel_outer: bool,
):
"""Verify that a child cancelled while executing sync code is torn
'''
Verify that a child cancelled while executing sync code is torn
down even when that cancellation is triggered by the parent
2 nurseries "up".
"""
Though the grandchild should stay blocking its actor runtime, its
parent should issue a "zombie reaper" to hard kill it after
sufficient timeout.
'''
if start_method == 'forkserver':
pytest.skip("Forksever sux hard at resuming from sync sleep...")
async def main():
with trio.fail_after(2):
async with tractor.open_nursery() as tn:
await tn.run_in_actor(
spawn,
name='spawn',
#
# XXX BIG TODO NOTE XXX
#
# it seems there's a strange race that can happen
# where the fail-after will trigger outer scope
# .cancel() which then causes the inner scope to raise,
#
# BaseExceptionGroup('Exceptions from Trio nursery', [
# BaseExceptionGroup('Exceptions from Trio nursery',
# [
# Cancelled(),
# Cancelled(),
# ]
# ),
# AssertionError('assert 0')
# ])
#
# WHY THIS DOESN'T MAKE SENSE:
# ---------------------------
# - it should raise too-slow-error when too slow..
# * verified that using simple-cs and manually cancelling
# you get same outcome -> indicates that the fail-after
# can have its TooSlowError overridden!
# |_ to check this it's easy, simply decrease the timeout
# as per the var below.
#
# - when using the manual simple-cs the outcome is different
# DESPITE the `assert 0` which means regardless of the
# inner scope effectively failing in the same way, the
# bubbling up **is NOT the same**.
#
# delays trigger diff outcomes..
# ---------------------------
# as seen by uncommenting various lines below there is from
# my POV an unexpected outcome due to the delay=2 case.
#
# delay = 1 # no AssertionError in eg, TooSlowError raised.
# delay = 2 # is AssertionError in eg AND no TooSlowError !?
delay = 4 # is AssertionError in eg AND no _cs cancellation.
with trio.fail_after(delay) as _cs:
# with trio.CancelScope() as cs:
# ^XXX^ can be used instead to see same outcome.
async with (
# tractor.trionics.collapse_eg(), # doesn't help
tractor.open_nursery(
hide_tb=False,
debug_mode=debug_mode,
registry_addrs=[reg_addr],
) as an,
):
await an.run_in_actor(
spawn_sub_with_sync_blocking_task,
name='sync_blocking_sub',
)
await trio.sleep(1)
if man_cancel_outer:
print('Cancelling manually in root')
_cs.cancel()
# trigger exc-srced taskc down
# the actor tree.
print('RAISING IN ROOT')
assert 0
with pytest.raises(AssertionError):

View File

@ -117,9 +117,10 @@ async def open_actor_local_nursery(
ctx: tractor.Context,
):
global _nursery
async with trio.open_nursery(
strict_exception_groups=False,
) as tn:
async with (
tractor.trionics.collapse_eg(),
trio.open_nursery() as tn
):
_nursery = tn
await ctx.started()
await trio.sleep(10)

View File

@ -13,26 +13,24 @@ MESSAGE = 'tractoring at full speed'
def test_empty_mngrs_input_raises() -> None:
async def main():
with trio.fail_after(1):
with trio.fail_after(3):
async with (
open_actor_cluster(
modules=[__name__],
# NOTE: ensure we can passthrough runtime opts
loglevel='info',
# debug_mode=True,
loglevel='cancel',
debug_mode=False,
) as portals,
gather_contexts(
# NOTE: it's the use of inline-generator syntax
# here that causes the empty input.
mngrs=(
p.open_context(worker) for p in portals.values()
),
),
gather_contexts(mngrs=()),
):
assert 0
# should fail before this?
assert portals
# test should fail if we mk it here!
assert 0, 'Should have raised val-err !?'
with pytest.raises(ValueError):
trio.run(main)

View File

@ -252,7 +252,7 @@ def test_simple_context(
pass
except BaseExceptionGroup as beg:
# XXX: on windows it seems we may have to expect the group error
from tractor._exceptions import is_multi_cancelled
from tractor.trionics import is_multi_cancelled
assert is_multi_cancelled(beg)
else:
trio.run(main)

View File

@ -7,8 +7,11 @@ import platform
from functools import partial
import itertools
import psutil
import pytest
import subprocess
import tractor
from tractor.trionics import collapse_eg
from tractor._testing import tractor_test
import trio
@ -26,7 +29,7 @@ async def test_reg_then_unreg(reg_addr):
portal = await n.start_actor('actor', enable_modules=[__name__])
uid = portal.channel.uid
async with tractor.get_registry(*reg_addr) as aportal:
async with tractor.get_registry(reg_addr) as aportal:
# this local actor should be the arbiter
assert actor is aportal.actor
@ -152,15 +155,25 @@ async def unpack_reg(actor_or_portal):
async def spawn_and_check_registry(
reg_addr: tuple,
use_signal: bool,
debug_mode: bool = False,
remote_arbiter: bool = False,
with_streaming: bool = False,
maybe_daemon: tuple[
subprocess.Popen,
psutil.Process,
]|None = None,
) -> None:
if maybe_daemon:
popen, proc = maybe_daemon
# breakpoint()
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
debug_mode=debug_mode,
):
async with tractor.get_registry(*reg_addr) as portal:
async with tractor.get_registry(reg_addr) as portal:
# runtime needs to be up to call this
actor = tractor.current_actor()
@ -176,30 +189,30 @@ async def spawn_and_check_registry(
extra = 2 # local root actor + remote arbiter
# ensure current actor is registered
registry = await get_reg()
registry: dict = await get_reg()
assert actor.uid in registry
try:
async with tractor.open_nursery() as n:
async with trio.open_nursery(
strict_exception_groups=False,
) as trion:
async with tractor.open_nursery() as an:
async with (
collapse_eg(),
trio.open_nursery() as trion,
):
portals = {}
for i in range(3):
name = f'a{i}'
if with_streaming:
portals[name] = await n.start_actor(
portals[name] = await an.start_actor(
name=name, enable_modules=[__name__])
else: # no streaming
portals[name] = await n.run_in_actor(
portals[name] = await an.run_in_actor(
trio.sleep_forever, name=name)
# wait on last actor to come up
async with tractor.wait_for_actor(name):
registry = await get_reg()
for uid in n._children:
for uid in an._children:
assert uid in registry
assert len(portals) + extra == len(registry)
@ -232,6 +245,7 @@ async def spawn_and_check_registry(
@pytest.mark.parametrize('use_signal', [False, True])
@pytest.mark.parametrize('with_streaming', [False, True])
def test_subactors_unregister_on_cancel(
debug_mode: bool,
start_method,
use_signal,
reg_addr,
@ -248,6 +262,7 @@ def test_subactors_unregister_on_cancel(
spawn_and_check_registry,
reg_addr,
use_signal,
debug_mode=debug_mode,
remote_arbiter=False,
with_streaming=with_streaming,
),
@ -257,7 +272,8 @@ def test_subactors_unregister_on_cancel(
@pytest.mark.parametrize('use_signal', [False, True])
@pytest.mark.parametrize('with_streaming', [False, True])
def test_subactors_unregister_on_cancel_remote_daemon(
daemon,
daemon: subprocess.Popen,
debug_mode: bool,
start_method,
use_signal,
reg_addr,
@ -273,8 +289,13 @@ def test_subactors_unregister_on_cancel_remote_daemon(
spawn_and_check_registry,
reg_addr,
use_signal,
debug_mode=debug_mode,
remote_arbiter=True,
with_streaming=with_streaming,
maybe_daemon=(
daemon,
psutil.Process(daemon.pid)
),
),
)
@ -300,7 +321,7 @@ async def close_chans_before_nursery(
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
):
async with tractor.get_registry(*reg_addr) as aportal:
async with tractor.get_registry(reg_addr) as aportal:
try:
get_reg = partial(unpack_reg, aportal)
@ -318,11 +339,12 @@ async def close_chans_before_nursery(
async with portal2.open_stream_from(
stream_forever
) as agen2:
async with trio.open_nursery(
strict_exception_groups=False,
) as n:
n.start_soon(streamer, agen1)
n.start_soon(cancel, use_signal, .5)
async with (
collapse_eg(),
trio.open_nursery() as tn,
):
tn.start_soon(streamer, agen1)
tn.start_soon(cancel, use_signal, .5)
try:
await streamer(agen2)
finally:
@ -373,7 +395,7 @@ def test_close_channel_explicit(
@pytest.mark.parametrize('use_signal', [False, True])
def test_close_channel_explicit_remote_arbiter(
daemon,
daemon: subprocess.Popen,
start_method,
use_signal,
reg_addr,

View File

@ -66,6 +66,9 @@ def run_example_in_subproc(
# due to backpressure!!!
proc = testdir.popen(
cmdargs,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
**kwargs,
)
assert not proc.returncode
@ -119,10 +122,14 @@ def test_example(
code = ex.read()
with run_example_in_subproc(code) as proc:
proc.wait()
err, _ = proc.stderr.read(), proc.stdout.read()
# print(f'STDERR: {err}')
# print(f'STDOUT: {out}')
err = None
try:
if not proc.poll():
_, err = proc.communicate(timeout=15)
except subprocess.TimeoutExpired as e:
proc.kill()
err = e.stderr
# if we get some gnarly output let's aggregate and raise
if err:
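For context on the swap above from direct pipe `.read()` calls to `proc.communicate(timeout=15)`: reading one pipe at a time can deadlock once the child fills the other pipe's OS buffer, while `.communicate()` drains both streams concurrently. A small standalone demo (plain stdlib, not from this diff):

    import subprocess
    import sys

    # child writes a large blob to stderr only
    proc = subprocess.Popen(
        [sys.executable, '-c', "import sys; sys.stderr.write('x' * 1_000_000)"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # drains stdout and stderr together so neither pipe can fill and block
    out, err = proc.communicate(timeout=15)
    assert len(err) == 1_000_000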

View File

@ -234,10 +234,8 @@ async def trio_ctx(
with trio.fail_after(1 + delay):
try:
async with (
trio.open_nursery(
# TODO, for new `trio` / py3.13
# strict_exception_groups=False,
) as tn,
tractor.trionics.collapse_eg(),
trio.open_nursery() as tn,
tractor.to_asyncio.open_channel_from(
sleep_and_err,
) as (first, chan),
@ -573,14 +571,16 @@ def test_basic_interloop_channel_stream(
fan_out: bool,
):
async def main():
async with tractor.open_nursery() as an:
portal = await an.run_in_actor(
stream_from_aio,
infect_asyncio=True,
fan_out=fan_out,
)
# should raise RAE directly
await portal.result()
# TODO, figure out min timeout here!
with trio.fail_after(6):
async with tractor.open_nursery() as an:
portal = await an.run_in_actor(
stream_from_aio,
infect_asyncio=True,
fan_out=fan_out,
)
# should raise RAE directly
await portal.result()
trio.run(main)
@ -889,7 +889,7 @@ async def manage_file(
# NOTE: turns out you don't even need to sched an aio task
# since the original issue, even though seemingly was due to
# the guest-run being abandoned + a `._debug.pause()` inside
# the guest-run being abandoned + a `.debug.pause()` inside
# `._runtime._async_main()` (which was originally trying to
# debug the `.lifetime_stack` not closing), IS NOT actually
# the core issue?
@ -1088,6 +1088,108 @@ def test_sigint_closes_lifetime_stack(
trio.run(main)
# ?TODO asyncio.Task fn-deco?
# -[ ] do sig checking at import time like @context?
# -[ ] maybe name it @aio_task ??
# -[ ] chan: to_asyncio.InterloopChannel ??
async def raise_before_started(
# from_trio: asyncio.Queue,
# to_trio: trio.abc.SendChannel,
chan: to_asyncio.LinkedTaskChannel,
) -> None:
'''
`asyncio.Task` entry point which RTEs before calling
`to_trio.send_nowait()`.
'''
await asyncio.sleep(0.2)
raise RuntimeError('Some shite went wrong before `.send_nowait()`!!')
# to_trio.send_nowait('Uhh we shouldve RTE-d ^^ ??')
chan.started_nowait('Uhh we shouldve RTE-d ^^ ??')
await asyncio.sleep(float('inf'))
@tractor.context
async def caching_ep(
ctx: tractor.Context,
):
log = tractor.log.get_logger('caching_ep')
log.info('syncing via `ctx.started()`')
await ctx.started()
# XXX, allocate the `open_channel_from()` inside
# a `.trionics.maybe_open_context()`.
chan: to_asyncio.LinkedTaskChannel
async with (
tractor.trionics.maybe_open_context(
acm_func=tractor.to_asyncio.open_channel_from,
kwargs={
'target': raise_before_started,
# ^XXX, kwarg to `open_channel_from()`
},
# lock around current actor task access
key=tractor.current_actor().uid,
) as (cache_hit, (clients, chan)),
):
if cache_hit:
log.error(
'Re-using cached `.open_channel_from()` call!\n'
)
else:
log.info(
'Allocating SHOULD-FAIL `.open_channel_from()`\n'
)
await trio.sleep_forever()
def test_aio_side_raises_before_started(
reg_addr: tuple[str, int],
debug_mode: bool,
loglevel: str,
):
'''
Simulates connection-err from `piker.brokers.ib.api`..
Ensure any error raised by the child-`asyncio.Task` BEFORE
`chan.started()` is bubbled up to the parent `trio` side.
'''
# delay = 999 if debug_mode else 1
async def main():
with trio.fail_after(3):
an: tractor.ActorNursery
async with tractor.open_nursery(
debug_mode=debug_mode,
loglevel=loglevel,
) as an:
p: tractor.Portal = await an.start_actor(
'lchan_cacher_that_raises_fast',
enable_modules=[__name__],
infect_asyncio=True,
)
async with p.open_context(
caching_ep,
) as (ctx, first):
assert not first
with pytest.raises(
expected_exception=(RemoteActorError),
) as excinfo:
trio.run(main)
# ensure `asyncio.Task` exception is bubbled
# allll the way erp!!
rae = excinfo.value
assert rae.boxed_type is RuntimeError
# TODO: debug_mode tests once we get support for `asyncio`!
#
# -[ ] need tests to wrap both scripts:
@ -1101,7 +1203,7 @@ def test_sigint_closes_lifetime_stack(
# => completed using `.bestow_portal(task)` inside
# `.to_asyncio._run_asyncio_task()` right?
# -[ ] translation func to get from `asyncio` task calling to
# `._debug.wait_for_parent_stdin_hijack()` which does root
# `.debug.wait_for_parent_stdin_hijack()` which does root
# call to do TTY locking.
#
def test_sync_breakpoint():

View File

@ -410,7 +410,6 @@ def test_peer_canceller(
'''
async def main():
async with tractor.open_nursery(
# NOTE: to halt the peer tasks on ctxc, uncomment this.
debug_mode=debug_mode,
) as an:
canceller: Portal = await an.start_actor(
@ -871,7 +870,7 @@ async def serve_subactors(
)
await ipc.send((
peer.chan.uid,
peer.chan.raddr,
peer.chan.raddr.unwrap(),
))
print('Spawner exiting spawn serve loop!')

View File

@ -235,10 +235,16 @@ async def cancel_after(wait, reg_addr):
@pytest.fixture(scope='module')
def time_quad_ex(reg_addr, ci_env, spawn_backend):
def time_quad_ex(
reg_addr: tuple,
ci_env: bool,
spawn_backend: str,
):
if spawn_backend == 'mp':
"""no idea but the mp *nix runs are flaking out here often...
"""
'''
no idea but the mp *nix runs are flaking out here often...
'''
pytest.skip("Test is too flaky on mp in CI")
timeout = 7 if platform.system() in ('Windows', 'Darwin') else 4
@ -249,12 +255,24 @@ def time_quad_ex(reg_addr, ci_env, spawn_backend):
return results, diff
def test_a_quadruple_example(time_quad_ex, ci_env, spawn_backend):
"""This also serves as a kind of "we'd like to be this fast test"."""
def test_a_quadruple_example(
time_quad_ex: tuple,
ci_env: bool,
spawn_backend: str,
):
'''
This also serves as a kind of "we'd like to be this fast test".
'''
results, diff = time_quad_ex
assert results
this_fast = 6 if platform.system() in ('Windows', 'Darwin') else 3
this_fast = (
6 if platform.system() in (
'Windows',
'Darwin',
)
else 3
)
assert diff < this_fast

View File

@ -38,7 +38,7 @@ async def test_self_is_registered_localportal(reg_addr):
"Verify waiting on the arbiter to register itself using a local portal."
actor = tractor.current_actor()
assert actor.is_arbiter
async with tractor.get_registry(*reg_addr) as portal:
async with tractor.get_registry(reg_addr) as portal:
assert isinstance(portal, tractor._portal.LocalPortal)
with trio.fail_after(0.2):

View File

@ -32,7 +32,7 @@ def test_abort_on_sigint(daemon):
@tractor_test
async def test_cancel_remote_arbiter(daemon, reg_addr):
assert not tractor.current_actor().is_arbiter
async with tractor.get_registry(*reg_addr) as portal:
async with tractor.get_registry(reg_addr) as portal:
await portal.cancel_actor()
time.sleep(0.1)
@ -41,7 +41,7 @@ async def test_cancel_remote_arbiter(daemon, reg_addr):
# no arbiter socket should exist
with pytest.raises(OSError):
async with tractor.get_registry(*reg_addr) as portal:
async with tractor.get_registry(reg_addr) as portal:
pass

View File

@ -0,0 +1,237 @@
'''
Special case testing for issues not (dis)covered in the primary
`Context` related functional/scenario suites.
**NOTE: this mod is a WIP** space for handling
odd/rare/undiscovered/not-yet-revealed faults which either
loudly (ideal case) break our supervision protocol
or (worst case) result in distributed sys hangs.
Suites here further try to clarify (if [partially] ill-defined) and
verify our edge case semantics for inter-actor-relayed-exceptions
including,
- lowlevel: what remote obj-data is interchanged over IPC and what
native-obj form is expected from unpacking in the new
mem-domain.
- which kinds of `RemoteActorError` (and its derivs) are expected by which
(types of) peers (parent, child, sibling, etc) with what
particular meta-data set such as,
- `.src_uid`: the original (maybe) peer who raised.
- `.relay_uid`: the next-hop-peer who sent it.
- `.relay_path`: the sequence of peer actor hops.
- `.is_inception`: a predicate that denotes multi-hop remote errors.
- when should `ExceptionGroup`s be relayed from a particular
remote endpoint, they should never be caused by implicit `._rpc`
nursery machinery!
- various special `trio` edge cases around its cancellation semantics
and how we (currently) leverage `trio.Cancelled` as a signal for
whether a `Context` task should raise `ContextCancelled` (ctx).
'''
import pytest
import trio
import tractor
from tractor import ( # typing
ActorNursery,
Portal,
Context,
ContextCancelled,
)
@tractor.context
async def sleep_n_chkpt_in_finally(
ctx: Context,
sleep_n_raise: bool,
chld_raise_delay: float,
chld_finally_delay: float,
rent_cancels: bool,
rent_ctxc_delay: float,
expect_exc: str|None = None,
) -> None:
'''
Sync, open a tn, then wait for cancel, run a chkpt inside
the user's `finally:` teardown.
This covers a footgun case that `trio` core doesn't seem to care about
wherein an exc can be masked by a `trio.Cancelled` raised inside a tn embedded
`finally:`.
Also see `test_trioisms::test_acm_embedded_nursery_propagates_enter_err`
for the down and gritty details.
Since a `@context` endpoint fn can also contain code like this,
**and** bc we currently have no easy way other than
`trio.Cancelled` to signal cancellation on each side of an IPC `Context`,
the footgun issue can compound itself as demonstrated in this suite..
Here are some edge cases codified with our WIP "sclang" syntax
(note the parent(rent)/child(chld) naming here is just
pragmatism, generally most of these cases can occur
regardless of the distributed-task's supervision hierarchy),
- rent c)=> chld.raises-then-taskc-in-finally
|_ chld's body raises an `exc: BaseException`.
_ in its `finally:` block it runs a chkpoint
which raises a taskc (`trio.Cancelled`) which
masks `exc`, instead raising the taskc up to the first tn.
_ the embedded/chld tn captures the masking taskc and then
raises it up to the ._rpc-ep-tn instead of `exc`.
_ the rent thinks the child ctxc-ed instead of errored..
'''
await ctx.started()
if expect_exc:
expect_exc: BaseException = tractor._exceptions.get_err_type(
type_name=expect_exc,
)
berr: BaseException|None = None
try:
if not sleep_n_raise:
await trio.sleep_forever()
elif sleep_n_raise:
# XXX this sleep is less then the sleep the parent
# does before calling `ctx.cancel()`
await trio.sleep(chld_raise_delay)
# XXX this will be masked by a taskc raised in
# the `finally:` if this fn doesn't terminate
# before any ctxc-req arrives AND a checkpoint is hit
# in that `finally:`.
raise RuntimeError('my app krurshed..')
except BaseException as _berr:
berr = _berr
# TODO: it'd sure be nice to be able to inject our own
# `ContextCancelled` here instead of `trio.Cancelled`
# so that our runtime can expect it and this "user code"
# would be able to tell the diff between a generic trio
# cancel and a tractor runtime-IPC cancel.
if expect_exc:
if not isinstance(
berr,
expect_exc,
):
raise ValueError(
f'Unexpected exc type ??\n'
f'{berr!r}\n'
f'\n'
f'Expected a {expect_exc!r}\n'
)
raise berr
# simulate what user code might try even though
# it's a known boo-boo..
finally:
# maybe wait for rent ctxc to arrive
with trio.CancelScope(shield=True):
await trio.sleep(chld_finally_delay)
# !!XXX this will raise `trio.Cancelled` which
# will mask the RTE from above!!!
#
# YES, it's the same case as our extant
# `test_trioisms::test_acm_embedded_nursery_propagates_enter_err`
try:
await trio.lowlevel.checkpoint()
except trio.Cancelled as taskc:
if (scope_err := taskc.__context__):
print(
f'XXX MASKED REMOTE ERROR XXX\n'
f'ENDPOINT exception -> {scope_err!r}\n'
f'will be masked by -> {taskc!r}\n'
)
# await tractor.pause(shield=True)
raise taskc
@pytest.mark.parametrize(
'chld_callspec',
[
dict(
sleep_n_raise=None,
chld_raise_delay=0.1,
chld_finally_delay=0.1,
expect_exc='Cancelled',
rent_cancels=True,
rent_ctxc_delay=0.1,
),
dict(
sleep_n_raise='RuntimeError',
chld_raise_delay=0.1,
chld_finally_delay=1,
expect_exc='RuntimeError',
rent_cancels=False,
rent_ctxc_delay=0.1,
),
],
ids=lambda item: f'chld_callspec={item!r}'
)
def test_unmasked_remote_exc(
debug_mode: bool,
chld_callspec: dict,
tpt_proto: str,
):
expect_exc_str: str|None = chld_callspec['sleep_n_raise']
rent_ctxc_delay: float|None = chld_callspec['rent_ctxc_delay']
async def main():
an: ActorNursery
async with tractor.open_nursery(
debug_mode=debug_mode,
enable_transports=[tpt_proto],
) as an:
ptl: Portal = await an.start_actor(
'cancellee',
enable_modules=[__name__],
)
ctx: Context
async with (
ptl.open_context(
sleep_n_chkpt_in_finally,
**chld_callspec,
) as (ctx, sent),
):
assert not sent
await trio.sleep(rent_ctxc_delay)
await ctx.cancel()
# recv error or result from chld
ctxc: ContextCancelled = await ctx.wait_for_result()
assert (
ctxc is ctx.outcome
and
isinstance(ctxc, ContextCancelled)
)
# always graceful terminate the sub in non-error cases
await an.cancel()
if expect_exc_str:
expect_exc: BaseException = tractor._exceptions.get_err_type(
type_name=expect_exc_str,
)
with pytest.raises(
expected_exception=tractor.RemoteActorError,
) as excinfo:
trio.run(main)
rae = excinfo.value
assert expect_exc == rae.boxed_type
else:
trio.run(main)
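The masking footgun described in the docstring above can be reproduced without any `tractor` machinery; a minimal standalone sketch (illustrative, not part of this changeset) where a checkpoint inside a `finally:` swallows the in-flight error entirely:

    import trio

    async def crasher(cs: trio.CancelScope):
        try:
            raise RuntimeError('my app krurshed..')
        finally:
            cs.cancel()  # simulate a cancel request arriving mid-teardown
            # raises `trio.Cancelled` which replaces the in-flight RTE,
            # demoting it to the new exception's `__context__`
            await trio.lowlevel.checkpoint()

    async def main():
        with trio.CancelScope() as cs:
            await crasher(cs)
        # the scope absorbs its own `Cancelled`; the RTE never surfaces
        assert cs.cancelled_caught

    trio.run(main)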

View File

@ -1,5 +1,6 @@
'''
Async context manager cache api testing: ``trionics.maybe_open_context():``
Suites for our `.trionics.maybe_open_context()` multi-task
shared-cached `@acm` API.
'''
from contextlib import asynccontextmanager as acm
@ -9,6 +10,15 @@ from typing import Awaitable
import pytest
import trio
import tractor
from tractor.trionics import (
maybe_open_context,
)
from tractor.log import (
get_console_log,
get_logger,
)
log = get_logger(__name__)
_resource: int = 0
@ -52,7 +62,7 @@ def test_resource_only_entered_once(key_on):
# different task names per task will be used
kwargs = {'task_name': name}
async with tractor.trionics.maybe_open_context(
async with maybe_open_context(
maybe_increment_counter,
kwargs=kwargs,
key=key,
@ -72,11 +82,13 @@ def test_resource_only_entered_once(key_on):
with trio.move_on_after(0.5):
async with (
tractor.open_root_actor(),
trio.open_nursery() as n,
trio.open_nursery() as tn,
):
for i in range(10):
n.start_soon(enter_cached_mngr, f'task_{i}')
tn.start_soon(
enter_cached_mngr,
f'task_{i}',
)
await trio.sleep(0.001)
trio.run(main)
@ -98,27 +110,55 @@ async def streamer(
@acm
async def open_stream() -> Awaitable[tractor.MsgStream]:
async def open_stream() -> Awaitable[
tuple[
tractor.ActorNursery,
tractor.MsgStream,
]
]:
try:
async with tractor.open_nursery() as an:
portal = await an.start_actor(
'streamer',
enable_modules=[__name__],
)
try:
async with (
portal.open_context(streamer) as (ctx, first),
ctx.open_stream() as stream,
):
print('Entered open_stream() caller')
yield an, stream
print('Exited open_stream() caller')
async with tractor.open_nursery() as tn:
portal = await tn.start_actor('streamer', enable_modules=[__name__])
async with (
portal.open_context(streamer) as (ctx, first),
ctx.open_stream() as stream,
):
yield stream
finally:
print(
'Cancelling streamer with,\n'
'=> `Portal.cancel_actor()`'
)
await portal.cancel_actor()
print('Cancelled streamer')
await portal.cancel_actor()
print('CANCELLED STREAMER')
except Exception as err:
print(
f'`open_stream()` errored?\n'
f'{err!r}\n'
)
await tractor.pause(shield=True)
raise err
@acm
async def maybe_open_stream(taskname: str):
async with tractor.trionics.maybe_open_context(
async with maybe_open_context(
# NOTE: all secondary tasks should cache hit on the same key
acm_func=open_stream,
) as (cache_hit, stream):
) as (
cache_hit,
(an, stream)
):
# when the actor + portal + ctx + stream has already been
# allocated we want to just bcast to this task.
if cache_hit:
print(f'{taskname} loaded from cache')
@ -126,27 +166,77 @@ async def maybe_open_stream(taskname: str):
# if this feed is already allocated by the first
# task that entered
async with stream.subscribe() as bstream:
yield bstream
yield an, bstream
print(
f'cached task exited\n'
f')>\n'
f' |_{taskname}\n'
)
# we should always unreg the "cloned" bcrc for this
# consumer-task
assert id(bstream) not in bstream._state.subs
else:
# yield the actual stream
yield stream
try:
yield an, stream
finally:
print(
f'NON-cached task exited\n'
f')>\n'
f' |_{taskname}\n'
)
first_bstream = stream._broadcaster
bcrx_state = first_bstream._state
subs: dict[int, int] = bcrx_state.subs
if len(subs) == 1:
assert id(first_bstream) in subs
# ^^TODO! the bcrx should always de-allocate all subs,
# including the implicit first one allocated on entry
# by the first subscribing peer task, no?
#
# -[ ] adjust `MsgStream.subscribe()` to do this mgmt!
# |_ allows reverting `MsgStream.receive()` to the
# non-bcaster method.
# |_ we can decide whether to reset `._broadcaster`?
#
# await tractor.pause(shield=True)
def test_open_local_sub_to_stream():
def test_open_local_sub_to_stream(
debug_mode: bool,
):
'''
Verify a single inter-actor stream can be fanned-out shared to
N local tasks using ``trionics.maybe_open_context():``.
N local tasks using `trionics.maybe_open_context()`.
'''
timeout: float = 3.6 if platform.system() != "Windows" else 10
timeout: float = 3.6
if platform.system() == "Windows":
timeout: float = 10
if debug_mode:
timeout = 999
print(f'IN debug_mode, setting large timeout={timeout!r}..')
async def main():
full = list(range(1000))
an: tractor.ActorNursery|None = None
num_tasks: int = 10
async def get_sub_and_pull(taskname: str):
nonlocal an
stream: tractor.MsgStream
async with (
maybe_open_stream(taskname) as stream,
maybe_open_stream(taskname) as (
an,
stream,
),
):
if '0' in taskname:
assert isinstance(stream, tractor.MsgStream)
@ -158,24 +248,159 @@ def test_open_local_sub_to_stream():
first = await stream.receive()
print(f'{taskname} started with value {first}')
seq = []
seq: list[int] = []
async for msg in stream:
seq.append(msg)
assert set(seq).issubset(set(full))
# end of @acm block
print(f'{taskname} finished')
with trio.fail_after(timeout):
root: tractor.Actor
with trio.fail_after(timeout) as cs:
# TODO: turns out this isn't multi-task entrant XD
# We probably need an idempotent entry semantic?
async with tractor.open_root_actor():
async with tractor.open_root_actor(
debug_mode=debug_mode,
# maybe_enable_greenback=True,
#
# ^TODO? doesn't seem to mk breakpoint() usage work
# bc each bg task needs to open a portal??
# - [ ] we should consider making this part of
# our taskman defaults?
# |_see https://github.com/goodboy/tractor/pull/363
#
) as root:
assert root.is_registrar
async with (
trio.open_nursery() as nurse,
trio.open_nursery() as tn,
):
for i in range(10):
nurse.start_soon(get_sub_and_pull, f'task_{i}')
for i in range(num_tasks):
tn.start_soon(
get_sub_and_pull,
f'task_{i}',
)
await trio.sleep(0.001)
print('all consumer tasks finished')
print('all consumer tasks finished!')
# ?XXX, ensure actor-nursery is shutdown or we might
# hang here due to a minor task deadlock/race-condition?
#
# - seems that all we need is a checkpoint to ensure
# the last suspended task, which is inside
# `.maybe_open_context()`, can do the
# `Portal.cancel_actor()` call?
#
# - if that bg task isn't resumed, then this block's
# timeout might hit before that?
#
if root.ipc_server.has_peers():
await trio.lowlevel.checkpoint()
# alt approach, cancel the entire `an`
# await tractor.pause()
# await an.cancel()
# end of runtime scope
print('root actor terminated.')
if cs.cancelled_caught:
pytest.fail(
'Should NOT time out in `open_root_actor()` ?'
)
print('exiting main.')
trio.run(main)
@acm
async def cancel_outer_cs(
cs: trio.CancelScope|None = None,
delay: float = 0,
):
# on the first task, delay long enough to block
# the 2nd task, then cancel it mid-sleep
# so that the tn.start() inside the key-err handler block
# is cancelled; previously this would corrupt the
# mutex state.
log.info(f'task entering sleep({delay})')
await trio.sleep(delay)
if cs:
log.info('task calling cs.cancel()')
cs.cancel()
await trio.lowlevel.checkpoint()
yield
await trio.sleep_forever()
def test_lock_not_corrupted_on_fast_cancel(
debug_mode: bool,
loglevel: str,
):
'''
Verify that if the caching-task (the first to enter
`maybe_open_context()`) is cancelled mid-cache-miss, the embedded
mutex can never be left in a corrupted state.
That is, the lock is always eventually released ensuring a peer
(cache-hitting) task will never,
- be left to inf-block/hang on the `lock.acquire()`.
- try to release the lock when still owned by the caching-task
due to it having erroneously exited without calling
`lock.release()`.
'''
delay: float = 1.
async def use_moc(
cs: trio.CancelScope|None,
delay: float,
):
log.info('task entering moc')
async with maybe_open_context(
cancel_outer_cs,
kwargs={
'cs': cs,
'delay': delay,
},
) as (cache_hit, _null):
if cache_hit:
log.info('2nd task entered')
else:
log.info('1st task entered')
await trio.sleep_forever()
async def main():
with trio.fail_after(delay + 2):
async with (
tractor.open_root_actor(
debug_mode=debug_mode,
loglevel=loglevel,
),
trio.open_nursery() as tn,
):
get_console_log('info')
log.info('yo starting')
cs = tn.cancel_scope
tn.start_soon(
use_moc,
cs,
delay,
name='child',
)
with trio.CancelScope() as rent_cs:
await use_moc(
cs=rent_cs,
delay=delay,
)
trio.run(main)
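As a quick orientation for readers of the suites above, here is a minimal, hypothetical sketch of the caller-side pattern they exercise: several tasks race into `maybe_open_context()`, the first entrant allocates the wrapped resource, and peers that overlap with it should see a cache hit. The `open_counter()` helper and counter dict are invented purely for illustration.

from contextlib import asynccontextmanager as acm

import trio
import tractor
from tractor.trionics import maybe_open_context


@acm
async def open_counter():
    # hypothetical resource; stands in for a portal/stream/etc.
    yield {'count': 0}


async def user_task(name: str):
    async with maybe_open_context(
        acm_func=open_counter,
    ) as (cache_hit, counter):
        counter['count'] += 1
        print(f'{name}: cache_hit={cache_hit} count={counter["count"]}')
        # hold the ctx open briefly so peer tasks overlap and cache-hit
        await trio.sleep(0.1)


async def main():
    async with (
        tractor.open_root_actor(),
        trio.open_nursery() as tn,
    ):
        for i in range(4):
            tn.start_soon(user_task, f'task_{i}')


if __name__ == '__main__':
    trio.run(main)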

View File

@ -0,0 +1,211 @@
import time
import trio
import pytest
import tractor
from tractor.ipc._ringbuf import (
open_ringbuf,
RBToken,
RingBuffSender,
RingBuffReceiver
)
from tractor._testing.samples import (
generate_sample_messages,
)
# in case you don't want to melt your cores, uncomment dis!
pytestmark = pytest.mark.skip
@tractor.context
async def child_read_shm(
ctx: tractor.Context,
msg_amount: int,
token: RBToken,
total_bytes: int,
) -> None:
recvd_bytes = 0
await ctx.started()
start_ts = time.time()
async with RingBuffReceiver(token) as receiver:
while recvd_bytes < total_bytes:
msg = await receiver.receive_some()
recvd_bytes += len(msg)
# make sure we don't hold any memoryviews
# before the ctx manager aclose()
msg = None
end_ts = time.time()
elapsed = end_ts - start_ts
elapsed_ms = int(elapsed * 1000)
print(f'\n\telapsed ms: {elapsed_ms}')
print(f'\tmsg/sec: {int(msg_amount / elapsed):,}')
print(f'\tbytes/sec: {int(recvd_bytes / elapsed):,}')
@tractor.context
async def child_write_shm(
ctx: tractor.Context,
msg_amount: int,
rand_min: int,
rand_max: int,
token: RBToken,
) -> None:
msgs, total_bytes = generate_sample_messages(
msg_amount,
rand_min=rand_min,
rand_max=rand_max,
)
await ctx.started(total_bytes)
async with RingBuffSender(token) as sender:
for msg in msgs:
await sender.send_all(msg)
@pytest.mark.parametrize(
'msg_amount,rand_min,rand_max,buf_size',
[
# simple case, fixed payloads, large buffer
(100_000, 0, 0, 10 * 1024),
# guaranteed wrap around on every write
(100, 10 * 1024, 20 * 1024, 10 * 1024),
# large payload size, but large buffer
(10_000, 256 * 1024, 512 * 1024, 10 * 1024 * 1024)
],
ids=[
'fixed_payloads_large_buffer',
'wrap_around_every_write',
'large_payloads_large_buffer',
]
)
def test_ringbuf(
msg_amount: int,
rand_min: int,
rand_max: int,
buf_size: int
):
async def main():
with open_ringbuf(
'test_ringbuf',
buf_size=buf_size
) as token:
proc_kwargs = {
'pass_fds': (token.write_eventfd, token.wrap_eventfd)
}
common_kwargs = {
'msg_amount': msg_amount,
'token': token,
}
async with tractor.open_nursery() as an:
send_p = await an.start_actor(
'ring_sender',
enable_modules=[__name__],
proc_kwargs=proc_kwargs
)
recv_p = await an.start_actor(
'ring_receiver',
enable_modules=[__name__],
proc_kwargs=proc_kwargs
)
async with (
send_p.open_context(
child_write_shm,
rand_min=rand_min,
rand_max=rand_max,
**common_kwargs
) as (sctx, total_bytes),
recv_p.open_context(
child_read_shm,
**common_kwargs,
total_bytes=total_bytes,
) as (sctx, _sent),
):
await recv_p.result()
await send_p.cancel_actor()
await recv_p.cancel_actor()
trio.run(main)
@tractor.context
async def child_blocked_receiver(
ctx: tractor.Context,
token: RBToken
):
async with RingBuffReceiver(token) as receiver:
await ctx.started()
await receiver.receive_some()
def test_ring_reader_cancel():
async def main():
with open_ringbuf('test_ring_cancel_reader') as token:
async with (
tractor.open_nursery() as an,
RingBuffSender(token) as _sender,
):
recv_p = await an.start_actor(
'ring_blocked_receiver',
enable_modules=[__name__],
proc_kwargs={
'pass_fds': (token.write_eventfd, token.wrap_eventfd)
}
)
async with (
recv_p.open_context(
child_blocked_receiver,
token=token
) as (sctx, _sent),
):
await trio.sleep(1)
await an.cancel()
with pytest.raises(tractor._exceptions.ContextCancelled):
trio.run(main)
@tractor.context
async def child_blocked_sender(
ctx: tractor.Context,
token: RBToken
):
async with RingBuffSender(token) as sender:
await ctx.started()
await sender.send_all(b'this will wrap')
def test_ring_sender_cancel():
async def main():
with open_ringbuf(
'test_ring_cancel_sender',
buf_size=1
) as token:
async with tractor.open_nursery() as an:
recv_p = await an.start_actor(
'ring_blocked_sender',
enable_modules=[__name__],
proc_kwargs={
'pass_fds': (token.write_eventfd, token.wrap_eventfd)
}
)
async with (
recv_p.open_context(
child_blocked_sender,
token=token
) as (sctx, _sent),
):
await trio.sleep(1)
await an.cancel()
with pytest.raises(tractor._exceptions.ContextCancelled):
trio.run(main)

View File

@ -147,8 +147,7 @@ def test_trio_prestarted_task_bubbles(
await trio.sleep_forever()
async def _trio_main():
# with trio.fail_after(2):
with trio.fail_after(999):
with trio.fail_after(2 if not debug_mode else 999):
first: str
chan: to_asyncio.LinkedTaskChannel
aio_ev = asyncio.Event()
@ -217,32 +216,25 @@ def test_trio_prestarted_task_bubbles(
):
aio_ev.set()
with pytest.raises(
expected_exception=ExceptionGroup,
) as excinfo:
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
eg = excinfo.value
rte_eg, rest_eg = eg.split(RuntimeError)
# ensure the trio-task's error bubbled despite the aio-side
# having (maybe) errored first.
if aio_err_trigger in (
'after_trio_task_starts',
'after_start_point',
):
assert len(errs := rest_eg.exceptions) == 1
typerr = errs[0]
assert (
type(typerr) is TypeError
and
'trio-side' in typerr.args
)
patt: str = 'trio-side'
expect_exc = TypeError
# when aio errors BEFORE (last) trio task is scheduled, we should
# never see anything but the aio-side.
else:
assert len(rtes := rte_eg.exceptions) == 1
assert 'asyncio-side' in rtes[0].args[0]
patt: str = 'asyncio-side'
expect_exc = RuntimeError
with pytest.raises(expect_exc) as excinfo:
tractor.to_asyncio.run_as_asyncio_guest(
trio_main=_trio_main,
)
caught_exc = excinfo.value
assert patt in caught_exc.args

View File

@ -0,0 +1,108 @@
'''
Runtime boot/init sanity.
'''
import pytest
import trio
import tractor
from tractor._exceptions import RuntimeFailure
@tractor.context
async def open_new_root_in_sub(
ctx: tractor.Context,
) -> None:
async with tractor.open_root_actor():
pass
@pytest.mark.parametrize(
'open_root_in',
['root', 'sub'],
ids='open_2nd_root_in={}'.format,
)
def test_only_one_root_actor(
open_root_in: str,
reg_addr: tuple,
debug_mode: bool
):
'''
Verify we specially fail whenever more than one root actor
is attempted to be opened within an already opened tree.
'''
async def main():
async with tractor.open_nursery() as an:
if open_root_in == 'root':
async with tractor.open_root_actor(
registry_addrs=[reg_addr],
):
pass
ptl: tractor.Portal = await an.start_actor(
name='bad_rooty_boi',
enable_modules=[__name__],
)
async with ptl.open_context(
open_new_root_in_sub,
) as (ctx, first):
pass
if open_root_in == 'root':
with pytest.raises(
RuntimeFailure
) as excinfo:
trio.run(main)
else:
with pytest.raises(
tractor.RemoteActorError,
) as excinfo:
trio.run(main)
assert excinfo.value.boxed_type is RuntimeFailure
def test_implicit_root_via_first_nursery(
reg_addr: tuple,
debug_mode: bool
):
'''
The first `ActorNursery` open should implicitly call
`_root.open_root_actor()`.
'''
async def main():
async with tractor.open_nursery() as an:
assert an._implicit_runtime_started
assert tractor.current_actor().aid.name == 'root'
trio.run(main)
def test_runtime_vars_unset(
reg_addr: tuple,
debug_mode: bool
):
'''
Ensure any `._state._runtime_vars` are restored to default values
after the root actor-runtime exits!
'''
assert not tractor._state._runtime_vars['_debug_mode']
async def main():
assert not tractor._state._runtime_vars['_debug_mode']
async with tractor.open_nursery(
debug_mode=True,
):
assert tractor._state._runtime_vars['_debug_mode']
# after runtime closure, should be reverted!
assert not tractor._state._runtime_vars['_debug_mode']
trio.run(main)

View File

@ -8,7 +8,7 @@ import uuid
import pytest
import trio
import tractor
from tractor._shm import (
from tractor.ipc._shm import (
open_shm_list,
attach_shm_list,
)

View File

@ -2,6 +2,7 @@
Spawning basics
"""
from functools import partial
from typing import (
Any,
)
@ -12,74 +13,99 @@ import tractor
from tractor._testing import tractor_test
data_to_pass_down = {'doggy': 10, 'kitty': 4}
data_to_pass_down = {
'doggy': 10,
'kitty': 4,
}
async def spawn(
is_arbiter: bool,
should_be_root: bool,
data: dict,
reg_addr: tuple[str, int],
debug_mode: bool = False,
):
namespaces = [__name__]
await trio.sleep(0.1)
actor = tractor.current_actor(err_on_no_runtime=False)
async with tractor.open_root_actor(
arbiter_addr=reg_addr,
):
actor = tractor.current_actor()
assert actor.is_arbiter == is_arbiter
data = data_to_pass_down
if should_be_root:
assert actor is None # no runtime yet
async with (
tractor.open_root_actor(
arbiter_addr=reg_addr,
),
tractor.open_nursery() as an,
):
# now runtime exists
actor: tractor.Actor = tractor.current_actor()
assert actor.is_arbiter == should_be_root
if actor.is_arbiter:
async with tractor.open_nursery() as nursery:
# spawns subproc here
portal: tractor.Portal = await an.run_in_actor(
fn=spawn,
# forks here
portal = await nursery.run_in_actor(
spawn,
is_arbiter=False,
name='sub-actor',
data=data,
reg_addr=reg_addr,
enable_modules=namespaces,
)
# spawning args
name='sub-actor',
enable_modules=[__name__],
assert len(nursery._children) == 1
assert portal.channel.uid in tractor.current_actor()._peers
# be sure we can still get the result
result = await portal.result()
assert result == 10
return result
else:
return 10
# passed to a subactor-recursive RPC invoke
# of this same `spawn()` fn.
should_be_root=False,
data=data_to_pass_down,
reg_addr=reg_addr,
)
assert len(an._children) == 1
assert (
portal.channel.uid
in
tractor.current_actor().ipc_server._peers
)
# get result from child subactor
result = await portal.result()
assert result == 10
return result
else:
assert actor.is_arbiter == should_be_root
return 10
def test_local_arbiter_subactor_global_state(
reg_addr,
def test_run_in_actor_same_func_in_child(
reg_addr: tuple,
debug_mode: bool,
):
result = trio.run(
spawn,
True,
data_to_pass_down,
reg_addr,
partial(
spawn,
should_be_root=True,
data=data_to_pass_down,
reg_addr=reg_addr,
debug_mode=debug_mode,
)
)
assert result == 10
async def movie_theatre_question():
"""A question asked in a dark theatre, in a tangent
'''
A question asked in a dark theatre, in a tangent
(errr, I mean different) process.
"""
'''
return 'have you ever seen a portal?'
@tractor_test
async def test_movie_theatre_convo(start_method):
"""The main ``tractor`` routine.
"""
async with tractor.open_nursery() as n:
'''
The main ``tractor`` routine.
portal = await n.start_actor(
'''
async with tractor.open_nursery(debug_mode=True) as an:
portal = await an.start_actor(
'frank',
# enable the actor to run funcs from this current module
enable_modules=[__name__],
@ -118,8 +144,8 @@ async def test_most_beautiful_word(
with trio.fail_after(1):
async with tractor.open_nursery(
debug_mode=debug_mode,
) as n:
portal = await n.run_in_actor(
) as an:
portal = await an.run_in_actor(
cellar_door,
return_value=return_value,
name='some_linguist',

View File

@ -8,6 +8,7 @@ from contextlib import (
)
import pytest
from tractor.trionics import collapse_eg
import trio
from trio import TaskStatus
@ -64,9 +65,8 @@ def test_stashed_child_nursery(use_start_soon):
async def main():
async with (
trio.open_nursery(
strict_exception_groups=False,
) as pn,
collapse_eg(),
trio.open_nursery() as pn,
):
cn = await pn.start(mk_child_nursery)
assert cn
@ -112,55 +112,11 @@ def test_acm_embedded_nursery_propagates_enter_err(
'''
import tractor
@acm
async def maybe_raise_from_masking_exc(
tn: trio.Nursery,
unmask_from: BaseException|None = trio.Cancelled
# TODO, maybe offer a collection?
# unmask_from: set[BaseException] = {
# trio.Cancelled,
# },
):
if not unmask_from:
yield
return
try:
yield
except* unmask_from as be_eg:
# TODO, if we offer `unmask_from: set`
# for masker_exc_type in unmask_from:
matches, rest = be_eg.split(unmask_from)
if not matches:
raise
for exc_match in be_eg.exceptions:
if (
(exc_ctx := exc_match.__context__)
and
type(exc_ctx) not in {
# trio.Cancelled, # always by default?
unmask_from,
}
):
exc_ctx.add_note(
f'\n'
f'WARNING: the above error was masked by a {unmask_from!r} !?!\n'
f'Are you always cancelling? Say from a `finally:` ?\n\n'
f'{tn!r}'
)
raise exc_ctx from exc_match
@acm
async def wraps_tn_that_always_cancels():
async with (
trio.open_nursery() as tn,
maybe_raise_from_masking_exc(
tractor.trionics.maybe_raise_from_masking_exc(
tn=tn,
unmask_from=(
trio.Cancelled
@ -180,7 +136,8 @@ def test_acm_embedded_nursery_propagates_enter_err(
with tractor.devx.maybe_open_crash_handler(
pdb=debug_mode,
) as bxerr:
assert not bxerr.value
if bxerr:
assert not bxerr.value
async with (
wraps_tn_that_always_cancels() as tn,
@ -201,3 +158,58 @@ def test_acm_embedded_nursery_propagates_enter_err(
assert_eg, rest_eg = eg.split(AssertionError)
assert len(assert_eg.exceptions) == 1
def test_gatherctxs_with_memchan_breaks_multicancelled(
debug_mode: bool,
):
'''
Demo how using an `async with sndchan` inside a `.trionics.gather_contexts()` task
will break a strict-eg-tn's multi-cancelled absorption..
'''
from tractor import (
trionics,
)
@acm
async def open_memchan() -> trio.abc.ReceiveChannel:
task: trio.Task = trio.lowlevel.current_task()
print(
f'Opening {task!r}\n'
)
# 1 to force eager sending
send, recv = trio.open_memory_channel(16)
try:
async with send:
yield recv
finally:
print(
f'Closed {task!r}\n'
)
async def main():
async with (
# XXX should ensure ONLY the KBI
# is relayed upward
collapse_eg(),
trio.open_nursery(), # as tn,
trionics.gather_contexts([
open_memchan(),
open_memchan(),
]) as recv_chans,
):
assert len(recv_chans) == 2
await trio.sleep(1)
raise KeyboardInterrupt
# tn.cancel_scope.cancel()
with pytest.raises(KeyboardInterrupt):
trio.run(main)

View File

@ -64,7 +64,7 @@ from ._root import (
run_daemon as run_daemon,
open_root_actor as open_root_actor,
)
from ._ipc import Channel as Channel
from .ipc import Channel as Channel
from ._portal import Portal as Portal
from ._runtime import Actor as Actor
# from . import hilevel as hilevel

282
tractor/_addr.py 100644
View File

@ -0,0 +1,282 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
from __future__ import annotations
from uuid import uuid4
from typing import (
Protocol,
ClassVar,
Type,
TYPE_CHECKING,
)
from bidict import bidict
from trio import (
SocketListener,
)
from .log import get_logger
from ._state import (
_def_tpt_proto,
)
from .ipc._tcp import TCPAddress
from .ipc._uds import UDSAddress
if TYPE_CHECKING:
from ._runtime import Actor
log = get_logger(__name__)
# TODO, maybe breakout the netns key to a struct?
# class NetNs(Struct)[str, int]:
# ...
# TODO, can't we just use a type alias
# for this? namely just some `tuple[str, int, str, str]`?
#
# -[ ] would also just be simpler to keep this as SockAddr[tuple]
# or something, implying it's just a simple pair of values which can
# presumably be mapped to all transports?
# -[ ] `pydoc socket.socket.getsockname()` delivers a 4-tuple for
# ipv6 `(hostaddr, port, flowinfo, scope_id)`.. so how should we
# handle that?
# -[ ] as a further alternative to this wrap()/unwrap() approach we
# could just implement `enc/dec_hook()`s for the `Address`-types
# and just deal with our internal objs directly and always and
# leave it to the codec layer to figure out marshalling?
# |_ would mean only one spot to do the `.unwrap()` (which we may
# end up needing to call from the hook()s anyway?)
# -[x] rename to `UnwrappedAddress[Descriptor]` ??
# seems like the right name as per,
# https://www.geeksforgeeks.org/introduction-to-address-descriptor/
#
UnwrappedAddress = (
# tcp/udp/uds
tuple[
str, # host/domain(tcp), filesys-dir(uds)
int|str, # port/path(uds)
]
# ?TODO? should we also include another 2 fields from
# our `Aid` msg such that we include the runtime `Actor.uid`
# of `.name` and `.uuid`?
# - would ensure uniqueness across entire net?
# - allows for easier runtime-level filtering of "actors by
# service name"
)
# TODO, maybe rename to `SocketAddress`?
class Address(Protocol):
proto_key: ClassVar[str]
unwrapped_type: ClassVar[UnwrappedAddress]
# TODO, i feel like an `.is_bound()` is a better thing to
# support?
# Like, what use does this have besides a noop and if it's not
# valid why aren't we erroring on creation/use?
@property
def is_valid(self) -> bool:
...
# TODO, maybe `.netns` is a better name?
@property
def namespace(self) -> tuple[str, int]|None:
'''
The if-available, OS-specific "network namespace" key.
'''
...
@property
def bindspace(self) -> str:
'''
Deliver the socket address' "bindable space" from
a `socket.socket.bind()` and thus from the perspective of
the specific transport protocol domain.
I.e. for most (layer-4) network-socket protocols this is
normally the ipv4/6 address, for UDS this is normally
a filesystem (sub-directory).
For (distributed) network protocols this is normally the routing
layer's domain/(ip-)address, though it might also include a "network namespace"
key different than the default.
For local-host-only transports this is either an explicit
namespace (with types defined by the OS: netns, Cgroup, IPC,
pid, etc. on linux) or failing that the sub-directory in the
filesys in which socket/shm files are located *under*.
'''
...
@classmethod
def from_addr(cls, addr: UnwrappedAddress) -> Address:
...
def unwrap(self) -> UnwrappedAddress:
'''
Deliver the underlying minimum field set in
a primitive python data type-structure.
'''
...
@classmethod
def get_random(
cls,
current_actor: Actor,
bindspace: str|None = None,
) -> Address:
...
# TODO, this should be something like a `.get_def_registrar_addr()`
# or similar since,
# - it should be a **host singleton** (not root/tree singleton)
# - we **only need this value** when one isn't provided to the
# runtime at boot and we want to implicitly provide a host-wide
# registrar.
# - each rooted-actor-tree should likely have its own
# micro-registry (likely the root being it), also see
@classmethod
def get_root(cls) -> Address:
...
def __repr__(self) -> str:
...
def __eq__(self, other) -> bool:
...
async def open_listener(
self,
**kwargs,
) -> SocketListener:
...
async def close_listener(self):
...
_address_types: bidict[str, Type[Address]] = {
'tcp': TCPAddress,
'uds': UDSAddress
}
# TODO! really these are discovery sys default addrs ONLY useful for
# when none is provided to a root actor on first boot.
_default_lo_addrs: dict[
str,
UnwrappedAddress
] = {
'tcp': TCPAddress.get_root().unwrap(),
'uds': UDSAddress.get_root().unwrap(),
}
def get_address_cls(name: str) -> Type[Address]:
return _address_types[name]
def is_wrapped_addr(addr: any) -> bool:
return type(addr) in _address_types.values()
def mk_uuid() -> str:
'''
Encapsulate creation of a uuid4 as `str` as used
for creating `Actor.uid: tuple[str, str]` and/or
`.msg.types.Aid`.
'''
return str(uuid4())
def wrap_address(
addr: UnwrappedAddress
) -> Address:
'''
Wrap an `UnwrappedAddress` as an `Address`-type based
on matching builtin python data-structures which we use
ad-hoc for each.
XXX NOTE, care must be taken to ensure
`UnwrappedAddress` cases are **definitely unique** otherwise the
wrong transport backend may be loaded and will break many
low-level things in our runtime in a not-fun-to-debug way!
XD
'''
if is_wrapped_addr(addr):
return addr
cls: Type|None = None
# if 'sock' in addr[0]:
# import pdbp; pdbp.set_trace()
match addr:
# classic network socket-address as tuple/list
case (
(str(), int())
|
[str(), int()]
):
cls = TCPAddress
case (
# (str()|Path(), str()|Path()),
# ^TODO? uhh why doesn't this work!?
(_, filename)
) if type(filename) is str:
cls = UDSAddress
# likely an unset UDS or TCP reg address as defaulted in
# `_state._runtime_vars['_root_mailbox']`
#
# TODO? figure out when/if we even need this?
case (
None
|
[None, None]
):
cls: Type[Address] = get_address_cls(_def_tpt_proto)
addr: UnwrappedAddress = cls.get_root().unwrap()
case _:
# import pdbp; pdbp.set_trace()
raise TypeError(
f'Can not wrap unwrapped-address ??\n'
f'type(addr): {type(addr)!r}\n'
f'addr: {addr!r}\n'
)
return cls.from_addr(addr)
def default_lo_addrs(
transports: list[str],
) -> list[Type[Address]]:
'''
Return the default, host-singleton, registry address
for an input transport key set.
'''
return [
_default_lo_addrs[transport]
for transport in transports
]
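For orientation, a hedged usage sketch of the `wrap_address()` dispatch defined above; whether `UDSAddress.from_addr()` accepts an arbitrary `(dir, filename)` pair without touching the filesystem is an assumption, as is the precise repr each type produces.

from tractor._addr import wrap_address

# classic (host, port) pair -> dispatched to TCPAddress
tcp = wrap_address(('127.0.0.1', 1616))
print(type(tcp).__name__)

# (dir, filename) pair with a str filename -> dispatched to UDSAddress
uds = wrap_address(('/tmp/actors', 'registry.sock'))
print(type(uds).__name__)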

View File

@ -31,8 +31,12 @@ def parse_uid(arg):
return str(name), str(uuid) # ensures str encoding
def parse_ipaddr(arg):
host, port = literal_eval(arg)
return (str(host), int(port))
try:
return literal_eval(arg)
except (ValueError, SyntaxError):
# UDS: try to interpret as a straight up str
return arg
if __name__ == "__main__":
@ -46,8 +50,8 @@ if __name__ == "__main__":
args = parser.parse_args()
subactor = Actor(
args.uid[0],
uid=args.uid[1],
name=args.uid[0],
uuid=args.uid[1],
loglevel=args.loglevel,
spawn_method="trio"
)
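For clarity, a standalone restatement of the new `parse_ipaddr()` fallback shown above: a TCP address arrives as a tuple-literal string while a UDS filesystem path fails `literal_eval()` and is passed back as a plain `str`.

from ast import literal_eval


def parse_ipaddr(arg: str):
    try:
        return literal_eval(arg)
    except (ValueError, SyntaxError):
        # UDS: try to interpret as a straight up str (fs path)
        return arg


assert parse_ipaddr("('127.0.0.1', 1616)") == ('127.0.0.1', 1616)
assert parse_ipaddr('/tmp/my_uds.sock') == '/tmp/my_uds.sock'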

View File

@ -55,10 +55,17 @@ async def open_actor_cluster(
raise ValueError(
f'Number of names is {len(names)} but count is {count}')
async with tractor.open_nursery(
**runtime_kwargs,
) as an:
async with trio.open_nursery() as n:
async with (
# tractor.trionics.collapse_eg(),
tractor.open_nursery(
**runtime_kwargs,
) as an
):
async with (
# tractor.trionics.collapse_eg(),
trio.open_nursery() as tn,
tractor.trionics.maybe_raise_from_masking_exc()
):
uid = tractor.current_actor().uid
async def _start(name: str) -> None:
@ -69,9 +76,8 @@ async def open_actor_cluster(
)
for name in names:
n.start_soon(_start, name)
tn.start_soon(_start, name)
assert len(portals) == count
yield portals
await an.cancel(hard_kill=hard_kill)

View File

@ -89,7 +89,7 @@ from .msg import (
pretty_struct,
_ops as msgops,
)
from ._ipc import (
from .ipc import (
Channel,
)
from ._streaming import (
@ -101,11 +101,14 @@ from ._state import (
debug_mode,
_ctxvar_Context,
)
from .trionics import (
collapse_eg,
)
# ------ - ------
if TYPE_CHECKING:
from ._portal import Portal
from ._runtime import Actor
from ._ipc import MsgTransport
from .ipc._transport import MsgTransport
from .devx._frame_stack import (
CallerInfo,
)
@ -151,7 +154,7 @@ class Context:
2 cancel-scope-linked, communicating and parallel executing
`Task`s. Contexts are allocated on each side of any task
RPC-linked msg dialog, i.e. for every request to a remote
actor from a `Portal`. On the "callee" side a context is
actor from a `Portal`. On the "child" side a context is
always allocated inside `._rpc._invoke()`.
TODO: more detailed writeup on cancellation, error and
@ -219,8 +222,8 @@ class Context:
# `._runtime.invoke()`.
_remote_func_type: str | None = None
# NOTE: (for now) only set (a portal) on the caller side since
# the callee doesn't generally need a ref to one and should
# NOTE: (for now) only set (a portal) on the parent side since
# the child doesn't generally need a ref to one and should
# normally need to explicitly ask for handle to its peer if
# more than the `Context` is needed?
_portal: Portal | None = None
@ -249,12 +252,12 @@ class Context:
_outcome_msg: Return|Error|ContextCancelled = Unresolved
# on a clean exit there should be a final value
# delivered from the far end "callee" task, so
# delivered from the far end "child" task, so
# this value is only set on one side.
# _result: Any | int = None
_result: PayloadT|Unresolved = Unresolved
# if the local "caller" task errors this value is always set
# if the local "parent" task errors this value is always set
# to the error that was captured in the
# `Portal.open_context().__aexit__()` teardown block OR, in
# 2 special cases when an (maybe) expected remote error
@ -290,9 +293,9 @@ class Context:
# a `ContextCancelled` due to a call to `.cancel()` triggering
# "graceful closure" on either side:
# - `._runtime._invoke()` will check this flag before engaging
# the crash handler REPL in such cases where the "callee"
# the crash handler REPL in such cases where the "child"
# raises the cancellation,
# - `.devx._debug.lock_stdio_for_peer()` will set it to `False` if
# - `.devx.debug.lock_stdio_for_peer()` will set it to `False` if
# the global tty-lock has been configured to filter out some
# actors from being able to acquire the debugger lock.
_enter_debugger_on_cancel: bool = True
@ -304,8 +307,8 @@ class Context:
_stream_opened: bool = False
_stream: MsgStream|None = None
# caller of `Portal.open_context()` for
# logging purposes mostly
# the parent-task's calling-fn's frame-info, the frame above
# `Portal.open_context()`, for introspection/logging.
_caller_info: CallerInfo|None = None
# overrun handling machinery
@ -366,7 +369,7 @@ class Context:
# f' ---\n'
f' |_ipc: {self.dst_maddr}\n'
# f' dst_maddr{ds}{self.dst_maddr}\n'
f" uid{ds}'{self.chan.uid}'\n"
f" uid{ds}'{self.chan.aid}'\n"
f" cid{ds}'{self.cid}'\n"
# f' ---\n'
f'\n'
@ -526,11 +529,11 @@ class Context:
'''
Exactly the value of `self._scope.cancelled_caught`
(delegation) and should only be (able to be read as)
`True` for a `.side == "caller"` ctx wherein the
`True` for a `.side == "parent"` ctx wherein the
`Portal.open_context()` block was exited due to a call to
`._scope.cancel()` - which should only occur in 2 cases:
- a caller side calls `.cancel()`, the far side cancels
- a parent side calls `.cancel()`, the far side cancels
and delivers back a `ContextCancelled` (making
`.cancel_acked == True`) and `._scope.cancel()` is
called by `._maybe_cancel_and_set_remote_error()` which
@ -539,20 +542,20 @@ class Context:
=> `._scope.cancelled_caught == True` by normal `trio`
cs semantics.
- a caller side is delivered a `._remote_error:
- a parent side is delivered a `._remote_error:
RemoteActorError` via `._deliver_msg()` and a transitive
call to `_maybe_cancel_and_set_remote_error()` calls
`._scope.cancel()` and that cancellation eventually
results in `trio.Cancelled`(s) caught in the
`.open_context()` handling around the @acm's `yield`.
Only as an FYI, in the "callee" side case it can also be
Only as an FYI, in the "child" side case it can also be
set but never is readable by any task outside the RPC
machinery in `._invoke()` since,:
- when a callee side calls `.cancel()`, `._scope.cancel()`
- when a child side calls `.cancel()`, `._scope.cancel()`
is called immediately and handled specially inside
`._invoke()` to raise a `ContextCancelled` which is then
sent to the caller side.
sent to the parent side.
However, `._scope.cancelled_caught` can NEVER be
accessed/read as `True` by any RPC invoked task since it
@ -663,7 +666,7 @@ class Context:
when called/closed by actor local task(s).
NOTEs:
- It is expected that the caller has previously unwrapped
- It is expected that the parent has previously unwrapped
the remote error using a call to `unpack_error()` and
provides that output exception value as the input
`error` argument *here*.
@ -673,7 +676,7 @@ class Context:
`Portal.open_context()` (ideally) we want to interrupt
any ongoing local tasks operating within that
`Context`'s cancel-scope so as to be notified ASAP of
the remote error and engage any caller handling (eg.
the remote error and engage any parent handling (eg.
for cross-process task supervision).
- In some cases we may want to raise the remote error
@ -740,6 +743,8 @@ class Context:
# cancelled, NOT their reported canceller. IOW in the
# latter case we're cancelled by someone else getting
# cancelled.
#
# !TODO, switching to `Actor.aid` here!
if (canc := error.canceller) == self._actor.uid:
whom: str = 'us'
self._canceller = canc
@ -859,19 +864,10 @@ class Context:
@property
def dst_maddr(self) -> str:
chan: Channel = self.chan
dst_addr, dst_port = chan.raddr
trans: MsgTransport = chan.transport
# cid: str = self.cid
# cid_head, cid_tail = cid[:6], cid[-6:]
return (
f'/ipv4/{dst_addr}'
f'/{trans.name_key}/{dst_port}'
# f'/{self.chan.uid[0]}'
# f'/{self.cid}'
# f'/cid={cid_head}..{cid_tail}'
# TODO: ? not use this ^ right ?
)
return trans.maddr
dmaddr = dst_maddr
@ -890,6 +886,11 @@ class Context:
@property
def repr_caller(self) -> str:
'''
Render a "namespace-path" style representation of the calling
task-fn.
'''
ci: CallerInfo|None = self._caller_info
if ci:
return (
@ -903,7 +904,7 @@ class Context:
def repr_api(self) -> str:
return 'Portal.open_context()'
# TODO: use `.dev._frame_stack` scanning to find caller!
# TODO: use `.dev._frame_stack` scanning to find caller fn!
# ci: CallerInfo|None = self._caller_info
# if ci:
# return (
@ -938,7 +939,7 @@ class Context:
=> That is, an IPC `Context` (this) **does not**
have the same semantics as a `trio.CancelScope`.
If the caller (who entered the `Portal.open_context()`)
If the parent (who entered the `Portal.open_context()`)
desires that the internal block's cancel-scope be
cancelled it should open its own `trio.CancelScope` and
manage it as needed.
@ -949,15 +950,15 @@ class Context:
self.cancel_called = True
header: str = (
f'Cancelling ctx from {side.upper()}-side\n'
f'Cancelling ctx from {side!r}-side\n'
)
reminfo: str = (
# ' =>\n'
# f'Context.cancel() => {self.chan.uid}\n'
f'\n'
f'c)=> {self.chan.uid}\n'
# f'{self.chan.uid}\n'
f' |_ @{self.dst_maddr}\n'
f' >> {self.repr_rpc}\n'
f' |_[{self.dst_maddr}\n'
f' >> {self.repr_rpc}\n'
# f' >> {self._nsf}() -> {codec}[dict]:\n\n'
# TODO: pull msg-type from spec re #320
)
@ -1010,7 +1011,6 @@ class Context:
else:
log.cancel(
f'Timed out on cancel request of remote task?\n'
f'\n'
f'{reminfo}'
)
@ -1021,7 +1021,7 @@ class Context:
# `_invoke()` RPC task.
#
# NOTE: on this side we ALWAYS cancel the local scope
# since the caller expects a `ContextCancelled` to be sent
# since the parent expects a `ContextCancelled` to be sent
# from `._runtime._invoke()` back to the other side. The
# logic for catching the result of the below
# `._scope.cancel()` is inside the `._runtime._invoke()`
@ -1078,9 +1078,25 @@ class Context:
|RemoteActorError # stream overrun caused and ignored by us
):
'''
Maybe raise a remote error depending on the type of error
and *who* (i.e. which task from which actor) requested
a cancellation (if any).
Maybe raise a remote error depending on the type of error and
*who*, i.e. which side of the task pair across actors,
requested a cancellation (if any).
Depending on the input config-params suppress raising
certain remote excs:
- if `remote_error: ContextCancelled` (ctxc) AND this side's
task is the "requester", it at somem point called
`Context.cancel()`, then the peer's ctxc is treated
as a "cancel ack".
|_ this behaves exactly like how `trio.Nursery.cancel_scope`
absorbs any `BaseExceptionGroup[trio.Cancelled]` wherein the
owning parent task will never raise a `trio.Cancelled`
if `CancelScope.cancel_called == True`.
- `remote_error: StreamOverrrun` (overrun) AND
`raise_overrun_from_self` is set.
'''
__tracebackhide__: bool = hide_tb
@ -1122,18 +1138,19 @@ class Context:
# for this ^, NO right?
) or (
# NOTE: whenever this context is the cause of an
# overrun on the remote side (aka we sent msgs too
# fast that the remote task was overrun according
# to `MsgStream` buffer settings) AND the caller
# has requested to not raise overruns this side
# caused, we also silently absorb any remotely
# boxed `StreamOverrun`. This is mostly useful for
# supressing such faults during
# cancellation/error/final-result handling inside
# `msg._ops.drain_to_final_msg()` such that we do not
# raise such errors particularly in the case where
# NOTE: whenever this side is the cause of an
# overrun on the peer side, i.e. we sent msgs too
# fast and the peer task was overrun according
# to `MsgStream` buffer settings, AND this was
# called with `raise_overrun_from_self=True` (the
# default), silently absorb any `StreamOverrun`.
#
# XXX, this is namely useful for suppressing such faults
# during cancellation/error/final-result handling inside
# `.msg._ops.drain_to_final_msg()` such that we do not
# raise during a cancellation-request, i.e. when
# `._cancel_called == True`.
#
not raise_overrun_from_self
and isinstance(remote_error, RemoteActorError)
and remote_error.boxed_type is StreamOverrun
@ -1177,8 +1194,8 @@ class Context:
) -> Any|Exception:
'''
From some (caller) side task, wait for and return the final
result from the remote (callee) side's task.
From some (parent) side task, wait for and return the final
result from the remote (child) side's task.
This provides a mechanism for one task running in some actor to wait
on another task at the other side, in some other actor, to terminate.
@ -1243,8 +1260,8 @@ class Context:
# ?XXX, should already be set in `._deliver_msg()` right?
if self._outcome_msg is not Unresolved:
# from .devx import _debug
# await _debug.pause()
# from .devx import debug
# await debug.pause()
assert self._outcome_msg is outcome_msg
else:
self._outcome_msg = outcome_msg
@ -1474,6 +1491,12 @@ class Context:
):
status = 'peer-cancelled'
case (
Unresolved,
trio.Cancelled(), # any error-type
) if self.canceller:
status = 'actor-cancelled'
# (remote) error condition
case (
Unresolved,
@ -1587,7 +1610,7 @@ class Context:
raise err
# TODO: maybe a flag to by-pass encode op if already done
# here in caller?
# here in parent?
await self.chan.send(started_msg)
# set msg-related internal runtime-state
@ -1663,7 +1686,7 @@ class Context:
XXX RULES XXX
------ - ------
- NEVER raise remote errors from this method; a runtime task caller.
- NEVER raise remote errors from this method; a calling runtime-task.
An error "delivered" to a ctx should always be raised by
the corresponding local task operating on the
`Portal`/`Context` APIs.
@ -1739,7 +1762,7 @@ class Context:
else:
report = (
'Queueing OVERRUN msg on caller task:\n\n'
'Queueing OVERRUN msg on parent task:\n\n'
+ report
)
log.debug(report)
@ -1935,12 +1958,12 @@ async def open_context_from_portal(
IPC protocol.
The yielded `tuple` is a pair delivering a `tractor.Context`
and any first value "sent" by the "callee" task via a call
and any first value "sent" by the "child" task via a call
to `Context.started(<value: Any>)`; this side of the
context does not unblock until the "callee" task calls
context does not unblock until the "child" task calls
`.started()` in similar style to `trio.Nursery.start()`.
When the "callee" (side that is "called"/started by a call
to *this* method) returns, the caller side (this) unblocks
When the "child" (side that is "called"/started by a call
to *this* method) returns, the parent side (this) unblocks
and any final value delivered from the other end can be
retrieved using the `Contex.wait_for_result()` api.
@ -1953,7 +1976,7 @@ async def open_context_from_portal(
__tracebackhide__: bool = hide_tb
# denote this frame as a "runtime frame" for stack
# introspection where we report the caller code in logging
# introspection where we report the parent code in logging
# and error message content.
# NOTE: 2 bc of the wrapping `@acm`
__runtimeframe__: int = 2 # noqa
@ -2012,13 +2035,11 @@ async def open_context_from_portal(
# placeholder for any exception raised in the runtime
# or by user tasks which cause this context's closure.
scope_err: BaseException|None = None
ctxc_from_callee: ContextCancelled|None = None
ctxc_from_child: ContextCancelled|None = None
try:
async with (
trio.open_nursery(
strict_exception_groups=False,
) as tn,
collapse_eg(),
trio.open_nursery() as tn,
msgops.maybe_limit_plds(
ctx=ctx,
spec=ctx_meta.get('pld_spec'),
@ -2093,7 +2114,7 @@ async def open_context_from_portal(
# that we can re-use it around the `yield` ^ here
# or vice versa?
#
# maybe TODO NOTE: between the caller exiting and
# maybe TODO NOTE: between the parent exiting and
# arriving here the far end may have sent a ctxc-msg or
# other error, so the question is whether we should check
# for it here immediately and maybe raise so as to engage
@ -2159,16 +2180,16 @@ async def open_context_from_portal(
# request in which case we DO let the error bubble to the
# opener.
#
# 2-THIS "caller" task somewhere invoked `Context.cancel()`
# and received a `ContextCanclled` from the "callee"
# 2-THIS "parent" task somewhere invoked `Context.cancel()`
# and received a `ContextCanclled` from the "child"
# task, in which case we mask the `ContextCancelled` from
# bubbling to this "caller" (much like how `trio.Nursery`
# bubbling to this "parent" (much like how `trio.Nursery`
# swallows any `trio.Cancelled` bubbled by a call to
# `Nursery.cancel_scope.cancel()`)
except ContextCancelled as ctxc:
scope_err = ctxc
ctx._local_error: BaseException = scope_err
ctxc_from_callee = ctxc
ctxc_from_child = ctxc
# XXX TODO XXX: FIX THIS debug_mode BUGGGG!!!
# using this code and then resuming the REPL will
@ -2179,7 +2200,7 @@ async def open_context_from_portal(
# debugging the tractor-runtime itself using it's
# own `.devx.` tooling!
#
# await _debug.pause()
# await debug.pause()
# CASE 2: context was cancelled by local task calling
# `.cancel()`, we don't raise and the exit block should
@ -2205,11 +2226,11 @@ async def open_context_from_portal(
# the above `._scope` can be cancelled due to:
# 1. an explicit self cancel via `Context.cancel()` or
# `Actor.cancel()`,
# 2. any "callee"-side remote error, possibly also a cancellation
# 2. any "child"-side remote error, possibly also a cancellation
# request by some peer,
# 3. any "caller" (aka THIS scope's) local error raised in the above `yield`
# 3. any "parent" (aka THIS scope's) local error raised in the above `yield`
except (
# CASE 3: standard local error in this caller/yieldee
# CASE 3: standard local error in this parent/yieldee
Exception,
# CASES 1 & 2: can manifest as a `ctx._scope_nursery`
@ -2223,9 +2244,9 @@ async def open_context_from_portal(
# any `Context._maybe_raise_remote_err()` call.
#
# 2.-`BaseExceptionGroup[ContextCancelled | RemoteActorError]`
# from any error delivered from the "callee" side
# from any error delivered from the "child" side
# AND a group-exc is only raised if there was > 1
# tasks started *here* in the "caller" / opener
# tasks started *here* in the "parent" / opener
# block. If any one of those tasks calls
# `.wait_for_result()` or `MsgStream.receive()`
# `._maybe_raise_remote_err()` will be transitively
@ -2238,18 +2259,18 @@ async def open_context_from_portal(
trio.Cancelled, # NOTE: NOT from inside the ctx._scope
KeyboardInterrupt,
) as caller_err:
scope_err = caller_err
) as rent_err:
scope_err = rent_err
ctx._local_error: BaseException = scope_err
# XXX: ALWAYS request the context to CANCEL ON any ERROR.
# NOTE: `Context.cancel()` is conversely NEVER CALLED in
# the `ContextCancelled` "self cancellation absorbed" case
# handled in the block above ^^^ !!
# await _debug.pause()
# await debug.pause()
# log.cancel(
match scope_err:
case trio.Cancelled:
case trio.Cancelled():
logmeth = log.cancel
# XXX explicitly report on any non-graceful-taskc cases
@ -2257,15 +2278,15 @@ async def open_context_from_portal(
logmeth = log.exception
logmeth(
f'ctx {ctx.side!r}-side exited with {ctx.repr_outcome()}\n'
f'ctx {ctx.side!r}-side exited with {ctx.repr_outcome()!r}\n'
)
if debug_mode():
# async with _debug.acquire_debug_lock(portal.actor.uid):
# async with debug.acquire_debug_lock(portal.actor.uid):
# pass
# TODO: factor ^ into below for non-root cases?
#
from .devx._debug import maybe_wait_for_debugger
from .devx.debug import maybe_wait_for_debugger
was_acquired: bool = await maybe_wait_for_debugger(
# header_msg=(
# 'Delaying `ctx.cancel()` until debug lock '
@ -2278,9 +2299,9 @@ async def open_context_from_portal(
'Calling `ctx.cancel()`!\n'
)
# we don't need to cancel the callee if it already
# we don't need to cancel the child if it already
# told us it's cancelled ;p
if ctxc_from_callee is None:
if ctxc_from_child is None:
try:
await ctx.cancel()
except (
@ -2311,8 +2332,8 @@ async def open_context_from_portal(
# via a call to
# `Context._maybe_cancel_and_set_remote_error()`.
# As per `Context._deliver_msg()`, that error IS
# ALWAYS SET any time "callee" side fails and causes "caller
# side" cancellation via a `ContextCancelled` here.
# ALWAYS SET any time "child" side fails and causes
# "parent side" cancellation via a `ContextCancelled` here.
try:
result_or_err: Exception|Any = await ctx.wait_for_result()
except BaseException as berr:
@ -2328,8 +2349,8 @@ async def open_context_from_portal(
raise
# yes this worx!
# from .devx import _debug
# await _debug.pause()
# from .devx import debug
# await debug.pause()
# an exception type boxed in a `RemoteActorError`
# is returned (meaning it was obvi not raised)
@ -2348,7 +2369,7 @@ async def open_context_from_portal(
)
case (None, _):
log.runtime(
'Context returned final result from callee task:\n'
'Context returned final result from child task:\n'
f'<= peer: {uid}\n'
f' |_ {nsf}()\n\n'
@ -2364,7 +2385,7 @@ async def open_context_from_portal(
# where the root is waiting on the lock to clear but the
# child has already cleared it and clobbered IPC.
if debug_mode():
from .devx._debug import maybe_wait_for_debugger
from .devx.debug import maybe_wait_for_debugger
await maybe_wait_for_debugger()
# though it should be impossible for any tasks
@ -2443,7 +2464,7 @@ async def open_context_from_portal(
)
# TODO: should we add a `._cancel_req_received`
# flag to determine if the callee manually called
# flag to determine if the child manually called
# `ctx.cancel()`?
# -[ ] going to need a cid check no?
@ -2499,7 +2520,7 @@ def mk_context(
recv_chan: trio.MemoryReceiveChannel
send_chan, recv_chan = trio.open_memory_channel(msg_buffer_size)
# TODO: only scan caller-info if log level so high!
# TODO: only scan parent-info if log level so high!
from .devx._frame_stack import find_caller_info
caller_info: CallerInfo|None = find_caller_info()

View File

@ -28,8 +28,16 @@ from typing import (
from contextlib import asynccontextmanager as acm
from tractor.log import get_logger
from .trionics import gather_contexts
from ._ipc import _connect_chan, Channel
from .trionics import (
gather_contexts,
collapse_eg,
)
from .ipc import _connect_chan, Channel
from ._addr import (
UnwrappedAddress,
Address,
wrap_address
)
from ._portal import (
Portal,
open_portal,
@ -38,6 +46,7 @@ from ._portal import (
from ._state import (
current_actor,
_runtime_vars,
_def_tpt_proto,
)
if TYPE_CHECKING:
@ -49,9 +58,7 @@ log = get_logger(__name__)
@acm
async def get_registry(
host: str,
port: int,
addr: UnwrappedAddress|None = None,
) -> AsyncGenerator[
Portal | LocalPortal | None,
None,
@ -69,19 +76,20 @@ async def get_registry(
# (likely a re-entrant call from the arbiter actor)
yield LocalPortal(
actor,
Channel((host, port))
Channel(transport=None)
# ^XXX, we DO NOT actually provide nor connect an
# underlying transport since this is merely an API shim.
)
else:
# TODO: try to look pre-existing connection from
# `Actor._peers` and use it instead?
# `Server._peers` and use it instead?
async with (
_connect_chan(host, port) as chan,
_connect_chan(addr) as chan,
open_portal(chan) as regstr_ptl,
):
yield regstr_ptl
@acm
async def get_root(
**kwargs,
@ -89,11 +97,10 @@ async def get_root(
# TODO: rename mailbox to `_root_maddr` when we finally
# add and impl libp2p multi-addrs?
host, port = _runtime_vars['_root_mailbox']
assert host is not None
addr = _runtime_vars['_root_mailbox']
async with (
_connect_chan(host, port) as chan,
_connect_chan(addr) as chan,
open_portal(chan, **kwargs) as portal,
):
yield portal
@ -106,17 +113,23 @@ def get_peer_by_name(
) -> list[Channel]|None: # at least 1
'''
Scan for an existing connection (set) to a named actor
and return any channels from `Actor._peers`.
and return any channels from `Server._peers: dict`.
This is an optimization method over querying the registrar for
the same info.
'''
actor: Actor = current_actor()
to_scan: dict[tuple, list[Channel]] = actor._peers.copy()
pchan: Channel|None = actor._parent_chan
if pchan:
to_scan[pchan.uid].append(pchan)
to_scan: dict[tuple, list[Channel]] = actor.ipc_server._peers.copy()
# TODO: is this ever needed? creates a duplicate channel on actor._peers
# when multiple find_actor calls are made to the same actor from a single ctx
# which causes actor exit to hang waiting forever on
# `actor._no_more_peers.wait()` in `_runtime.async_main`
# pchan: Channel|None = actor._parent_chan
# if pchan and pchan.uid not in to_scan:
# to_scan[pchan.uid].append(pchan)
for aid, chans in to_scan.items():
_, peer_name = aid
@ -134,10 +147,10 @@ def get_peer_by_name(
@acm
async def query_actor(
name: str,
regaddr: tuple[str, int]|None = None,
regaddr: UnwrappedAddress|None = None,
) -> AsyncGenerator[
tuple[str, int]|None,
UnwrappedAddress|None,
None,
]:
'''
@ -163,31 +176,31 @@ async def query_actor(
return
reg_portal: Portal
regaddr: tuple[str, int] = regaddr or actor.reg_addrs[0]
async with get_registry(*regaddr) as reg_portal:
regaddr: Address = wrap_address(regaddr) or actor.reg_addrs[0]
async with get_registry(regaddr) as reg_portal:
# TODO: return portals to all available actors - for now
# just the last one that registered
sockaddr: tuple[str, int] = await reg_portal.run_from_ns(
addr: UnwrappedAddress = await reg_portal.run_from_ns(
'self',
'find_actor',
name=name,
)
yield sockaddr
yield addr
@acm
async def maybe_open_portal(
addr: tuple[str, int],
addr: UnwrappedAddress,
name: str,
):
async with query_actor(
name=name,
regaddr=addr,
) as sockaddr:
) as addr:
pass
if sockaddr:
async with _connect_chan(*sockaddr) as chan:
if addr:
async with _connect_chan(addr) as chan:
async with open_portal(chan) as portal:
yield portal
else:
@ -197,7 +210,8 @@ async def maybe_open_portal(
@acm
async def find_actor(
name: str,
registry_addrs: list[tuple[str, int]]|None = None,
registry_addrs: list[UnwrappedAddress]|None = None,
enable_transports: list[str] = [_def_tpt_proto],
only_first: bool = True,
raise_on_none: bool = False,
@ -224,15 +238,15 @@ async def find_actor(
# XXX NOTE: make sure to dynamically read the value on
# every call since something may change it globally (eg.
# like in our discovery test suite)!
from . import _root
from ._addr import default_lo_addrs
registry_addrs = (
_runtime_vars['_registry_addrs']
or
_root._default_lo_addrs
default_lo_addrs(enable_transports)
)
maybe_portals: list[
AsyncContextManager[tuple[str, int]]
AsyncContextManager[UnwrappedAddress]
] = list(
maybe_open_portal(
addr=addr,
@ -241,9 +255,12 @@ async def find_actor(
for addr in registry_addrs
)
portals: list[Portal]
async with gather_contexts(
mngrs=maybe_portals,
) as portals:
async with (
collapse_eg(),
gather_contexts(
mngrs=maybe_portals,
) as portals,
):
# log.runtime(
# 'Gathered portals:\n'
# f'{portals}'
@ -274,7 +291,7 @@ async def find_actor(
@acm
async def wait_for_actor(
name: str,
registry_addr: tuple[str, int] | None = None,
registry_addr: UnwrappedAddress | None = None,
) -> AsyncGenerator[Portal, None]:
'''
@ -291,7 +308,7 @@ async def wait_for_actor(
yield peer_portal
return
regaddr: tuple[str, int] = (
regaddr: UnwrappedAddress = (
registry_addr
or
actor.reg_addrs[0]
@ -299,8 +316,8 @@ async def wait_for_actor(
# TODO: use `.trionics.gather_contexts()` like
# above in `find_actor()` as well?
reg_portal: Portal
async with get_registry(*regaddr) as reg_portal:
sockaddrs = await reg_portal.run_from_ns(
async with get_registry(regaddr) as reg_portal:
addrs = await reg_portal.run_from_ns(
'self',
'wait_for_actor',
name=name,
@ -308,8 +325,8 @@ async def wait_for_actor(
# get latest registered addr by default?
# TODO: offer multi-portal yields in multi-homed case?
sockaddr: tuple[str, int] = sockaddrs[-1]
addr: UnwrappedAddress = addrs[-1]
async with _connect_chan(*sockaddr) as chan:
async with _connect_chan(addr) as chan:
async with open_portal(chan) as portal:
yield portal
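A small, hedged sketch of the parent-side discovery flow touched above; it assumes a registrar is reachable at the default address and that `find_actor()` yields `None` when no matching actor is registered (per the `raise_on_none=False` default shown in the diff). The service name is invented for illustration.

import trio
import tractor


async def main():
    async with tractor.open_root_actor():
        async with tractor.find_actor('some_service') as portal:
            if portal is None:
                print('no such actor registered')
            else:
                print(f'found peer: {portal.channel.uid}')


trio.run(main)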

View File

@ -21,8 +21,7 @@ Sub-process entry points.
from __future__ import annotations
from functools import partial
import multiprocessing as mp
import os
import textwrap
# import os
from typing import (
Any,
TYPE_CHECKING,
@ -35,8 +34,13 @@ from .log import (
get_logger,
)
from . import _state
from .devx import _debug
from .devx import (
_frame_stack,
pformat,
)
# from .msg import pretty_struct
from .to_asyncio import run_as_asyncio_guest
from ._addr import UnwrappedAddress
from ._runtime import (
async_main,
Actor,
@ -52,10 +56,10 @@ log = get_logger(__name__)
def _mp_main(
actor: Actor,
accept_addrs: list[tuple[str, int]],
accept_addrs: list[UnwrappedAddress],
forkserver_info: tuple[Any, Any, Any, Any, Any],
start_method: SpawnMethodKey,
parent_addr: tuple[str, int] | None = None,
parent_addr: UnwrappedAddress | None = None,
infect_asyncio: bool = False,
) -> None:
@ -102,111 +106,10 @@ def _mp_main(
)
# TODO: move this func to some kinda `.devx._conc_lang.py` eventually
# as we work out our multi-domain state-flow-syntax!
def nest_from_op(
input_op: str,
#
# ?TODO? an idea for a syntax to the state of concurrent systems
# as a "3-domain" (execution, scope, storage) model and using
# a minimal ascii/utf-8 operator-set.
#
# try not to take any of this seriously yet XD
#
# > is a "play operator" indicating (CPU bound)
# exec/work/ops required at the "lowest level computing"
#
# execution primititves (tasks, threads, actors..) denote their
# lifetime with '(' and ')' since parentheses normally are used
# in many langs to denote function calls.
#
# starting = (
# >( opening/starting; beginning of the thread-of-exec (toe?)
# (> opened/started, (finished spawning toe)
# |_<Task: blah blah..> repr of toe, in py these look like <objs>
#
# >) closing/exiting/stopping,
# )> closed/exited/stopped,
# |_<Task: blah blah..>
# [OR <), )< ?? ]
#
# ending = )
# >c) cancelling to close/exit
# c)> cancelled (caused close), OR?
# |_<Actor: ..>
# OR maybe "<c)" which better indicates the cancel being
# "delivered/returned" / returned" to LHS?
#
# >x) erroring to eventuall exit
# x)> errored and terminated
# |_<Actor: ...>
#
# scopes: supers/nurseries, IPC-ctxs, sessions, perms, etc.
# >{ opening
# {> opened
# }> closed
# >} closing
#
# storage: like queues, shm-buffers, files, etc..
# >[ opening
# [> opened
# |_<FileObj: ..>
#
# >] closing
# ]> closed
# IPC ops: channels, transports, msging
# => req msg
# <= resp msg
# <=> 2-way streaming (of msgs)
# <- recv 1 msg
# -> send 1 msg
#
# TODO: still not sure on R/L-HS approach..?
# =>( send-req to exec start (task, actor, thread..)
# (<= recv-req to ^
#
# (<= recv-req ^
# <=( recv-resp opened remote exec primitive
# <=) recv-resp closed
#
# )<=c req to stop due to cancel
# c=>) req to stop due to cancel
#
# =>{ recv-req to open
# <={ send-status that it closed
tree_str: str,
# NOTE: so move back-from-the-left of the `input_op` by
# this amount.
back_from_op: int = 0,
) -> str:
'''
Depth-increment the input (presumably hierarchy/supervision)
input "tree string" below the provided `input_op` execution
operator, so injecting a `"\n|_{input_op}\n"`and indenting the
`tree_str` to nest content aligned with the ops last char.
'''
return (
f'{input_op}\n'
+
textwrap.indent(
tree_str,
prefix=(
len(input_op)
-
(back_from_op + 1)
) * ' ',
)
)
def _trio_main(
actor: Actor,
*,
parent_addr: tuple[str, int] | None = None,
parent_addr: UnwrappedAddress|None = None,
infect_asyncio: bool = False,
) -> None:
@ -214,7 +117,7 @@ def _trio_main(
Entry point for a `trio_run_in_process` subactor.
'''
_debug.hide_runtime_frames()
_frame_stack.hide_runtime_frames()
_state._current_actor = actor
trio_main = partial(
@ -225,30 +128,23 @@ def _trio_main(
if actor.loglevel is not None:
get_console_log(actor.loglevel)
actor_info: str = (
f'|_{actor}\n'
f' uid: {actor.uid}\n'
f' pid: {os.getpid()}\n'
f' parent_addr: {parent_addr}\n'
f' loglevel: {actor.loglevel}\n'
)
log.info(
'Starting new `trio` subactor:\n'
f'Starting `trio` subactor from parent @ '
f'{parent_addr}\n'
+
nest_from_op(
pformat.nest_from_op(
input_op='>(', # see syntax ideas above
tree_str=actor_info,
back_from_op=2, # since "complete"
text=f'{actor}',
)
)
logmeth = log.info
exit_status: str = (
'Subactor exited\n'
+
nest_from_op(
pformat.nest_from_op(
input_op=')>', # like a "closed-to-play"-icon from super perspective
tree_str=actor_info,
back_from_op=1,
text=f'{actor}',
nest_indent=1,
)
)
try:
@ -263,9 +159,9 @@ def _trio_main(
exit_status: str = (
'Actor received KBI (aka an OS-cancel)\n'
+
nest_from_op(
pformat.nest_from_op(
input_op='c)>', # closed due to cancel (see above)
tree_str=actor_info,
text=f'{actor}',
)
)
except BaseException as err:
@ -273,9 +169,9 @@ def _trio_main(
exit_status: str = (
'Main actor task exited due to crash?\n'
+
nest_from_op(
pformat.nest_from_op(
input_op='x)>', # closed by error
tree_str=actor_info,
text=f'{actor}',
)
)
# NOTE since we raise a tb will already be shown on the

View File

@ -23,7 +23,6 @@ import builtins
import importlib
from pprint import pformat
from pdb import bdb
import sys
from types import (
TracebackType,
)
@ -65,15 +64,29 @@ if TYPE_CHECKING:
from ._context import Context
from .log import StackLevelAdapter
from ._stream import MsgStream
from ._ipc import Channel
from .ipc import Channel
log = get_logger('tractor')
_this_mod = importlib.import_module(__name__)
class ActorFailure(Exception):
"General actor failure"
class RuntimeFailure(RuntimeError):
'''
General `Actor`-runtime failure due to,
- a bad runtime-env,
- failed spawning (bad input to process),
- API usage.
'''
class ActorFailure(RuntimeFailure):
'''
`Actor` failed to boot before/after spawn
'''
class InternalError(RuntimeError):
@ -126,6 +139,12 @@ class TrioTaskExited(Exception):
'''
class DebugRequestError(RuntimeError):
'''
Failed to request stdio lock from root actor!
'''
# NOTE: more or less should be close to these:
# 'boxed_type',
# 'src_type',
@ -191,6 +210,8 @@ def get_err_type(type_name: str) -> BaseException|None:
):
return type_ref
return None
def pack_from_raise(
local_err: (
@ -521,7 +542,6 @@ class RemoteActorError(Exception):
if val:
_repr += f'{key}={val_str}{end_char}'
return _repr
def reprol(self) -> str:
@ -600,56 +620,9 @@ class RemoteActorError(Exception):
the type name is already implicitly shown by python).
'''
header: str = ''
body: str = ''
message: str = ''
# XXX when the currently raised exception is this instance,
# we do not ever use the "type header" style repr.
is_being_raised: bool = False
if (
(exc := sys.exception())
and
exc is self
):
is_being_raised: bool = True
with_type_header: bool = (
with_type_header
and
not is_being_raised
)
# <RemoteActorError( .. )> style
if with_type_header:
header: str = f'<{type(self).__name__}('
if message := self._message:
# split off the first line so, if needed, it isn't
# indented the same like the "boxed content" which
# since there is no `.tb_str` is just the `.message`.
lines: list[str] = message.splitlines()
first: str = lines[0]
message: str = message.removeprefix(first)
# with a type-style header we,
# - have no special message "first line" extraction/handling
# - place the message a space in from the header:
# `MsgTypeError( <message> ..`
# ^-here
# - indent the `.message` inside the type body.
if with_type_header:
first = f' {first} )>'
message: str = textwrap.indent(
message,
prefix=' '*2,
)
message: str = first + message
# IFF there is an embedded traceback-str we always
# draw the ascii-box around it.
body: str = ''
if tb_str := self.tb_str:
fields: str = self._mk_fields_str(
_body_fields
@ -670,21 +643,15 @@ class RemoteActorError(Exception):
boxer_header=self.relay_uid,
)
tail = ''
if (
with_type_header
and not message
):
tail: str = '>'
return (
header
+
message
+
f'{body}'
+
tail
# !TODO, it'd be nice to import these top level without
# cycles!
from tractor.devx.pformat import (
pformat_exc,
)
return pformat_exc(
exc=self,
with_type_header=with_type_header,
body=body,
)
__repr__ = pformat
@ -962,7 +929,7 @@ class StreamOverrun(
'''
class TransportClosed(trio.BrokenResourceError):
class TransportClosed(Exception):
'''
IPC transport (protocol) connection was closed or broke and
indicates that the wrapping communication `Channel` can no longer
@ -973,24 +940,39 @@ class TransportClosed(trio.BrokenResourceError):
self,
message: str,
loglevel: str = 'transport',
cause: BaseException|None = None,
src_exc: Exception|None = None,
raise_on_report: bool = False,
) -> None:
self.message: str = message
self._loglevel = loglevel
self._loglevel: str = loglevel
super().__init__(message)
if cause is not None:
self.__cause__ = cause
self._src_exc = src_exc
# set the cause manually if not already set by python
if (
src_exc is not None
and
not self.__cause__
):
self.__cause__ = src_exc
# flag to toggle whether the msg loop should raise
# the exc in its `TransportClosed` handler block.
self._raise_on_report = raise_on_report
@property
def src_exc(self) -> Exception:
return (
self.__cause__
or
self._src_exc
)
def report_n_maybe_raise(
self,
message: str|None = None,
hide_tb: bool = True,
) -> None:
'''
@ -998,9 +980,10 @@ class TransportClosed(trio.BrokenResourceError):
for this error.
'''
__tracebackhide__: bool = hide_tb
message: str = message or self.message
# when a cause is set, slap it onto the log emission.
if cause := self.__cause__:
if cause := self.src_exc:
cause_tb_str: str = ''.join(
traceback.format_tb(cause.__traceback__)
)
@ -1009,13 +992,86 @@ class TransportClosed(trio.BrokenResourceError):
f' {cause}\n' # exc repr
)
getattr(log, self._loglevel)(message)
getattr(
log,
self._loglevel
)(message)
# some errors we want to blow up from
# inside the RPC msg loop
if self._raise_on_report:
raise self from cause
@classmethod
def repr_src_exc(
self,
src_exc: Exception|None = None,
) -> str:
if src_exc is None:
return '<unknown>'
src_msg: tuple[str] = src_exc.args
src_exc_repr: str = (
f'{type(src_exc).__name__}[ {src_msg} ]'
)
return src_exc_repr
def pformat(self) -> str:
from tractor.devx.pformat import (
pformat_exc,
)
return pformat_exc(
exc=self,
)
# delegate to `str`-ified pformat
__repr__ = pformat
@classmethod
def from_src_exc(
cls,
src_exc: (
Exception|
trio.ClosedResource|
trio.BrokenResourceError
),
message: str,
body: str = '',
**init_kws,
) -> TransportClosed:
'''
Convenience constructor for creation from an underlying
`trio`-sourced async-resource/chan/stream error.
Embeds the original `src_exc`'s repr within the
`Exception.args` via a first-line-in-`.message`-put-in-header
pre-processing and allows inserting additional content beyond
the main message via a `body: str`.
'''
repr_src_exc: str = cls.repr_src_exc(
src_exc,
)
next_line: str = f' src_exc: {repr_src_exc}\n'
if body:
body: str = textwrap.indent(
body,
prefix=' '*2,
)
return TransportClosed(
message=(
message
+
next_line
+
body
),
src_exc=src_exc,
**init_kws,
)
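# A hedged usage sketch of the convenience constructor above (the
# stream/handler names are hypothetical): wrap a low-level `trio`
# closure error so callers see a single `TransportClosed` with the
# source exc embedded in the message and chained via `__cause__`.
import trio

async def read_one(stream: trio.abc.ReceiveStream) -> bytes:
    try:
        return await stream.receive_some()
    except trio.ClosedResourceError as closure_err:
        raise TransportClosed.from_src_exc(
            src_exc=closure_err,
            message='IPC transport already closed locally?\n',
            loglevel='error',
        ) from closure_err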
class NoResult(RuntimeError):
"No final result is expected for this actor"
@ -1190,55 +1246,6 @@ def unpack_error(
return exc
def is_multi_cancelled(
exc: BaseException|BaseExceptionGroup,
ignore_nested: set[BaseException] = set(),
) -> bool|BaseExceptionGroup:
'''
Predicate to determine if a `BaseExceptionGroup` only contains
some (maybe nested) set of sub-grouped exceptions (like only
`trio.Cancelled`s which get swallowed silently by default) and is
thus the result of "gracefully cancelling" a collection of
sub-tasks (or other conc primitives) and receiving a "cancelled
ACK" from each after termination.
Docs:
----
- https://docs.python.org/3/library/exceptions.html#exception-groups
- https://docs.python.org/3/library/exceptions.html#BaseExceptionGroup.subgroup
'''
if (
not ignore_nested
or
trio.Cancelled in ignore_nested
# XXX always count-in `trio`'s native signal
):
ignore_nested.update({trio.Cancelled})
if isinstance(exc, BaseExceptionGroup):
matched_exc: BaseExceptionGroup|None = exc.subgroup(
tuple(ignore_nested),
# TODO, complain about why not allowed XD
# condition=tuple(ignore_nested),
)
if matched_exc is not None:
return matched_exc
# NOTE, IFF no exc-types match (throughout the error-tree)
# -> return `False`, OW return the matched sub-eg.
#
# IOW, for the inverse of ^ for the purpose of
# maybe-enter-REPL--logic: "only debug when the err-tree contains
# at least one exc-type NOT in `ignore_nested`" ; i.e. the case where
# we fallthrough and return `False` here.
return False
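# A small sketch of the predicate's contract (example values are
# hypothetical): with no `trio.Cancelled` anywhere in the group (and
# an empty ignore set) the call returns `False`; when every leaf type
# is in `ignore_nested` the matched sub-group is returned (truthy).
eg = BaseExceptionGroup(
    'mixed errors',
    [ValueError('boom'), KeyError('oops')],
)
assert is_multi_cancelled(eg) is False
assert is_multi_cancelled(
    eg,
    ignore_nested={KeyError, ValueError},
)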
def _raise_from_unexpected_msg(
ctx: Context,
msg: MsgType,

View File

@ -1,820 +0,0 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Inter-process comms abstractions
"""
from __future__ import annotations
from collections.abc import (
AsyncGenerator,
AsyncIterator,
)
from contextlib import (
asynccontextmanager as acm,
contextmanager as cm,
)
import platform
from pprint import pformat
import struct
import typing
from typing import (
Any,
Callable,
runtime_checkable,
Protocol,
Type,
TypeVar,
)
import msgspec
from tricycle import BufferedReceiveStream
import trio
from tractor.log import get_logger
from tractor._exceptions import (
MsgTypeError,
pack_from_raise,
TransportClosed,
_mk_send_mte,
_mk_recv_mte,
)
from tractor.msg import (
_ctxvar_MsgCodec,
# _codec, XXX see `self._codec` sanity/debug checks
MsgCodec,
types as msgtypes,
pretty_struct,
)
log = get_logger(__name__)
_is_windows = platform.system() == 'Windows'
def get_stream_addrs(
stream: trio.SocketStream
) -> tuple[
tuple[str, int], # local
tuple[str, int], # remote
]:
'''
Return the `trio` streaming transport prot's socket-addrs for
both the local and remote sides as a pair.
'''
# rn, should both be IP sockets
lsockname = stream.socket.getsockname()
rsockname = stream.socket.getpeername()
return (
tuple(lsockname[:2]),
tuple(rsockname[:2]),
)
# from tractor.msg.types import MsgType
# ?TODO? this should be our `Union[*msgtypes.__spec__]` alias now right..?
# => BLEH, except can't bc prots must inherit typevar or param-spec
# vars..
MsgType = TypeVar('MsgType')
# TODO: break up this mod into a subpkg so we can start adding new
# backends and move this type stuff into a dedicated file.. Bo
#
@runtime_checkable
class MsgTransport(Protocol[MsgType]):
#
# ^-TODO-^ consider using a generic def and indexing with our
# eventual msg definition/types?
# - https://docs.python.org/3/library/typing.html#typing.Protocol
stream: trio.SocketStream
drained: list[MsgType]
def __init__(self, stream: trio.SocketStream) -> None:
...
# XXX: should this instead be called `.sendall()`?
async def send(self, msg: MsgType) -> None:
...
async def recv(self) -> MsgType:
...
def __aiter__(self) -> MsgType:
...
def connected(self) -> bool:
...
# defining this sync otherwise it causes a mypy error because it
# can't figure out it's a generator i guess?..?
def drain(self) -> AsyncIterator[dict]:
...
@property
def laddr(self) -> tuple[str, int]:
...
@property
def raddr(self) -> tuple[str, int]:
...
# TODO: typing oddity.. not sure why we have to inherit here, but it
# seems to be an issue with `get_msg_transport()` returning
# a `Type[Protocol]`; probably should make a `mypy` issue?
class MsgpackTCPStream(MsgTransport):
'''
A ``trio.SocketStream`` delivering ``msgpack`` formatted data
using the ``msgspec`` codec lib.
'''
layer_key: int = 4
name_key: str = 'tcp'
# TODO: better naming for this?
# -[ ] check how libp2p does naming for such things?
codec_key: str = 'msgpack'
def __init__(
self,
stream: trio.SocketStream,
prefix_size: int = 4,
# XXX optionally provided codec pair for `msgspec`:
# https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types
#
# TODO: define this as a `Codec` struct which can be
# overridden dynamically by the application/runtime?
codec: tuple[
Callable[[Any], Any]|None, # coder
Callable[[type, Any], Any]|None, # decoder
]|None = None,
) -> None:
self.stream = stream
assert self.stream.socket
# should both be IP sockets
self._laddr, self._raddr = get_stream_addrs(stream)
# create read loop instance
self._aiter_pkts = self._iter_packets()
self._send_lock = trio.StrictFIFOLock()
# public i guess?
self.drained: list[dict] = []
self.recv_stream = BufferedReceiveStream(
transport_stream=stream
)
self.prefix_size = prefix_size
# allow for custom IPC msg interchange format
# dynamic override Bo
self._task = trio.lowlevel.current_task()
# XXX for ctxvar debug only!
# self._codec: MsgCodec = (
# codec
# or
# _codec._ctxvar_MsgCodec.get()
# )
async def _iter_packets(self) -> AsyncGenerator[dict, None]:
'''
Yield `bytes`-blob decoded packets from the underlying TCP
stream using the current task's `MsgCodec`.
This is a streaming routine implemented as an async generator
func (which was the original design, but could be changed?)
and is allocated by a `.__call__()` inside `.__init__()` where
it is assigned to the `._aiter_pkts` attr.
'''
decodes_failed: int = 0
while True:
try:
header: bytes = await self.recv_stream.receive_exactly(4)
except (
ValueError,
ConnectionResetError,
# not sure entirely why we need this but without it we
# seem to be getting racy failures here on
# arbiter/registry name subs..
trio.BrokenResourceError,
) as trans_err:
loglevel = 'transport'
match trans_err:
# case (
# ConnectionResetError()
# ):
# loglevel = 'transport'
# peer actor (graceful??) TCP EOF but `tricycle`
# seems to raise a 0-bytes-read?
case ValueError() if (
'unclean EOF' in trans_err.args[0]
):
pass
# peer actor (task) prolly shutdown quickly due
# to cancellation
case trio.BrokenResourceError() if (
'Connection reset by peer' in trans_err.args[0]
):
pass
# unless the disconnect condition falls under "a
# normal operation breakage" we usually console warn
# about it.
case _:
loglevel: str = 'warning'
raise TransportClosed(
message=(
f'IPC transport already closed by peer\n'
f'x]> {type(trans_err)}\n'
f' |_{self}\n'
),
loglevel=loglevel,
) from trans_err
# XXX definitely can happen if transport is closed
# manually by another `trio.lowlevel.Task` in the
# same actor; we use this in some simulated fault
# testing for ex, but generally should never happen
# under normal operation!
#
# NOTE: as such we always re-raise this error from the
# RPC msg loop!
except trio.ClosedResourceError as closure_err:
raise TransportClosed(
message=(
f'IPC transport already manually closed locally?\n'
f'x]> {type(closure_err)} \n'
f' |_{self}\n'
),
loglevel='error',
raise_on_report=(
closure_err.args[0] == 'another task closed this fd'
or
closure_err.args[0] in ['another task closed this fd']
),
) from closure_err
# graceful TCP EOF disconnect
if header == b'':
raise TransportClosed(
message=(
f'IPC transport already gracefully closed\n'
f']>\n'
f' |_{self}\n'
),
loglevel='transport',
# cause=??? # handy or no?
)
size: int
size, = struct.unpack("<I", header)
log.transport(f'received header {size}') # type: ignore
msg_bytes: bytes = await self.recv_stream.receive_exactly(size)
log.transport(f"received {msg_bytes}") # type: ignore
try:
# NOTE: lookup the `trio.Task.context`'s var for
# the current `MsgCodec`.
codec: MsgCodec = _ctxvar_MsgCodec.get()
# XXX for ctxvar debug only!
# if self._codec.pld_spec != codec.pld_spec:
# assert (
# task := trio.lowlevel.current_task()
# ) is not self._task
# self._task = task
# self._codec = codec
# log.runtime(
# f'Using new codec in {self}.recv()\n'
# f'codec: {self._codec}\n\n'
# f'msg_bytes: {msg_bytes}\n'
# )
yield codec.decode(msg_bytes)
# XXX NOTE: since the below error derives from
# `DecodeError` we need to catch is specially
# and always raise such that spec violations
# are never allowed to be caught silently!
except msgspec.ValidationError as verr:
msgtyperr: MsgTypeError = _mk_recv_mte(
msg=msg_bytes,
codec=codec,
src_validation_error=verr,
)
# XXX deliver up to `Channel.recv()` where
# a re-raise and `Error`-pack can inject the far
# end actor `.uid`.
yield msgtyperr
except (
msgspec.DecodeError,
UnicodeDecodeError,
):
if decodes_failed < 4:
# ignore decoding errors for now and assume they have to
# do with a channel drop - hope that receiving from the
# channel will raise an expected error and bubble up.
try:
msg_str: str|bytes = msg_bytes.decode()
except UnicodeDecodeError:
msg_str = msg_bytes
log.exception(
'Failed to decode msg?\n'
f'{codec}\n\n'
'Rxed bytes from wire:\n\n'
f'{msg_str!r}\n'
)
decodes_failed += 1
else:
raise
async def send(
self,
msg: msgtypes.MsgType,
strict_types: bool = True,
hide_tb: bool = False,
) -> None:
'''
Send a msgpack encoded py-object-blob-as-msg over TCP.
If `strict_types == True` then a `MsgTypeError` will be raised on any
invalid msg type
'''
__tracebackhide__: bool = hide_tb
# XXX see `trio._sync.AsyncContextManagerMixin` for details
# on the `.acquire()`/`.release()` sequencing..
async with self._send_lock:
# NOTE: lookup the `trio.Task.context`'s var for
# the current `MsgCodec`.
codec: MsgCodec = _ctxvar_MsgCodec.get()
# XXX for ctxvar debug only!
# if self._codec.pld_spec != codec.pld_spec:
# self._codec = codec
# log.runtime(
# f'Using new codec in {self}.send()\n'
# f'codec: {self._codec}\n\n'
# f'msg: {msg}\n'
# )
if type(msg) not in msgtypes.__msg_types__:
if strict_types:
raise _mk_send_mte(
msg,
codec=codec,
)
else:
log.warning(
'Sending non-`Msg`-spec msg?\n\n'
f'{msg}\n'
)
try:
bytes_data: bytes = codec.encode(msg)
except TypeError as _err:
typerr = _err
msgtyperr: MsgTypeError = _mk_send_mte(
msg,
codec=codec,
message=(
f'IPC-msg-spec violation in\n\n'
f'{pretty_struct.Struct.pformat(msg)}'
),
src_type_error=typerr,
)
raise msgtyperr from typerr
# supposedly the fastest says,
# https://stackoverflow.com/a/54027962
size: bytes = struct.pack("<I", len(bytes_data))
return await self.stream.send_all(size + bytes_data)
# ?TODO? does it help ever to dynamically show this
# frame?
# try:
# <the-above_code>
# except BaseException as _err:
# err = _err
# if not isinstance(err, MsgTypeError):
# __tracebackhide__: bool = False
# raise
@property
def laddr(self) -> tuple[str, int]:
return self._laddr
@property
def raddr(self) -> tuple[str, int]:
return self._raddr
async def recv(self) -> Any:
return await self._aiter_pkts.asend(None)
async def drain(self) -> AsyncIterator[dict]:
'''
Drain the stream's remaining messages sent from
the far end until the connection is closed by
the peer.
'''
try:
async for msg in self._iter_packets():
self.drained.append(msg)
except TransportClosed:
for msg in self.drained:
yield msg
def __aiter__(self):
return self._aiter_pkts
def connected(self) -> bool:
return self.stream.socket.fileno() != -1
def get_msg_transport(
key: tuple[str, str],
) -> Type[MsgTransport]:
return {
('msgpack', 'tcp'): MsgpackTCPStream,
}[key]
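# The framing used by `MsgpackTCPStream` above is a plain 4-byte
# little-endian length prefix followed by the msgpack payload. A
# standalone sketch of that wire format (purely illustrative, no
# `tractor` imports assumed):
import struct
import msgspec

payload: bytes = msgspec.msgpack.encode({'cmd': 'ping'})
frame: bytes = struct.pack("<I", len(payload)) + payload

# receiver side: read the 4-byte header, then exactly that many bytes
size, = struct.unpack("<I", frame[:4])
body: bytes = frame[4:4 + size]
assert msgspec.msgpack.decode(body) == {'cmd': 'ping'}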
class Channel:
'''
An inter-process channel for communication between (remote) actors.
Wraps a ``MsgStream``: transport + encoding IPC connection.
Currently we only support ``trio.SocketStream`` for transport
(aka TCP) and the ``msgpack`` interchange format via the ``msgspec``
codec library.
'''
def __init__(
self,
destaddr: tuple[str, int]|None,
msg_transport_type_key: tuple[str, str] = ('msgpack', 'tcp'),
# TODO: optional reconnection support?
# auto_reconnect: bool = False,
# on_reconnect: typing.Callable[..., typing.Awaitable] = None,
) -> None:
# self._recon_seq = on_reconnect
# self._autorecon = auto_reconnect
self._destaddr = destaddr
self._transport_key = msg_transport_type_key
# Either created in ``.connect()`` or passed in by
# user in ``.from_stream()``.
self._stream: trio.SocketStream|None = None
self._transport: MsgTransport|None = None
# set after handshake - always uid of far end
self.uid: tuple[str, str]|None = None
self._aiter_msgs = self._iter_msgs()
self._exc: Exception|None = None # set if far end actor errors
self._closed: bool = False
# flag set by ``Portal.cancel_actor()`` indicating remote
# (possibly peer) cancellation of the far end actor
# runtime.
self._cancel_called: bool = False
@property
def msgstream(self) -> MsgTransport:
log.info(
'`Channel.msgstream` is an old name, use `._transport`'
)
return self._transport
@property
def transport(self) -> MsgTransport:
return self._transport
@classmethod
def from_stream(
cls,
stream: trio.SocketStream,
**kwargs,
) -> Channel:
src, dst = get_stream_addrs(stream)
chan = Channel(
destaddr=dst,
**kwargs,
)
# set immediately here from provided instance
chan._stream: trio.SocketStream = stream
chan.set_msg_transport(stream)
return chan
def set_msg_transport(
self,
stream: trio.SocketStream,
type_key: tuple[str, str]|None = None,
# XXX optionally provided codec pair for `msgspec`:
# https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types
codec: MsgCodec|None = None,
) -> MsgTransport:
type_key = (
type_key
or
self._transport_key
)
# get transport type, then
self._transport = get_msg_transport(
type_key
# instantiate an instance of the msg-transport
)(
stream,
codec=codec,
)
return self._transport
@cm
def apply_codec(
self,
codec: MsgCodec,
) -> None:
'''
Temporarily override the underlying IPC msg codec for
dynamic enforcement of messaging schema.
'''
orig: MsgCodec = self._transport.codec
try:
self._transport.codec = codec
yield
finally:
self._transport.codec = orig
# TODO: do a .src/.dst: str for maddrs?
def __repr__(self) -> str:
if not self._transport:
return '<Channel with inactive transport?>'
return repr(
self._transport.stream.socket._sock
).replace( # type: ignore
"socket.socket",
"Channel",
)
@property
def laddr(self) -> tuple[str, int]|None:
return self._transport.laddr if self._transport else None
@property
def raddr(self) -> tuple[str, int]|None:
return self._transport.raddr if self._transport else None
async def connect(
self,
destaddr: tuple[Any, ...] | None = None,
**kwargs
) -> MsgTransport:
if self.connected():
raise RuntimeError("channel is already connected?")
destaddr = destaddr or self._destaddr
assert isinstance(destaddr, tuple)
stream = await trio.open_tcp_stream(
*destaddr,
**kwargs
)
transport = self.set_msg_transport(stream)
log.transport(
f'Opened channel[{type(transport)}]: {self.laddr} -> {self.raddr}'
)
return transport
# TODO: something like,
# `pdbp.hideframe_on(errors=[MsgTypeError])`
# instead of the `try/except` hack we have rn..
# seems like a pretty useful thing to have in general
# along with being able to filter certain stack frame(s / sets)
# possibly based on the current log-level?
async def send(
self,
payload: Any,
hide_tb: bool = False,
) -> None:
'''
Send a coded msg-blob over the transport.
'''
__tracebackhide__: bool = hide_tb
try:
log.transport(
'=> send IPC msg:\n\n'
f'{pformat(payload)}\n'
)
# assert self._transport # but why typing?
await self._transport.send(
payload,
hide_tb=hide_tb,
)
except BaseException as _err:
err = _err # bind for introspection
if not isinstance(_err, MsgTypeError):
# assert err
__tracebackhide__: bool = False
else:
assert err.cid
raise
async def recv(self) -> Any:
assert self._transport
return await self._transport.recv()
# TODO: auto-reconnect features like 0mq/nanomsg?
# -[ ] implement it manually with nods to SC prot
# possibly on multiple transport backends?
# -> seems like that might be re-inventing scalability
# prots tho no?
# try:
# return await self._transport.recv()
# except trio.BrokenResourceError:
# if self._autorecon:
# await self._reconnect()
# return await self.recv()
# raise
async def aclose(self) -> None:
log.transport(
f'Closing channel to {self.uid} '
f'{self.laddr} -> {self.raddr}'
)
assert self._transport
await self._transport.stream.aclose()
self._closed = True
async def __aenter__(self):
await self.connect()
return self
async def __aexit__(self, *args):
await self.aclose(*args)
def __aiter__(self):
return self._aiter_msgs
# ?TODO? run any reconnection sequence?
# -[ ] prolly should be impl-ed as deco-API?
#
# async def _reconnect(self) -> None:
# """Handle connection failures by polling until a reconnect can be
# established.
# """
# down = False
# while True:
# try:
# with trio.move_on_after(3) as cancel_scope:
# await self.connect()
# cancelled = cancel_scope.cancelled_caught
# if cancelled:
# log.transport(
# "Reconnect timed out after 3 seconds, retrying...")
# continue
# else:
# log.transport("Stream connection re-established!")
# # on_recon = self._recon_seq
# # if on_recon:
# # await on_recon(self)
# break
# except (OSError, ConnectionRefusedError):
# if not down:
# down = True
# log.transport(
# f"Connection to {self.raddr} went down, waiting"
# " for re-establishment")
# await trio.sleep(1)
async def _iter_msgs(
self
) -> AsyncGenerator[Any, None]:
'''
Yield `MsgType` IPC msgs decoded and delivered from
an underlying `MsgTransport` protocol.
This is a streaming routine also implemented as an async-gen
func (same as `MsgTransport._iter_pkts()`) and gets allocated by
a `.__call__()` inside `.__init__()` where it is assigned to
the `._aiter_msgs` attr.
'''
assert self._transport
while True:
try:
async for msg in self._transport:
match msg:
# NOTE: if transport/interchange delivers
# a type error, we pack it with the far
# end peer `Actor.uid` and relay the
# `Error`-msg upward to the `._rpc` stack
# for normal RAE handling.
case MsgTypeError():
yield pack_from_raise(
local_err=msg,
cid=msg.cid,
# XXX we pack it here bc lower
# layers have no notion of an
# actor-id ;)
src_uid=self.uid,
)
case _:
yield msg
except trio.BrokenResourceError:
# if not self._autorecon:
raise
await self.aclose()
# if self._autorecon: # attempt reconnect
# await self._reconnect()
# continue
def connected(self) -> bool:
return self._transport.connected() if self._transport else False
@acm
async def _connect_chan(
host: str,
port: int
) -> typing.AsyncGenerator[Channel, None]:
'''
Create and connect a channel with disconnect on context manager
teardown.
'''
chan = Channel((host, port))
await chan.connect()
yield chan
with trio.CancelScope(shield=True):
await chan.aclose()
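# A hedged usage sketch of the `@acm` above (host/port values are
# hypothetical): open a one-off channel, check liveness, and let the
# manager close it (with a shielded scope) on exit.
async def ping(host: str, port: int) -> bool:
    async with _connect_chan(host, port) as chan:
        return chan.connected()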

View File

@ -39,11 +39,14 @@ import warnings
import trio
from .trionics import maybe_open_nursery
from .trionics import (
maybe_open_nursery,
collapse_eg,
)
from ._state import (
current_actor,
)
from ._ipc import Channel
from .ipc import Channel
from .log import get_logger
from .msg import (
# Error,
@ -52,8 +55,8 @@ from .msg import (
Return,
)
from ._exceptions import (
# unpack_error,
NoResult,
TransportClosed,
)
from ._context import (
Context,
@ -107,10 +110,18 @@ class Portal:
# point.
self._expect_result_ctx: Context|None = None
self._streams: set[MsgStream] = set()
# TODO, this should be PRIVATE (and never used publicly)! since it's just
# a cached ref to the local runtime instead of calling
# `current_actor()` everywhere.. XD
self.actor: Actor = current_actor()
@property
def chan(self) -> Channel:
'''
Ref to this ctx's underlying `tractor.ipc.Channel`.
'''
return self._chan
@property
@ -170,10 +181,17 @@ class Portal:
# not expecting a "main" result
if self._expect_result_ctx is None:
peer_id: str = f'{self.channel.aid.reprol()!r}'
log.warning(
f"Portal for {self.channel.uid} not expecting a final"
" result?\nresult() should only be called if subactor"
" was spawned with `ActorNursery.run_in_actor()`")
f'Portal to peer {peer_id} will not deliver a final result?\n'
f'\n'
f'Context.result() can only be called by the parent of '
f'a sub-actor when it was spawned with '
f'`ActorNursery.run_in_actor()`'
f'\n'
f'Further, this `ActorNursery`-method-API will be deprecated in '
f'the near future!\n'
)
return NoResult
# expecting a "main" result
@ -206,6 +224,7 @@ class Portal:
typname: str = type(self).__name__
log.warning(
f'`{typname}.result()` is DEPRECATED!\n'
f'\n'
f'Use `{typname}.wait_for_result()` instead!\n'
)
return await self.wait_for_result(
@ -217,8 +236,10 @@ class Portal:
# terminate all locally running async generator
# IPC calls
if self._streams:
log.cancel(
f"Cancelling all streams with {self.channel.uid}")
peer_id: str = f'{self.channel.aid.reprol()!r}'
report: str = (
f'Cancelling all msg-streams with {peer_id}\n'
)
for stream in self._streams.copy():
try:
await stream.aclose()
@ -227,10 +248,18 @@ class Portal:
# (unless of course at some point down the road we
# won't expect this to always be the case or need to
# detect it for respawning purposes?)
log.debug(f"{stream} was already closed.")
report += (
f'->) {stream!r} already closed\n'
)
log.cancel(report)
async def aclose(self):
log.debug(f"Closing {self}")
log.debug(
f'Closing portal\n'
f'>}}\n'
f'|_{self}\n'
)
# TODO: once we move to implementing our own `ReceiveChannel`
# (including remote task cancellation inside its `.aclose()`)
# we'll need to .aclose all those channels here
@ -256,23 +285,22 @@ class Portal:
__runtimeframe__: int = 1 # noqa
chan: Channel = self.channel
peer_id: str = f'{self.channel.aid.reprol()!r}'
if not chan.connected():
log.runtime(
'This channel is already closed, skipping cancel request..'
f'Peer {peer_id} is already disconnected\n'
'-> skipping cancel request..\n'
)
return False
reminfo: str = (
f'c)=> {self.channel.uid}\n'
f' |_{chan}\n'
)
log.cancel(
f'Requesting actor-runtime cancel for peer\n\n'
f'{reminfo}'
f'Sending actor-runtime-cancel-req to peer\n'
f'\n'
f'c)=> {peer_id}\n'
)
# XXX the one spot we set it?
self.channel._cancel_called: bool = True
chan._cancel_called: bool = True
try:
# send cancel cmd - might not get response
# XXX: sure would be nice to make this work with
@ -293,22 +321,43 @@ class Portal:
# may timeout and we never get an ack (obvi racy)
# but that doesn't mean it wasn't cancelled.
log.debug(
'May have failed to cancel peer?\n'
f'{reminfo}'
f'May have failed to cancel peer?\n'
f'\n'
f'c)=?> {peer_id}\n'
)
# if we get here some weird cancellation case happened
return False
except (
# XXX, should never really get raised unless we aren't
# wrapping them in the below type by mistake?
#
# Leaving the catch here for now until we're very sure
# all the cases (for various tpt protos) have indeed been
# re-wrapped ;p
trio.ClosedResourceError,
trio.BrokenResourceError,
):
log.debug(
'IPC chan for actor already closed or broken?\n\n'
f'{self.channel.uid}\n'
f' |_{self.channel}\n'
TransportClosed,
) as tpt_err:
ipc_borked_report: str = (
f'IPC for actor already closed/broken?\n\n'
f'\n'
f'c)=x> {peer_id}\n'
)
match tpt_err:
case TransportClosed():
log.debug(ipc_borked_report)
case _:
ipc_borked_report += (
f'\n'
f'Unhandled low-level transport-closed/error during\n'
f'`Portal.cancel_actor()` request?\n'
f'<{type(tpt_err).__name__}( {tpt_err} )>\n'
)
log.warning(ipc_borked_report)
return False
# TODO: do we still need this for low level `Actor`-runtime
@ -464,10 +513,13 @@ class Portal:
with trio.CancelScope(shield=True):
await ctx.cancel()
except trio.ClosedResourceError:
except trio.ClosedResourceError as cre:
# if the far end terminates before we send a cancel the
# underlying transport-channel may already be closed.
log.cancel(f'Context {ctx} was already closed?')
log.cancel(
f'Context.cancel() -> {cre!r}\n'
f'cid: {ctx.cid!r} already closed?\n'
)
# XXX: should this always be done?
# await recv_chan.aclose()
@ -504,8 +556,12 @@ class LocalPortal:
return its result.
'''
obj = self.actor if ns == 'self' else importlib.import_module(ns)
func = getattr(obj, func_name)
obj = (
self.actor
if ns == 'self'
else importlib.import_module(ns)
)
func: Callable = getattr(obj, func_name)
return await func(**kwargs)
@ -530,30 +586,30 @@ async def open_portal(
assert actor
was_connected: bool = False
async with maybe_open_nursery(
tn,
shield=shield,
strict_exception_groups=False,
# ^XXX^ TODO? soo roll our own then ??
# -> since we kinda want the "if only one `.exception` then
# just raise that" interface?
) as tn:
async with (
collapse_eg(),
maybe_open_nursery(
tn,
shield=shield,
) as tn,
):
if not channel.connected():
await channel.connect()
was_connected = True
if channel.uid is None:
await actor._do_handshake(channel)
if channel.aid is None:
await channel._do_handshake(
aid=actor.aid,
)
msg_loop_cs: trio.CancelScope|None = None
if start_msg_loop:
from ._runtime import process_messages
from . import _rpc
msg_loop_cs = await tn.start(
partial(
process_messages,
actor,
channel,
_rpc.process_messages,
chan=channel,
# if the local task is cancelled we want to keep
# the msg loop running until our block ends
shield=True,

View File

@ -18,7 +18,9 @@
Root actor runtime ignition(s).
'''
from contextlib import asynccontextmanager as acm
from contextlib import (
asynccontextmanager as acm,
)
from functools import partial
import importlib
import inspect
@ -26,96 +28,55 @@ import logging
import os
import signal
import sys
from typing import Callable
from typing import (
Any,
Callable,
)
import warnings
import trio
from ._runtime import (
Actor,
Arbiter,
# TODO: rename and make a non-actor subtype?
# Arbiter as Registry,
async_main,
from . import _runtime
from .devx import (
debug,
_frame_stack,
pformat as _pformat,
)
from .devx import _debug
from . import _spawn
from . import _state
from . import log
from ._ipc import _connect_chan
from ._exceptions import is_multi_cancelled
# set at startup and after forks
_default_host: str = '127.0.0.1'
_default_port: int = 1616
# default registry always on localhost
_default_lo_addrs: list[tuple[str, int]] = [(
_default_host,
_default_port,
)]
from .ipc import (
_connect_chan,
)
from ._addr import (
Address,
UnwrappedAddress,
default_lo_addrs,
mk_uuid,
wrap_address,
)
from .trionics import (
is_multi_cancelled,
collapse_eg,
)
from ._exceptions import (
RuntimeFailure,
)
logger = log.get_logger('tractor')
# TODO: stick this in a `@acm` defined in `devx.debug`?
# -[ ] also maybe consider making this a `wrapt`-deco to
# save an indent level?
#
@acm
async def open_root_actor(
*,
# defaults are above
registry_addrs: list[tuple[str, int]]|None = None,
# defaults are above
arbiter_addr: tuple[str, int]|None = None,
name: str|None = 'root',
# either the `multiprocessing` start method:
# https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
# OR `trio` (the new default).
start_method: _spawn.SpawnMethodKey|None = None,
# enables the multi-process debugger support
debug_mode: bool = False,
maybe_enable_greenback: bool = True, # `.pause_from_sync()/breakpoint()` support
enable_stack_on_sig: bool = False,
# internal logging
loglevel: str|None = None,
enable_modules: list|None = None,
rpc_module_paths: list|None = None,
# NOTE: allow caller to ensure that only one registry exists
# and that this call creates it.
ensure_registry: bool = False,
hide_tb: bool = True,
# XXX, proxied directly to `.devx._debug._maybe_enter_pm()`
# for REPL-entry logic.
debug_filter: Callable[
[BaseException|BaseExceptionGroup],
bool,
] = lambda err: not is_multi_cancelled(err),
# TODO, a way for actors to augment passing derived
# read-only state to sublayers?
# extra_rt_vars: dict|None = None,
) -> Actor:
'''
Runtime init entry point for ``tractor``.
'''
_debug.hide_runtime_frames()
__tracebackhide__: bool = hide_tb
# TODO: stick this in a `@cm` defined in `devx._debug`?
#
async def maybe_block_bp(
debug_mode: bool,
maybe_enable_greenback: bool,
) -> bool:
# Override the global debugger hook to make it play nice with
# ``trio``, see much discussion in:
# https://github.com/python-trio/trio/issues/1155#issuecomment-742964018
@ -124,23 +85,25 @@ async def open_root_actor(
'PYTHONBREAKPOINT',
None,
)
bp_blocked: bool
if (
debug_mode
and maybe_enable_greenback
and (
maybe_mod := await _debug.maybe_init_greenback(
maybe_mod := await debug.maybe_init_greenback(
raise_not_found=False,
)
)
):
logger.info(
f'Found `greenback` installed @ {maybe_mod}\n'
'Enabling `tractor.pause_from_sync()` support!\n'
f'Enabling `tractor.pause_from_sync()` support!\n'
)
os.environ['PYTHONBREAKPOINT'] = (
'tractor.devx._debug._sync_pause_from_builtin'
'tractor.devx.debug._sync_pause_from_builtin'
)
_state._runtime_vars['use_greenback'] = True
bp_blocked = False
else:
# TODO: disable `breakpoint()` by default (without
@ -159,302 +122,481 @@ async def open_root_actor(
# lol ok,
# https://docs.python.org/3/library/sys.html#sys.breakpointhook
os.environ['PYTHONBREAKPOINT'] = "0"
bp_blocked = True
# attempt to retrieve ``trio``'s sigint handler and stash it
# on our debugger lock state.
_debug.DebugStatus._trio_handler = signal.getsignal(signal.SIGINT)
# mark top most level process as root actor
_state._runtime_vars['_is_root'] = True
# caps based rpc list
enable_modules = (
enable_modules
or
[]
)
if rpc_module_paths:
warnings.warn(
"`rpc_module_paths` is now deprecated, use "
" `enable_modules` instead.",
DeprecationWarning,
stacklevel=2,
)
enable_modules.extend(rpc_module_paths)
if start_method is not None:
_spawn.try_set_start_method(start_method)
if arbiter_addr is not None:
warnings.warn(
'`arbiter_addr` is now deprecated\n'
'Use `registry_addrs: list[tuple]` instead..',
DeprecationWarning,
stacklevel=2,
)
registry_addrs = [arbiter_addr]
registry_addrs: list[tuple[str, int]] = (
registry_addrs
or
_default_lo_addrs
)
assert registry_addrs
loglevel = (
loglevel
or log._default_loglevel
).upper()
if (
debug_mode
and _spawn._spawn_method == 'trio'
):
_state._runtime_vars['_debug_mode'] = True
# expose internal debug module to every actor allowing for
# use of ``await tractor.pause()``
enable_modules.append('tractor.devx._debug')
# if debug mode gets enabled *at least* use that level of
# logging for some informative console prompts.
if (
logging.getLevelName(
# lul, need the upper case for the -> int map?
# sweet "dynamic function behaviour" stdlib...
loglevel,
) > logging.getLevelName('PDB')
):
loglevel = 'PDB'
elif debug_mode:
raise RuntimeError(
"Debug mode is only supported for the `trio` backend!"
)
assert loglevel
_log = log.get_console_log(loglevel)
assert _log
# TODO: factor this into `.devx._stackscope`!!
if (
debug_mode
and
enable_stack_on_sig
):
from .devx._stackscope import enable_stack_on_sig
enable_stack_on_sig()
# closed into below ping task-func
ponged_addrs: list[tuple[str, int]] = []
async def ping_tpt_socket(
addr: tuple[str, int],
timeout: float = 1,
) -> None:
'''
Attempt temporary connection to see if a registry is
listening at the requested address by a transport layer
ping.
If a connection can't be made quickly we assume no server
is listening at that addr.
'''
try:
# TODO: this connect-and-bail forces us to have to
# carefully rewrap TCP 104-connection-reset errors as
# EOF so as to avoid propagating cancel-causing errors
# to the channel-msg loop machinery. Likely it would
# be better to eventually have a "discovery" protocol
# with basic handshake instead?
with trio.move_on_after(timeout):
async with _connect_chan(*addr):
ponged_addrs.append(addr)
except OSError:
# TODO: make this a "discovery" log level?
logger.info(
f'No actor registry found @ {addr}\n'
)
async with trio.open_nursery() as tn:
for addr in registry_addrs:
tn.start_soon(
ping_tpt_socket,
tuple(addr), # TODO: just drop this requirement?
)
trans_bind_addrs: list[tuple[str, int]] = []
# Create a new local root-actor instance which IS NOT THE
# REGISTRAR
if ponged_addrs:
if ensure_registry:
raise RuntimeError(
f'Failed to open `{name}`@{ponged_addrs}: '
'registry socket(s) already bound'
)
# we were able to connect to an arbiter
logger.info(
f'Registry(s) seem(s) to exist @ {ponged_addrs}'
)
actor = Actor(
name=name or 'anonymous',
registry_addrs=ponged_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
# DO NOT use the registry_addrs as the transport server
# addrs for this new non-registar, root-actor.
for host, port in ponged_addrs:
# NOTE: zero triggers dynamic OS port allocation
trans_bind_addrs.append((host, 0))
# Start this local actor as the "registrar", aka a regular
# actor who manages the local registry of "mailboxes" of
# other process-tree-local sub-actors.
else:
# NOTE that if the current actor IS THE REGISTAR, the
# following init steps are taken:
# - the transport layer server is bound to each (host, port)
# pair defined in provided registry_addrs, or the default.
trans_bind_addrs = registry_addrs
# - it is normally desirable for any registrar to stay up
# indefinitely until either all registered (child/sub)
# actors are terminated (via SC supervision) or,
# a re-election process has taken place.
# NOTE: all of ^ which is not implemented yet - see:
# https://github.com/goodboy/tractor/issues/216
# https://github.com/goodboy/tractor/pull/348
# https://github.com/goodboy/tractor/issues/296
actor = Arbiter(
name or 'registrar',
registry_addrs=registry_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
# XXX, in case the root actor runtime was actually run from
# `tractor.to_asyncio.run_as_asyncio_guest()` and NOT
# `.trio.run()`.
actor._infected_aio = _state._runtime_vars['_is_infected_aio']
# Start up main task set via core actor-runtime nurseries.
try:
# assign process-local actor
_state._current_actor = actor
# start local channel-server and fake the portal API
# NOTE: this won't block since we provide the nursery
ml_addrs_str: str = '\n'.join(
f'@{addr}' for addr in trans_bind_addrs
)
logger.info(
f'Starting local {actor.uid} on the following transport addrs:\n'
f'{ml_addrs_str}'
)
# start the actor runtime in a new task
async with trio.open_nursery(
strict_exception_groups=False,
# ^XXX^ TODO? instead unpack any RAE as per "loose" style?
) as nursery:
# ``_runtime.async_main()`` creates an internal nursery
# and blocks here until any underlying actor(-process)
# tree has terminated thereby conducting so called
# "end-to-end" structured concurrency throughout an
# entire hierarchical python sub-process set; all
# "actor runtime" primitives are SC-compat and thus all
# transitively spawned actors/processes must be as
# well.
await nursery.start(
partial(
async_main,
actor,
accept_addrs=trans_bind_addrs,
parent_addr=None
)
)
try:
yield actor
except (
Exception,
BaseExceptionGroup,
) as err:
# TODO, in beginning to handle the subsubactor with
# crashed grandparent cases..
#
# was_locked: bool = await _debug.maybe_wait_for_debugger(
# child_in_debug=True,
# )
# XXX NOTE XXX see equiv note inside
# `._runtime.Actor._stream_handler()` where in the
# non-root or root-that-opened-this-manually case we
# wait for the local actor-nursery to exit before
# exiting the transport channel handler.
entered: bool = await _debug._maybe_enter_pm(
err,
api_frame=inspect.currentframe(),
debug_filter=debug_filter,
)
if (
not entered
and
not is_multi_cancelled(
err,
)
):
logger.exception('Root actor crashed\n')
# ALWAYS re-raise any error bubbled up from the
# runtime!
raise
finally:
# NOTE: not sure if we'll ever need this but it's
# possibly better for even more determinism?
# logger.cancel(
# f'Waiting on {len(nurseries)} nurseries in root..')
# nurseries = actor._actoruid2nursery.values()
# async with trio.open_nursery() as tempn:
# for an in nurseries:
# tempn.start_soon(an.exited.wait)
logger.info(
'Closing down root actor'
)
await actor.cancel(None) # self cancel
yield bp_blocked
finally:
_state._current_actor = None
_state._last_actor_terminated = actor
# restore any prior built-in `breakpoint()` hook state
if builtin_bp_handler is not None:
sys.breakpointhook = builtin_bp_handler
if orig_bp_path is not None:
os.environ['PYTHONBREAKPOINT'] = orig_bp_path
else:
# clear env back to having no entry
os.environ.pop('PYTHONBREAKPOINT', None)
@acm
async def open_root_actor(
*,
# defaults are above
registry_addrs: list[UnwrappedAddress]|None = None,
# defaults are above
arbiter_addr: tuple[UnwrappedAddress]|None = None,
enable_transports: list[
# TODO, this should eventually be the pairs as
# defined by (codec, proto) as on `MsgTransport.
_state.TransportProtocolKey,
]|None = None,
name: str|None = 'root',
# either the `multiprocessing` start method:
# https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
# OR `trio` (the new default).
start_method: _spawn.SpawnMethodKey|None = None,
# enables the multi-process debugger support
debug_mode: bool = False,
maybe_enable_greenback: bool = False, # `.pause_from_sync()/breakpoint()` support
# ^XXX NOTE^ the perf implications of use,
# https://greenback.readthedocs.io/en/latest/principle.html#performance
enable_stack_on_sig: bool = False,
# internal logging
loglevel: str|None = None,
enable_modules: list|None = None,
rpc_module_paths: list|None = None,
# NOTE: allow caller to ensure that only one registry exists
# and that this call creates it.
ensure_registry: bool = False,
hide_tb: bool = True,
# XXX, proxied directly to `.devx.debug._maybe_enter_pm()`
# for REPL-entry logic.
debug_filter: Callable[
[BaseException|BaseExceptionGroup],
bool,
] = lambda err: not is_multi_cancelled(err),
# TODO, a way for actors to augment passing derived
# read-only state to sublayers?
# extra_rt_vars: dict|None = None,
) -> _runtime.Actor:
'''
Initialize the `tractor` runtime by starting a "root actor" in
a parent-most Python process.
All (disjoint) actor-process-trees-as-programs are created via
this entrypoint.
'''
# XXX NEVER allow nested actor-trees!
if already_actor := _state.current_actor(
err_on_no_runtime=False,
):
rtvs: dict[str, Any] = _state._runtime_vars
root_mailbox: list[str, int] = rtvs['_root_mailbox']
registry_addrs: list[list[str, int]] = rtvs['_registry_addrs']
raise RuntimeFailure(
f'A current actor already exists !?\n'
f'({already_actor}\n'
f'\n'
f'You can NOT open a second root actor from within '
f'an existing tree and the current root of this '
f'already exists !!\n'
f'\n'
f'_root_mailbox: {root_mailbox!r}\n'
f'_registry_addrs: {registry_addrs!r}\n'
)
async with maybe_block_bp(
debug_mode=debug_mode,
maybe_enable_greenback=maybe_enable_greenback,
):
if enable_transports is None:
enable_transports: list[str] = _state.current_ipc_protos()
else:
_state._runtime_vars['_enable_tpts'] = enable_transports
# TODO! support multi-tpts per actor!
# Bo
if not len(enable_transports) == 1:
raise RuntimeError(
f'No multi-tpt support yet!\n'
f'enable_transports={enable_transports!r}\n'
)
_frame_stack.hide_runtime_frames()
__tracebackhide__: bool = hide_tb
# attempt to retrieve ``trio``'s sigint handler and stash it
# on our debugger lock state.
debug.DebugStatus._trio_handler = signal.getsignal(signal.SIGINT)
# mark top most level process as root actor
_state._runtime_vars['_is_root'] = True
# caps based rpc list
enable_modules = (
enable_modules
or
[]
)
if rpc_module_paths:
warnings.warn(
"`rpc_module_paths` is now deprecated, use "
" `enable_modules` instead.",
DeprecationWarning,
stacklevel=2,
)
enable_modules.extend(rpc_module_paths)
if start_method is not None:
_spawn.try_set_start_method(start_method)
# TODO! remove this ASAP!
if arbiter_addr is not None:
warnings.warn(
'`arbiter_addr` is now deprecated\n'
'Use `registry_addrs: list[tuple]` instead..',
DeprecationWarning,
stacklevel=2,
)
uw_reg_addrs = [arbiter_addr]
uw_reg_addrs = registry_addrs
if not uw_reg_addrs:
uw_reg_addrs: list[UnwrappedAddress] = default_lo_addrs(
enable_transports
)
# must exist by now since all below code is dependent
assert uw_reg_addrs
registry_addrs: list[Address] = [
wrap_address(uw_addr)
for uw_addr in uw_reg_addrs
]
loglevel = (
loglevel
or log._default_loglevel
).upper()
# restore built-in `breakpoint()` hook state
if (
debug_mode
and
maybe_enable_greenback
_spawn._spawn_method == 'trio'
):
if builtin_bp_handler is not None:
sys.breakpointhook = builtin_bp_handler
_state._runtime_vars['_debug_mode'] = True
if orig_bp_path is not None:
os.environ['PYTHONBREAKPOINT'] = orig_bp_path
# expose internal debug module to every actor allowing for
# use of ``await tractor.pause()``
enable_modules.append('tractor.devx.debug._tty_lock')
else:
# clear env back to having no entry
os.environ.pop('PYTHONBREAKPOINT', None)
# if debug mode gets enabled *at least* use that level of
# logging for some informative console prompts.
if (
logging.getLevelName(
# lul, need the upper case for the -> int map?
# sweet "dynamic function behaviour" stdlib...
loglevel,
) > logging.getLevelName('PDB')
):
loglevel = 'PDB'
logger.runtime("Root actor terminated")
elif debug_mode:
raise RuntimeError(
"Debug mode is only supported for the `trio` backend!"
)
assert loglevel
_log = log.get_console_log(loglevel)
assert _log
# TODO: factor this into `.devx._stackscope`!!
if (
debug_mode
and
enable_stack_on_sig
):
from .devx._stackscope import enable_stack_on_sig
enable_stack_on_sig()
# closed into below ping task-func
ponged_addrs: list[Address] = []
async def ping_tpt_socket(
addr: Address,
timeout: float = 1,
) -> None:
'''
Attempt temporary connection to see if a registry is
listening at the requested address by a transport layer
ping.
If a connection can't be made quickly we assume no server
is listening at that addr.
'''
try:
# TODO: this connect-and-bail forces us to have to
# carefully rewrap TCP 104-connection-reset errors as
# EOF so as to avoid propagating cancel-causing errors
# to the channel-msg loop machinery. Likely it would
# be better to eventually have a "discovery" protocol
# with basic handshake instead?
with trio.move_on_after(timeout):
async with _connect_chan(addr.unwrap()):
ponged_addrs.append(addr)
except OSError:
# ?TODO, make this a "discovery" log level?
logger.info(
f'No root-actor registry found @ {addr!r}\n'
)
# !TODO, this is basically just another (abstract)
# happy-eyeballs, so we should try to formalize it somewhere
# in a `.[_]discovery` ya?
#
async with trio.open_nursery() as tn:
for uw_addr in uw_reg_addrs:
addr: Address = wrap_address(uw_addr)
tn.start_soon(
ping_tpt_socket,
addr,
)
trans_bind_addrs: list[UnwrappedAddress] = []
# Create a new local root-actor instance which IS NOT THE
# REGISTRAR
if ponged_addrs:
if ensure_registry:
raise RuntimeError(
f'Failed to open `{name}`@{ponged_addrs}: '
'registry socket(s) already bound'
)
# we were able to connect to an arbiter
logger.info(
f'Registry(s) seem(s) to exist @ {ponged_addrs}'
)
actor = _runtime.Actor(
name=name or 'anonymous',
uuid=mk_uuid(),
registry_addrs=ponged_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
# **DO NOT** use the registry_addrs as the
# ipc-transport-server's bind-addrs as this is
# a new NON-registrar, ROOT-actor.
#
# XXX INSTEAD, bind random addrs using the same tpt
# proto.
for addr in ponged_addrs:
trans_bind_addrs.append(
addr.get_random(
bindspace=addr.bindspace,
)
)
# Start this local actor as the "registrar", aka a regular
# actor who manages the local registry of "mailboxes" of
# other process-tree-local sub-actors.
else:
# NOTE that if the current actor IS THE REGISTAR, the
# following init steps are taken:
# - the transport layer server is bound to each addr
# pair defined in provided registry_addrs, or the default.
trans_bind_addrs = uw_reg_addrs
# - it is normally desirable for any registrar to stay up
# indefinitely until either all registered (child/sub)
# actors are terminated (via SC supervision) or,
# a re-election process has taken place.
# NOTE: all of ^ which is not implemented yet - see:
# https://github.com/goodboy/tractor/issues/216
# https://github.com/goodboy/tractor/pull/348
# https://github.com/goodboy/tractor/issues/296
# TODO: rename as `RootActor` or is that even necessary?
actor = _runtime.Arbiter(
name=name or 'registrar',
uuid=mk_uuid(),
registry_addrs=registry_addrs,
loglevel=loglevel,
enable_modules=enable_modules,
)
# XXX, in case the root actor runtime was actually run from
# `tractor.to_asyncio.run_as_asyncio_guest()` and NOT
# `.trio.run()`.
actor._infected_aio = _state._runtime_vars['_is_infected_aio']
# NOTE, only set the loopback addr for the
# process-tree-global "root" mailbox since all sub-actors
# should be able to speak to their root actor over that
# channel.
raddrs: list[Address] = _state._runtime_vars['_root_addrs']
raddrs.extend(trans_bind_addrs)
# TODO, remove once we have also removed all usage;
# eventually all (root-)registry apis should expect > 1 addr.
_state._runtime_vars['_root_mailbox'] = raddrs[0]
# Start up main task set via core actor-runtime nurseries.
try:
# assign process-local actor
_state._current_actor = actor
# start local channel-server and fake the portal API
# NOTE: this won't block since we provide the nursery
report: str = f'Starting actor-runtime for {actor.aid.reprol()!r}\n'
if reg_addrs := actor.registry_addrs:
report += (
'-> Opening new registry @ '
+
'\n'.join(
f'{addr}' for addr in reg_addrs
)
)
logger.info(f'{report}\n')
# start runtime in a bg sub-task, yield to caller.
async with (
collapse_eg(),
trio.open_nursery() as root_tn,
# ?TODO? finally-footgun below?
# -> see note on why shielding.
# maybe_raise_from_masking_exc(),
):
actor._root_tn = root_tn
# `_runtime.async_main()` creates an internal nursery
# and blocks here until any underlying actor(-process)
# tree has terminated thereby conducting so called
# "end-to-end" structured concurrency throughout an
# entire hierarchical python sub-process set; all
# "actor runtime" primitives are SC-compat and thus all
# transitively spawned actors/processes must be as
# well.
await root_tn.start(
partial(
_runtime.async_main,
actor,
accept_addrs=trans_bind_addrs,
parent_addr=None
)
)
try:
yield actor
except (
Exception,
BaseExceptionGroup,
) as err:
# TODO, in beginning to handle the subsubactor with
# crashed grandparent cases..
#
# was_locked: bool = await debug.maybe_wait_for_debugger(
# child_in_debug=True,
# )
# XXX NOTE XXX see equiv note inside
# `._runtime.Actor._stream_handler()` where in the
# non-root or root-that-opened-this-manually case we
# wait for the local actor-nursery to exit before
# exiting the transport channel handler.
entered: bool = await debug._maybe_enter_pm(
err,
api_frame=inspect.currentframe(),
debug_filter=debug_filter,
# XXX NOTE, required to debug root-actor
# crashes under cancellation conditions; so
# most of them!
shield=root_tn.cancel_scope.cancel_called,
)
if (
not entered
and
not is_multi_cancelled(
err,
)
):
logger.exception(
'Root actor crashed\n'
f'>x)\n'
f' |_{actor}\n'
)
# ALWAYS re-raise any error bubbled up from the
# runtime!
raise
finally:
# NOTE/TODO?, not sure if we'll ever need this but it's
# possibly better for even more determinism?
# logger.cancel(
# f'Waiting on {len(nurseries)} nurseries in root..')
# nurseries = actor._actoruid2nursery.values()
# async with trio.open_nursery() as tempn:
# for an in nurseries:
# tempn.start_soon(an.exited.wait)
op_nested_actor_repr: str = _pformat.nest_from_op(
input_op='>) ',
text=actor.pformat(),
nest_prefix='|_',
)
logger.info(
f'Closing down root actor\n'
f'{op_nested_actor_repr}'
)
# XXX, THIS IS A *finally-footgun*!
# (also mentioned in with-block above)
# -> though already shields internally it can
# taskc here and mask underlying errors raised in
# the try-block above?
with trio.CancelScope(shield=True):
await actor.cancel(None) # self cancel
finally:
# revert all process-global runtime state
if (
debug_mode
and
_spawn._spawn_method == 'trio'
):
_state._runtime_vars['_debug_mode'] = False
_state._current_actor = None
_state._last_actor_terminated = actor
sclang_repr: str = _pformat.nest_from_op(
input_op=')>',
text=actor.pformat(),
nest_prefix='|_',
nest_indent=1,
)
logger.info(
f'Root actor terminated\n'
f'{sclang_repr}'
)
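# A hedged usage sketch of the entrypoint above (names/flags are
# assumptions based on the signature, not prescribed by this diff):
# boot the root actor in the parent-most process, run user code
# "under" the runtime, and let the `@acm` tear it down on exit.
import trio
import tractor

async def main() -> None:
    async with tractor.open_root_actor(
        name='root',
        debug_mode=False,
    ):
        # user code runs under the root runtime here
        await trio.sleep(0.1)

if __name__ == '__main__':
    trio.run(main)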
def run_daemon(
@ -462,7 +604,7 @@ def run_daemon(
# runtime kwargs
name: str | None = 'root',
registry_addrs: list[tuple[str, int]] = _default_lo_addrs,
registry_addrs: list[UnwrappedAddress]|None = None,
start_method: str | None = None,
debug_mode: bool = False,

View File

@ -37,12 +37,13 @@ import warnings
import trio
from trio import (
Cancelled,
CancelScope,
Nursery,
TaskStatus,
)
from ._ipc import Channel
from .ipc import Channel
from ._context import (
Context,
)
@ -52,13 +53,18 @@ from ._exceptions import (
ModuleNotExposed,
MsgTypeError,
TransportClosed,
is_multi_cancelled,
pack_error,
unpack_error,
)
from .trionics import (
collapse_eg,
is_multi_cancelled,
maybe_raise_from_masking_exc,
)
from .devx import (
_debug,
debug,
add_div,
pformat as _pformat,
)
from . import _state
from .log import get_logger
@ -67,7 +73,7 @@ from .msg import (
MsgCodec,
PayloadT,
NamespacePath,
# pretty_struct,
pretty_struct,
_ops as msgops,
)
from tractor.msg.types import (
@ -215,11 +221,18 @@ async def _invoke_non_context(
task_status.started(ctx)
result = await coro
fname: str = func.__name__
op_nested_task: str = _pformat.nest_from_op(
input_op=f')> cid: {ctx.cid!r}',
text=f'{ctx._task}',
nest_indent=1, # under >
)
log.runtime(
'RPC complete:\n'
f'task: {ctx._task}\n'
f'|_cid={ctx.cid}\n'
f'|_{fname}() -> {pformat(result)}\n'
f'RPC task complete\n'
f'\n'
f'{op_nested_task}\n'
f'\n'
f')> {fname}() -> {pformat(result)}\n'
)
# NOTE: only send result if we know IPC isn't down
@ -250,7 +263,7 @@ async def _errors_relayed_via_ipc(
ctx: Context,
is_rpc: bool,
hide_tb: bool = False,
hide_tb: bool = True,
debug_kbis: bool = False,
task_status: TaskStatus[
Context | BaseException
@ -266,7 +279,7 @@ async def _errors_relayed_via_ipc(
# TODO: a debug nursery when in debug mode!
# async with maybe_open_debugger_nursery() as debug_tn:
# => see matching comment in side `._debug._pause()`
# => see matching comment inside `.debug._pause()`
rpc_err: BaseException|None = None
try:
yield # run RPC invoke body
@ -318,7 +331,7 @@ async def _errors_relayed_via_ipc(
'RPC task crashed, attempting to enter debugger\n'
f'|_{ctx}'
)
entered_debug = await _debug._maybe_enter_pm(
entered_debug = await debug._maybe_enter_pm(
err,
api_frame=inspect.currentframe(),
)
@ -371,13 +384,13 @@ async def _errors_relayed_via_ipc(
# RPC task bookkeeping.
# since RPC tasks are scheduled inside a flat
# `Actor._service_n`, we add "handles" to each such that
# `Actor._service_tn`, we add "handles" to each such that
# they can be individually cancelled.
finally:
# if the error is not from user code and instead a failure
# of a runtime RPC or transport failure we do prolly want to
# show this frame
# if the error is not from user code and instead a failure of
# an internal-runtime-RPC or IPC-connection, we do (prolly) want
# to show this frame!
if (
rpc_err
and (
@ -449,7 +462,7 @@ async def _invoke(
connected IPC channel.
This is the core "RPC" `trio.Task` scheduling machinery used to start every
remotely invoked function, normally in `Actor._service_n: Nursery`.
remotely invoked function, normally in `Actor._service_tn: Nursery`.
'''
__tracebackhide__: bool = hide_tb
@ -462,7 +475,7 @@ async def _invoke(
):
# XXX for `.pause_from_sync()` usage we need to make sure
# `greenback` is bootstrapped in the subactor!
await _debug.maybe_init_greenback()
await debug.maybe_init_greenback()
# TODO: possibly a specially formatted traceback
# (not sure what typing is for this..)?
@ -616,32 +629,40 @@ async def _invoke(
# -> the below scope is never exposed to the
# `@context` marked RPC function.
# - `._portal` is never set.
scope_err: BaseException|None = None
try:
tn: trio.Nursery
# TODO: better `trionics` primitive/tooling usage here!
# -[ ] sure would be nice to have our `TaskMngr`
# nursery here!
# -[ ] payload value checking like we do with
# `.started()` such that the debugger can engage
# here in the child task instead of waiting for the
# parent to crash with its own MTE..
#
tn: Nursery
rpc_ctx_cs: CancelScope
async with (
trio.open_nursery(
strict_exception_groups=False,
# ^XXX^ TODO? instead unpack any RAE as per "loose" style?
) as tn,
collapse_eg(hide_tb=False),
trio.open_nursery() as tn,
msgops.maybe_limit_plds(
ctx=ctx,
spec=ctx_meta.get('pld_spec'),
dec_hook=ctx_meta.get('dec_hook'),
),
# XXX NOTE, this being the "most embedded"
# scope ensures unmasking of the `await coro` below
# *should* never be interfered with!!
maybe_raise_from_masking_exc(
tn=tn,
unmask_from=Cancelled,
) as _mbme, # maybe boxed masked exc
):
ctx._scope_nursery = tn
rpc_ctx_cs = ctx._scope = tn.cancel_scope
task_status.started(ctx)
# TODO: better `trionics` tooling:
# -[ ] should would be nice to have our `TaskMngr`
# nursery here!
# -[ ] payload value checking like we do with
# `.started()` such that the debbuger can engage
# here in the child task instead of waiting for the
# parent to crash with it's own MTE..
# invoke user endpoint fn.
res: Any|PayloadT = await coro
return_msg: Return|CancelAck = return_msg_type(
cid=cid,
@ -651,7 +672,8 @@ async def _invoke(
ctx._result = res
log.runtime(
f'Sending result msg and exiting {ctx.side!r}\n'
f'{return_msg}\n'
f'\n'
f'{pretty_struct.pformat(return_msg)}\n'
)
await chan.send(return_msg)
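
As an aside, the `@acm`-stacking shown above can be reproduced in user code; a minimal sketch assuming `collapse_eg()` and `maybe_raise_from_masking_exc()` keep the call signatures used in this diff.

    import trio
    from tractor.trionics import (
        collapse_eg,
        maybe_raise_from_masking_exc,
    )

    async def run_unmasked(async_fn) -> None:
        async with (
            # stand-in for the removed `strict_exception_groups=False` usage
            collapse_eg(),
            trio.open_nursery() as tn,
            # unwrap any "real" error masked by a `trio.Cancelled`
            maybe_raise_from_masking_exc(
                tn=tn,
                unmask_from=trio.Cancelled,
            ),
        ):
            await async_fn()
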
@ -743,43 +765,52 @@ async def _invoke(
BaseExceptionGroup,
BaseException,
trio.Cancelled,
) as scope_error:
) as _scope_err:
scope_err = _scope_err
if (
isinstance(scope_error, RuntimeError)
and scope_error.args
and 'Cancel scope stack corrupted' in scope_error.args[0]
isinstance(scope_err, RuntimeError)
and
scope_err.args
and
'Cancel scope stack corrupted' in scope_err.args[0]
):
log.exception('Cancel scope stack corrupted!?\n')
# _debug.mk_pdb().set_trace()
# debug.mk_pdb().set_trace()
# always set this (child) side's exception as the
# local error on the context
ctx._local_error: BaseException = scope_error
ctx._local_error: BaseException = scope_err
# ^-TODO-^ question,
# does this matter other than for
# consistency/testing?
# |_ no user code should be in this scope at this point
# AND we already set this in the block below?
# if a remote error was set then likely the
# exception group was raised due to that, so
# XXX if a remote error was set then likely the
# exc group was raised due to that, so
# we instead raise that error immediately!
ctx.maybe_raise()
maybe_re: (
ContextCancelled|RemoteActorError
) = ctx.maybe_raise()
if maybe_re:
log.cancel(
f'Suppressing remote-exc from peer,\n'
f'{maybe_re!r}\n'
)
# maybe TODO: pack in some kinda
# `trio.Cancelled.__traceback__` here so they can be
# unwrapped and displayed on the caller side? no se..
raise
raise scope_err
# `@context` entrypoint task bookkeeping.
# i.e. only pop the context tracking if used ;)
finally:
assert chan.uid
assert chan.aid
# don't pop the local context until we know the
# associated child isn't in debug any more
await _debug.maybe_wait_for_debugger()
await debug.maybe_wait_for_debugger()
ctx: Context = actor._contexts.pop((
chan.uid,
cid,
@ -792,26 +823,49 @@ async def _invoke(
f'after having {ctx.repr_state!r}\n'
)
if merr:
logmeth: Callable = log.error
if isinstance(merr, ContextCancelled):
logmeth: Callable = log.runtime
if (
# ctxc: by `Context.cancel()`
isinstance(merr, ContextCancelled)
if not isinstance(merr, RemoteActorError):
tb_str: str = ''.join(traceback.format_exception(merr))
# out-of-layer cancellation, one of:
# - actorc: by `Portal.cancel_actor()`
# - OSc: by SIGINT or `Process.signal()`
or (
isinstance(merr, trio.Cancelled)
and
ctx.canceller
)
):
logmeth: Callable = log.cancel
descr_str += (
f' with {merr!r}\n'
)
elif (
not isinstance(merr, RemoteActorError)
):
tb_str: str = ''.join(
traceback.format_exception(merr)
)
descr_str += (
f'\n{merr!r}\n' # needed?
f'{tb_str}\n'
)
else:
descr_str += f'\n{merr!r}\n'
descr_str += (
f'{merr!r}\n'
)
else:
descr_str += f'\nand final result {ctx.outcome!r}\n'
descr_str += (
f'\n'
f'with final result {ctx.outcome!r}\n'
)
logmeth(
message
+
descr_str
f'{message}\n'
f'\n'
f'{descr_str}\n'
)
@ -869,7 +923,6 @@ async def try_ship_error_to_remote(
async def process_messages(
actor: Actor,
chan: Channel,
shield: bool = False,
task_status: TaskStatus[CancelScope] = trio.TASK_STATUS_IGNORED,
@ -883,7 +936,7 @@ async def process_messages(
Receive (multiplexed) per-`Channel` RPC requests as msgs from
remote processes; schedule target async funcs as local
`trio.Task`s inside the `Actor._service_n: Nursery`.
`trio.Task`s inside the `Actor._service_tn: Nursery`.
Depending on msg type, non-`cmd` (task spawning/starting)
request payloads (eg. `started`, `yield`, `return`, `error`)
@ -907,7 +960,8 @@ async def process_messages(
(as utilized inside `Portal.cancel_actor()` ).
'''
assert actor._service_n # runtime state sanity
actor: Actor = _state.current_actor()
assert actor._service_tn # runtime state sanity
# TODO: once `trio` get's an "obvious way" for req/resp we
# should use it?
@ -978,12 +1032,10 @@ async def process_messages(
cid=cid,
kwargs=kwargs,
):
kwargs |= {'req_chan': chan}
# XXX NOTE XXX don't start entire actor
# runtime cancellation if this actor is
# currently in debug mode!
pdb_complete: trio.Event|None = _debug.DebugStatus.repl_release
pdb_complete: trio.Event|None = debug.DebugStatus.repl_release
if pdb_complete:
await pdb_complete.wait()
@ -998,14 +1050,14 @@ async def process_messages(
cid,
chan,
actor.cancel,
kwargs,
kwargs | {'req_chan': chan},
is_rpc=False,
return_msg_type=CancelAck,
)
log.runtime(
'Cancelling IPC transport msg-loop with peer:\n'
f'|_{chan}\n'
'Cancelling RPC-msg-loop with peer\n'
f'->c}} {chan.aid.reprol()}@[{chan.maddr}]\n'
)
loop_cs.cancel()
break
@ -1018,7 +1070,7 @@ async def process_messages(
):
target_cid: str = kwargs['cid']
kwargs |= {
'requesting_uid': chan.uid,
'requesting_aid': chan.aid,
'ipc_msg': msg,
# XXX NOTE! ONLY the rpc-task-owning
@ -1054,21 +1106,34 @@ async def process_messages(
ns=ns,
func=funcname,
kwargs=kwargs, # type-spec this? see `msg.types`
uid=actorid,
uid=actor_uuid,
):
if actor_uuid != chan.aid.uid:
raise RuntimeError(
f'IPC <Start> msg <-> chan.aid mismatch!?\n'
f'Channel.aid = {chan.aid!r}\n'
f'Start.uid = {actor_uuid!r}\n'
)
# await debug.pause()
op_repr: str = 'Start <=) '
req_repr: str = _pformat.nest_from_op(
input_op=op_repr,
op_suffix='',
nest_prefix='',
text=f'{chan}',
nest_indent=len(op_repr)-1,
rm_from_first_ln='<',
# ^XXX, subtract -1 to account for
# <Channel
# ^_chevron to be stripped
)
start_status: str = (
'Handling RPC `Start` request\n'
f'<= peer: {actorid}\n\n'
f' |_{chan}\n'
f' |_cid: {cid}\n\n'
# f' |_{ns}.{funcname}({kwargs})\n'
f'>> {actor.uid}\n'
f' |_{actor}\n'
f' -> nsp: `{ns}.{funcname}({kwargs})`\n'
# f' |_{ns}.{funcname}({kwargs})\n\n'
# f'{pretty_struct.pformat(msg)}\n'
'Handling RPC request\n'
f'{req_repr}\n'
f'\n'
f'->{{ ipc-context-id: {cid!r}\n'
f'->{{ nsp for fn: `{ns}.{funcname}({kwargs})`\n'
)
# runtime-internal endpoint: `Actor.<funcname>`
@ -1097,10 +1162,6 @@ async def process_messages(
await chan.send(err_msg)
continue
start_status += (
f' -> func: {func}\n'
)
# schedule a task for the requested RPC function
# in the actor's main "service nursery".
#
@ -1108,10 +1169,10 @@ async def process_messages(
# supervision isolation? would avoid having to
# manage RPC tasks individually in `._rpc_tasks`
# table?
start_status += ' -> scheduling new task..\n'
start_status += '->( scheduling new task..\n'
log.runtime(start_status)
try:
ctx: Context = await actor._service_n.start(
ctx: Context = await actor._service_tn.start(
partial(
_invoke,
actor,
@ -1156,7 +1217,7 @@ async def process_messages(
trio.Event(),
)
# runtime-scoped remote (internal) error
# XXX RUNTIME-SCOPED! remote (likely internal) error
# (^- bc no `Error.cid` -^)
#
# NOTE: this is the non-rpc error case, that
@ -1192,12 +1253,24 @@ async def process_messages(
# END-OF `async for`:
# IPC disconnected via `trio.EndOfChannel`, likely
# due to a (graceful) `Channel.aclose()`.
chan_op_repr: str = '<=x] '
chan_repr: str = _pformat.nest_from_op(
input_op=chan_op_repr,
op_suffix='',
nest_prefix='',
text=chan.pformat(),
nest_indent=len(chan_op_repr)-1,
rm_from_first_ln='<',
)
log.runtime(
f'channel for {chan.uid} disconnected, cancelling RPC tasks\n'
f'|_{chan}\n'
f'IPC channel disconnected\n'
f'{chan_repr}\n'
f'\n'
f'->c) cancelling RPC tasks.\n'
)
await actor.cancel_rpc_tasks(
req_uid=actor.uid,
req_aid=actor.aid,
# a "self cancel" in terms of the lifetime of the
# IPC connection which is presumed to be the
# source of any requests for spawned tasks.
@ -1219,8 +1292,10 @@ async def process_messages(
# -[ ] figure out how this will break with other transports?
tc.report_n_maybe_raise(
message=(
f'peer IPC channel closed abruptly?\n\n'
f'<=x {chan}\n'
f'peer IPC channel closed abruptly?\n'
f'\n'
f'<=x[\n'
f' {chan}\n'
f' |_{chan.raddr}\n\n'
)
+
@ -1237,7 +1312,7 @@ async def process_messages(
) as err:
if nursery_cancelled_before_task:
sn: Nursery = actor._service_n
sn: Nursery = actor._service_tn
assert sn and sn.cancel_scope.cancel_called # sanity
log.cancel(
f'Service nursery cancelled before it handled {funcname}'
@ -1267,13 +1342,37 @@ async def process_messages(
finally:
# msg debugging for when the machinery is brokey
if msg is None:
message: str = 'Exiting IPC msg loop without receiving a msg?'
message: str = 'Exiting RPC-loop without receiving a msg?'
else:
task_op_repr: str = ')>'
task: trio.Task = trio.lowlevel.current_task()
# maybe add cancelled opt prefix
if task._cancel_status.effectively_cancelled:
task_op_repr = 'c' + task_op_repr
task_repr: str = _pformat.nest_from_op(
input_op=task_op_repr,
text=f'{task!r}',
nest_indent=1,
)
# chan_op_repr: str = '<=} '
# chan_repr: str = _pformat.nest_from_op(
# input_op=chan_op_repr,
# op_suffix='',
# nest_prefix='',
# text=chan.pformat(),
# nest_indent=len(chan_op_repr)-1,
# rm_from_first_ln='<',
# )
message: str = (
'Exiting IPC msg loop with final msg\n\n'
f'<= peer: {chan.uid}\n'
f' |_{chan}\n\n'
# f'{pretty_struct.pformat(msg)}'
f'Exiting RPC-loop with final msg\n'
f'\n'
# f'{chan_repr}\n'
f'{task_repr}\n'
f'\n'
f'{pretty_struct.pformat(msg)}'
f'\n'
)
log.runtime(message)

File diff suppressed because it is too large Load Diff

View File

@ -34,9 +34,9 @@ from typing import (
import trio
from trio import TaskStatus
from .devx._debug import (
maybe_wait_for_debugger,
acquire_debug_lock,
from .devx import (
debug,
pformat as _pformat
)
from tractor._state import (
current_actor,
@ -46,19 +46,26 @@ from tractor._state import (
_runtime_vars,
)
from tractor.log import get_logger
from tractor._addr import UnwrappedAddress
from tractor._portal import Portal
from tractor._runtime import Actor
from tractor._entry import _mp_main
from tractor._exceptions import ActorFailure
from tractor.msg.types import (
SpawnSpec,
from tractor.msg import (
types as msgtypes,
pretty_struct,
)
if TYPE_CHECKING:
from ipc import (
_server,
Channel,
)
from ._supervise import ActorNursery
ProcessType = TypeVar('ProcessType', mp.Process, trio.Process)
log = get_logger('tractor')
# placeholder for an mp start context if so using that backend
@ -163,7 +170,7 @@ async def exhaust_portal(
# TODO: merge with above?
log.warning(
'Cancelled portal result waiter task:\n'
f'uid: {portal.channel.uid}\n'
f'uid: {portal.channel.aid}\n'
f'error: {err}\n'
)
return err
@ -171,7 +178,7 @@ async def exhaust_portal(
else:
log.debug(
f'Returning final result from portal:\n'
f'uid: {portal.channel.uid}\n'
f'uid: {portal.channel.aid}\n'
f'result: {final}\n'
)
return final
@ -229,10 +236,6 @@ async def hard_kill(
# whilst also hacking on it XD
# terminate_after: int = 99999,
# NOTE: for mucking with `.pause()`-ing inside the runtime
# whilst also hacking on it XD
# terminate_after: int = 99999,
) -> None:
'''
Un-gracefully terminate an OS level `trio.Process` after timeout.
@ -294,6 +297,23 @@ async def hard_kill(
# zombies (as a feature) we ask the OS to send in the
# removal squad as the last resort.
if cs.cancelled_caught:
# TODO? attempt at intermediary-rent-sub
# with child in debug lock?
# |_https://github.com/goodboy/tractor/issues/320
#
# if not is_root_process():
# log.warning(
# 'Attempting to acquire debug-REPL-lock before zombie reap!'
# )
# with trio.CancelScope(shield=True):
# async with debug.acquire_debug_lock(
# subactor_uid=current_actor().uid,
# ) as _ctx:
# log.warning(
# 'Acquired debug lock, child ready to be killed ??\n'
# )
# TODO: toss in the skynet-logo face as ascii art?
log.critical(
# 'Well, the #ZOMBIE_LORD_IS_HERE# to collect\n'
@ -324,20 +344,21 @@ async def soft_kill(
see `.hard_kill()`).
'''
uid: tuple[str, str] = portal.channel.uid
chan: Channel = portal.channel
peer_aid: msgtypes.Aid = chan.aid
try:
log.cancel(
f'Soft killing sub-actor via portal request\n'
f'\n'
f'(c=> {portal.chan.uid}\n'
f' |_{proc}\n'
f'c)=> {peer_aid.reprol()}@[{chan.maddr}]\n'
f' |_{proc}\n'
)
# wait on sub-proc to signal termination
await wait_func(proc)
except trio.Cancelled:
with trio.CancelScope(shield=True):
await maybe_wait_for_debugger(
await debug.maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
@ -378,7 +399,7 @@ async def soft_kill(
if proc.poll() is None: # type: ignore
log.warning(
'Subactor still alive after cancel request?\n\n'
f'uid: {uid}\n'
f'uid: {peer_aid}\n'
f'|_{proc}\n'
)
n.cancel_scope.cancel()
@ -392,14 +413,15 @@ async def new_proc(
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, any] = {}
) -> None:
@ -419,6 +441,7 @@ async def new_proc(
_runtime_vars, # run time vars
infect_asyncio=infect_asyncio,
task_status=task_status,
proc_kwargs=proc_kwargs
)
@ -429,12 +452,13 @@ async def trio_proc(
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, any] = {}
) -> None:
'''
@ -456,6 +480,9 @@ async def trio_proc(
# the OS; it otherwise can be passed via the parent channel if
# we prefer in the future (for privacy).
"--uid",
# TODO, how to pass this over "wire" encodings like
# cmdline args?
# -[ ] maybe we can add an `msgtypes.Aid.min_tuple()` ?
str(subactor.uid),
# Address the child must connect to on startup
"--parent_addr",
@ -473,18 +500,20 @@ async def trio_proc(
cancelled_during_spawn: bool = False
proc: trio.Process|None = None
ipc_server: _server.Server = actor_nursery._actor.ipc_server
try:
try:
proc: trio.Process = await trio.lowlevel.open_process(spawn_cmd)
proc: trio.Process = await trio.lowlevel.open_process(spawn_cmd, **proc_kwargs)
log.runtime(
'Started new child\n'
f'|_{proc}\n'
f'Started new child subproc\n'
f'(>\n'
f' |_{proc}\n'
)
# wait for actor to spawn and connect back to us
# channel should have handshake completed by the
# local actor by the time we get a ref to it
event, chan = await actor_nursery._actor.wait_for_peer(
event, chan = await ipc_server.wait_for_peer(
subactor.uid
)
@ -496,10 +525,10 @@ async def trio_proc(
with trio.CancelScope(shield=True):
# don't clobber an ongoing pdb
if is_root_process():
await maybe_wait_for_debugger()
await debug.maybe_wait_for_debugger()
elif proc is not None:
async with acquire_debug_lock(subactor.uid):
async with debug.acquire_debug_lock(subactor.uid):
# soft wait on the proc to terminate
with trio.move_on_after(0.5):
await proc.wait()
@ -517,15 +546,20 @@ async def trio_proc(
# send a "spawning specification" which configures the
# initial runtime state of the child.
await chan.send(
SpawnSpec(
_parent_main_data=subactor._parent_main_data,
enable_modules=subactor.enable_modules,
reg_addrs=subactor.reg_addrs,
bind_addrs=bind_addrs,
_runtime_vars=_runtime_vars,
)
sspec = msgtypes.SpawnSpec(
_parent_main_data=subactor._parent_main_data,
enable_modules=subactor.enable_modules,
reg_addrs=subactor.reg_addrs,
bind_addrs=bind_addrs,
_runtime_vars=_runtime_vars,
)
log.runtime(
f'Sending spawn spec to child\n'
f'{{}}=> {chan.aid.reprol()!r}\n'
f'\n'
f'{pretty_struct.pformat(sspec)}\n'
)
await chan.send(sspec)
# track subactor in current nursery
curr_actor: Actor = current_actor()
@ -552,7 +586,7 @@ async def trio_proc(
# condition.
await soft_kill(
proc,
trio.Process.wait,
trio.Process.wait, # XXX, uses `pidfd_open()` below.
portal
)
@ -560,8 +594,7 @@ async def trio_proc(
# tandem if not done already
log.cancel(
'Cancelling portal result reaper task\n'
f'>c)\n'
f' |_{subactor.uid}\n'
f'c)> {subactor.aid.reprol()!r}\n'
)
nursery.cancel_scope.cancel()
@ -570,21 +603,24 @@ async def trio_proc(
# allowed! Do this **after** cancellation/teardown to avoid
# killing the process too early.
if proc:
reap_repr: str = _pformat.nest_from_op(
input_op='>x)',
text=subactor.pformat(),
)
log.cancel(
f'Hard reap sequence starting for subactor\n'
f'>x)\n'
f' |_{subactor}@{subactor.uid}\n'
f'{reap_repr}'
)
with trio.CancelScope(shield=True):
# don't clobber an ongoing pdb
if cancelled_during_spawn:
# Try again to avoid TTY clobbering.
async with acquire_debug_lock(subactor.uid):
async with debug.acquire_debug_lock(subactor.uid):
with trio.move_on_after(0.5):
await proc.wait()
await maybe_wait_for_debugger(
await debug.maybe_wait_for_debugger(
child_in_debug=_runtime_vars.get(
'_debug_mode', False
),
@ -613,7 +649,7 @@ async def trio_proc(
# acquire the lock and get notified of who has it,
# check that uid against our known children?
# this_uid: tuple[str, str] = current_actor().uid
# await acquire_debug_lock(this_uid)
# await debug.acquire_debug_lock(this_uid)
if proc.poll() is None:
log.cancel(f"Attempting to hard kill {proc}")
@ -635,12 +671,13 @@ async def mp_proc(
subactor: Actor,
errors: dict[tuple[str, str], Exception],
# passed through to actor main
bind_addrs: list[tuple[str, int]],
parent_addr: tuple[str, int],
bind_addrs: list[UnwrappedAddress],
parent_addr: UnwrappedAddress,
_runtime_vars: dict[str, Any], # serialized and sent to _child
*,
infect_asyncio: bool = False,
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED
task_status: TaskStatus[Portal] = trio.TASK_STATUS_IGNORED,
proc_kwargs: dict[str, any] = {}
) -> None:
@ -715,12 +752,14 @@ async def mp_proc(
log.runtime(f"Started {proc}")
ipc_server: _server.Server = actor_nursery._actor.ipc_server
try:
# wait for actor to spawn and connect back to us
# channel should have handshake completed by the
# local actor by the time we get a ref to it
event, chan = await actor_nursery._actor.wait_for_peer(
subactor.uid)
event, chan = await ipc_server.wait_for_peer(
subactor.uid,
)
# XXX: monkey patch poll API to match the ``subprocess`` API..
# not sure why they don't expose this but kk.

View File

@ -14,16 +14,19 @@
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Per process state
'''
Per actor-process runtime state mgmt APIs.
"""
'''
from __future__ import annotations
from contextvars import (
ContextVar,
)
import os
from pathlib import Path
from typing import (
Any,
Literal,
TYPE_CHECKING,
)
@ -34,20 +37,39 @@ if TYPE_CHECKING:
from ._context import Context
# default IPC transport protocol settings
TransportProtocolKey = Literal[
'tcp',
'uds',
]
_def_tpt_proto: TransportProtocolKey = 'tcp'
_current_actor: Actor|None = None # type: ignore # noqa
_last_actor_terminated: Actor|None = None
# TODO: mk this a `msgspec.Struct`!
# -[ ] type out all fields obvi!
# -[ ] (eventually) mk wire-ready for monitoring?
_runtime_vars: dict[str, Any] = {
'_debug_mode': False,
'_is_root': False,
'_root_mailbox': (None, None),
# root of actor-process tree info
'_is_root': False, # bool
'_root_mailbox': (None, None), # tuple[str|None, str|None]
'_root_addrs': [], # list[UnwrappedAddress]
# parent->child ipc protocol caps
'_enable_tpts': [_def_tpt_proto],
# registrar info
'_registry_addrs': [],
'_is_infected_aio': False,
# `debug_mode: bool` settings
'_debug_mode': False, # bool
'repl_fixture': False, # |AbstractContextManager[bool]
# for `tractor.pause_from_sync()` & `breakpoint()` support
'use_greenback': False,
# infected-`asyncio`-mode: `trio` running as guest.
'_is_infected_aio': False,
}
@ -99,7 +121,7 @@ def current_actor(
return _current_actor
def is_main_process() -> bool:
def is_root_process() -> bool:
'''
Bool determining if this actor is running in the top-most process.
@ -108,8 +130,10 @@ def is_main_process() -> bool:
return mp.current_process().name == 'MainProcess'
# TODO, more verby name?
def debug_mode() -> bool:
is_main_process = is_root_process
def is_debug_mode() -> bool:
'''
Bool determining if "debug mode" is on which enables
remote subactor pdb entry on crashes.
@ -118,6 +142,9 @@ def debug_mode() -> bool:
return bool(_runtime_vars['_debug_mode'])
debug_mode = is_debug_mode
def is_root_process() -> bool:
return _runtime_vars['_is_root']
@ -143,3 +170,34 @@ def current_ipc_ctx(
f'|_{current_task()}\n'
)
return ctx
# std XDG (mutable) app state location
_rtdir: Path = Path(os.environ['XDG_RUNTIME_DIR'])
def get_rt_dir(
subdir: str = 'tractor'
) -> Path:
'''
Return the user "runtime dir" where most userspace apps stick
their IPC and cache related system util-files; we take hold
of a `'XDG_RUNTIME_DIR'/tractor/` subdir by default.
'''
rtdir: Path = _rtdir / subdir
if not rtdir.is_dir():
rtdir.mkdir()
return rtdir
def current_ipc_protos() -> list[str]:
'''
Return the list of IPC transport protocol keys currently
in use by this actor.
The keys are as declared by `MsgTransport` and `Address`
concrete-backend sub-types defined throughout `tractor.ipc`.
'''
return _runtime_vars['_enable_tpts']
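
A hedged usage sketch of the new module-level helpers above; the values noted are the defaults implied by this diff.

    from tractor import _state

    # requires `XDG_RUNTIME_DIR` to be set in the env
    rt_dir = _state.get_rt_dir()            # -> $XDG_RUNTIME_DIR/tractor
    protos = _state.current_ipc_protos()    # -> ['tcp'] unless overridden
    assert isinstance(_state.is_debug_mode(), bool)
    assert _state.is_root_process() in (True, False)
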

View File

@ -56,7 +56,7 @@ from tractor.msg import (
if TYPE_CHECKING:
from ._runtime import Actor
from ._context import Context
from ._ipc import Channel
from .ipc import Channel
log = get_logger(__name__)
@ -426,8 +426,8 @@ class MsgStream(trio.abc.Channel):
self._closed = re
# if caught_eoc:
# # from .devx import _debug
# # await _debug.pause()
# # from .devx import debug
# # await debug.pause()
# with trio.CancelScope(shield=True):
# await rx_chan.aclose()
@ -437,22 +437,23 @@ class MsgStream(trio.abc.Channel):
message: str = (
f'Stream self-closed by {this_side!r}-side before EoC from {peer_side!r}\n'
# } bc a stream is a "scope"/msging-phase inside an IPC
f'x}}>\n'
f'c}}>\n'
f' |_{self}\n'
)
log.cancel(message)
self._eoc = trio.EndOfChannel(message)
if (
(rx_chan := self._rx_chan)
and
(stats := rx_chan.statistics()).tasks_waiting_receive
):
log.cancel(
f'Msg-stream is closing but there is still reader tasks,\n'
message += (
f'AND there are still reader tasks,\n'
f'\n'
f'{stats}\n'
)
log.cancel(message)
self._eoc = trio.EndOfChannel(message)
# ?XXX WAIT, why do we not close the local mem chan `._rx_chan` XXX?
# => NO, DEFINITELY NOT! <=
# if we're a bi-dir `MsgStream` BECAUSE this same
@ -595,8 +596,17 @@ class MsgStream(trio.abc.Channel):
trio.ClosedResourceError,
trio.BrokenResourceError,
BrokenPipeError,
) as trans_err:
if hide_tb:
) as _trans_err:
trans_err = _trans_err
if (
hide_tb
and
self._ctx.chan._exc is trans_err
# ^XXX, IOW, only if the channel is marked errored
# for the same reason as whatever its underlying
# transport raised, do we keep the full low-level tb
# suppressed from the user.
):
raise type(trans_err)(
*trans_err.args
) from trans_err
@ -802,13 +812,12 @@ async def open_stream_from_ctx(
# sanity, can remove?
assert eoc is stream._eoc
log.warning(
log.runtime(
'Stream was terminated by EoC\n\n'
# NOTE: won't show the error <Type> but
# does show txt followed by IPC msg.
f'{str(eoc)}\n'
)
finally:
if ctx._portal:
try:

View File

@ -21,34 +21,49 @@
from contextlib import asynccontextmanager as acm
from functools import partial
import inspect
from pprint import pformat
from typing import TYPE_CHECKING
from typing import (
TYPE_CHECKING,
)
import typing
import warnings
import trio
from .devx._debug import maybe_wait_for_debugger
from .devx import (
debug,
pformat as _pformat,
)
from ._addr import (
UnwrappedAddress,
mk_uuid,
)
from ._state import current_actor, is_main_process
from .log import get_logger, get_loglevel
from ._runtime import Actor
from ._portal import Portal
from ._exceptions import (
from .trionics import (
is_multi_cancelled,
collapse_eg,
)
from ._exceptions import (
ContextCancelled,
)
from ._root import open_root_actor
from ._root import (
open_root_actor,
)
from . import _state
from . import _spawn
if TYPE_CHECKING:
import multiprocessing as mp
# from .ipc._server import IPCServer
from .ipc import IPCServer
log = get_logger(__name__)
_default_bind_addr: tuple[str, int] = ('127.0.0.1', 0)
class ActorNursery:
'''
@ -102,7 +117,6 @@ class ActorNursery:
]
] = {}
self.cancelled: bool = False
self._join_procs = trio.Event()
self._at_least_one_child_in_debug: bool = False
self.errors = errors
@ -120,18 +134,62 @@ class ActorNursery:
# TODO: remove the `.run_in_actor()` API and thus this 2ndary
# nursery when that API get's moved outside this primitive!
self._ria_nursery = ria_nursery
# TODO, factor this into a .hilevel api!
#
# portals spawned with ``run_in_actor()`` are
# cancelled when their "main" result arrives
self._cancel_after_result_on_exit: set = set()
# trio.Nursery-like cancel (request) statuses
self._cancelled_caught: bool = False
self._cancel_called: bool = False
@property
def cancel_called(self) -> bool:
'''
Records whether cancellation has been requested for this
actor-nursery by a call to `.cancel()` either due to,
- an explicit call by some actor-local-task,
- an implicit call due to an error/cancel emitted inside
the `tractor.open_nursery()` block.
'''
return self._cancel_called
@property
def cancelled_caught(self) -> bool:
'''
Set when this nursery was able to cancel all spawned subactors
gracefully via an (implicit) call to `.cancel()`.
'''
return self._cancelled_caught
# TODO! remove internal/test-suite usage!
@property
def cancelled(self) -> bool:
warnings.warn(
"`ActorNursery.cancelled` is now deprecated, use "
" `.cancel_called` instead.",
DeprecationWarning,
stacklevel=2,
)
return (
self._cancel_called
# and
# self._cancelled_caught
)
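
A short, illustrative sketch of reading the new cancel-status flags after the nursery block exits; the child actor and explicit cancel call are for demonstration only.

    import tractor
    import trio

    async def main() -> None:
        async with tractor.open_nursery() as an:
            await an.start_actor(
                'sleeper',
                enable_modules=[__name__],
            )
            await an.cancel()   # explicit group-cancel request

        assert an.cancel_called
        # only set once the graceful cancel completed (no hard kills)
        print(f'gracefully cancelled: {an.cancelled_caught}')

    trio.run(main)
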
async def start_actor(
self,
name: str,
*,
bind_addrs: list[tuple[str, int]] = [_default_bind_addr],
bind_addrs: list[UnwrappedAddress]|None = None,
rpc_module_paths: list[str]|None = None,
enable_transports: list[str] = [_state._def_tpt_proto],
enable_modules: list[str]|None = None,
loglevel: str|None = None, # set log level per subactor
debug_mode: bool|None = None,
@ -141,6 +199,7 @@ class ActorNursery:
# a `._ria_nursery` since the dependent APIs have been
# removed!
nursery: trio.Nursery|None = None,
proc_kwargs: dict[str, any] = {}
) -> Portal:
'''
@ -177,15 +236,17 @@ class ActorNursery:
enable_modules.extend(rpc_module_paths)
subactor = Actor(
name,
name=name,
uuid=mk_uuid(),
# modules allowed to invoked funcs from
enable_modules=enable_modules,
loglevel=loglevel,
# verbatim relay this actor's registrar addresses
registry_addrs=current_actor().reg_addrs,
registry_addrs=current_actor().registry_addrs,
)
parent_addr = self._actor.accept_addr
parent_addr: UnwrappedAddress = self._actor.accept_addr
assert parent_addr
# start a task to spawn a process
@ -204,6 +265,7 @@ class ActorNursery:
parent_addr,
_rtv, # run time vars
infect_asyncio=infect_asyncio,
proc_kwargs=proc_kwargs
)
)
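
Since `proc_kwargs` is forwarded verbatim to `trio.lowlevel.open_process()`, per-subactor subprocess options can now be passed from user code; a hedged example.

    import subprocess
    import tractor

    async def spawn_quiet_child() -> None:
        async with tractor.open_nursery() as an:
            await an.start_actor(
                'quiet_child',
                enable_modules=[__name__],
                # forwarded down to `trio.lowlevel.open_process()`
                proc_kwargs={'stderr': subprocess.DEVNULL},
            )
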
@ -222,11 +284,12 @@ class ActorNursery:
*,
name: str | None = None,
bind_addrs: tuple[str, int] = [_default_bind_addr],
bind_addrs: UnwrappedAddress|None = None,
rpc_module_paths: list[str] | None = None,
enable_modules: list[str] | None = None,
loglevel: str | None = None, # set log level per subactor
infect_asyncio: bool = False,
proc_kwargs: dict[str, any] = {},
**kwargs, # explicit args to ``fn``
@ -257,6 +320,7 @@ class ActorNursery:
# use the run_in_actor nursery
nursery=self._ria_nursery,
infect_asyncio=infect_asyncio,
proc_kwargs=proc_kwargs
)
# XXX: don't allow stream funcs
@ -294,15 +358,21 @@ class ActorNursery:
'''
__runtimeframe__: int = 1 # noqa
self.cancelled = True
self._cancel_called = True
# TODO: impl a repr for spawn more compact
# then `._children`..
children: dict = self._children
child_count: int = len(children)
msg: str = f'Cancelling actor nursery with {child_count} children\n'
server: IPCServer = self._actor.ipc_server
with trio.move_on_after(3) as cs:
async with trio.open_nursery() as tn:
async with (
collapse_eg(),
trio.open_nursery() as tn,
):
subactor: Actor
proc: trio.Process
@ -321,7 +391,7 @@ class ActorNursery:
else:
if portal is None: # actor hasn't fully spawned yet
event = self._actor._peer_connected[subactor.uid]
event: trio.Event = server._peer_connected[subactor.uid]
log.warning(
f"{subactor.uid} never 't finished spawning?"
)
@ -337,7 +407,7 @@ class ActorNursery:
if portal is None:
# cancelled while waiting on the event
# to arrive
chan = self._actor._peers[subactor.uid][-1]
chan = server._peers[subactor.uid][-1]
if chan:
portal = Portal(chan)
else: # there's no other choice left
@ -366,6 +436,8 @@ class ActorNursery:
) in children.values():
log.warning(f"Hard killing process {proc}")
proc.terminate()
else:
self._cancelled_caught = True
# mark ourselves as having (tried to have) cancelled all subactors
self._join_procs.set()
@ -395,10 +467,10 @@ async def _open_and_supervise_one_cancels_all_nursery(
# `ActorNursery.start_actor()`).
# errors from this daemon actor nursery bubble up to caller
async with trio.open_nursery(
strict_exception_groups=False,
# ^XXX^ TODO? instead unpack any RAE as per "loose" style?
) as da_nursery:
async with (
collapse_eg(),
trio.open_nursery() as da_nursery,
):
try:
# This is the inner level "run in actor" nursery. It is
# awaited first since actors spawned in this way (using
@ -408,11 +480,10 @@ async def _open_and_supervise_one_cancels_all_nursery(
# immediately raised for handling by a supervisor strategy.
# As such if the strategy propagates any error(s) upwards
# the above "daemon actor" nursery will be notified.
async with trio.open_nursery(
strict_exception_groups=False,
# ^XXX^ TODO? instead unpack any RAE as per "loose" style?
) as ria_nursery:
async with (
collapse_eg(),
trio.open_nursery() as ria_nursery,
):
an = ActorNursery(
actor,
ria_nursery,
@ -429,7 +500,7 @@ async def _open_and_supervise_one_cancels_all_nursery(
# the "hard join phase".
log.runtime(
'Waiting on subactors to complete:\n'
f'{pformat(an._children)}\n'
f'>}} {len(an._children)}\n'
)
an._join_procs.set()
@ -443,7 +514,7 @@ async def _open_and_supervise_one_cancels_all_nursery(
# will make the pdb repl unusable.
# Instead try to wait for pdb to be released before
# tearing down.
await maybe_wait_for_debugger(
await debug.maybe_wait_for_debugger(
child_in_debug=an._at_least_one_child_in_debug
)
@ -519,7 +590,7 @@ async def _open_and_supervise_one_cancels_all_nursery(
# XXX: yet another guard before allowing the cancel
# sequence in case a (single) child is in debug.
await maybe_wait_for_debugger(
await debug.maybe_wait_for_debugger(
child_in_debug=an._at_least_one_child_in_debug
)
@ -568,9 +639,15 @@ async def _open_and_supervise_one_cancels_all_nursery(
# final exit
_shutdown_msg: str = (
'Actor-runtime-shutdown'
)
@acm
# @api_frame
async def open_nursery(
*, # named params only!
hide_tb: bool = True,
**kwargs,
# ^TODO, paramspec for `open_root_actor()`
@ -655,17 +732,26 @@ async def open_nursery(
):
__tracebackhide__: bool = False
msg: str = (
'Actor-nursery exited\n'
f'|_{an}\n'
op_nested_an_repr: str = _pformat.nest_from_op(
input_op=')>',
text=f'{an}',
# nest_prefix='|_',
nest_indent=1, # under >
)
an_msg: str = (
f'Actor-nursery exited\n'
f'{op_nested_an_repr}\n'
)
# keep noise low during std operation.
log.runtime(an_msg)
if implicit_runtime:
# shutdown runtime if it was started and report noisily
# that we did so.
msg += '=> Shutting down actor runtime <=\n'
msg: str = (
'\n'
'\n'
f'{_shutdown_msg} )>\n'
)
log.info(msg)
else:
# keep noise low during std operation.
log.runtime(msg)

View File

@ -26,7 +26,7 @@ import os
import pathlib
import tractor
from tractor.devx._debug import (
from tractor.devx.debug import (
BoxedMaybeException,
)
from .pytest import (
@ -37,6 +37,9 @@ from .fault_simulation import (
)
# TODO, use dulwich for this instead?
# -> we're likely going to need it (or something similar)
# for supporting hot-code reload feats eventually anyway!
def repodir() -> pathlib.Path:
'''
Return the abspath to the repo directory.

View File

@ -0,0 +1,70 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Random IPC addr generation for isolating
the discovery space between test sessions.
Might be eventually useful to expose as a util set from
our `tractor.discovery` subsys?
'''
import random
from typing import (
Type,
)
from tractor import (
_addr,
)
def get_rando_addr(
tpt_proto: str,
*,
# choose random port at import time
_rando_port: int = random.randint(1000, 9999)
) -> tuple[str, str|int]:
'''
Used to globally override the runtime to the
per-test-session-dynamic addr so that all tests never conflict
with any other actor tree using the default.
'''
addr_type: Type[_addr.Address] = _addr._address_types[tpt_proto]
def_reg_addr: tuple[str, int] = _addr._default_lo_addrs[tpt_proto]
# this is the "unwrapped" form expected to be passed to
# `.open_root_actor()` by test body.
testrun_reg_addr: tuple[str, int|str]
match tpt_proto:
case 'tcp':
testrun_reg_addr = (
addr_type.def_bindspace,
_rando_port,
)
# NOTE, file-name uniqueness (no-collisions) will be based on
# the runtime-directory and root (pytest-proc's) pid.
case 'uds':
testrun_reg_addr = addr_type.get_random().unwrap()
# XXX, as a sanity check it should never be the same as the default
# for the host-singleton registry actor.
assert def_reg_addr != testrun_reg_addr
return testrun_reg_addr
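
Illustrative use from a test body, per the docstring above; the concrete port (or UDS file path) is of course random per session.

    from tractor._testing.addr import get_rando_addr

    testrun_reg_addr = get_rando_addr(tpt_proto='tcp')
    # -> e.g. ('127.0.0.1', 4821); pass this as the registry addr
    # to `tractor.open_root_actor()` inside the test body.
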

View File

@ -26,29 +26,46 @@ from functools import (
import inspect
import platform
import pytest
import tractor
import trio
def tractor_test(fn):
'''
Decorator for async test funcs to present them as "native"
looking sync funcs runnable by `pytest` using `trio.run()`.
Decorator for async test fns, wrapping them as "native"-looking
sync funcs runnable by `pytest` and auto invoked with `trio.run()`
(much like the `pytest-trio` plugin's approach).
Use:
Further, the test fn body will be invoked AFTER booting the actor
runtime, i.e. from inside a `tractor.open_root_actor()` block AND
with various runtime and tooling parameters implicitly passed as
requested by the test session's config; see immediately below.
@tractor_test
async def test_whatever():
await ...
Basic deco use:
---------------
If fixtures:
@tractor_test
async def test_whatever():
await ...
- ``reg_addr`` (a socket addr tuple where arbiter is listening)
- ``loglevel`` (logging level passed to tractor internals)
- ``start_method`` (subprocess spawning backend)
are defined in the `pytest` fixture space they will be automatically
injected to tests declaring these funcargs.
Runtime config via special fixtures:
------------------------------------
If any of the following fixtures are requested by the wrapped test
fn (via normal func-args declaration),
- `reg_addr` (a socket addr tuple where arbiter is listening)
- `loglevel` (logging level passed to tractor internals)
- `start_method` (subprocess spawning backend)
(TODO support)
- `tpt_proto` (IPC transport protocol key)
they will be automatically injected into each test as normally
expected as well as passed as funcargs to the initial
`tractor.open_root_actor()` call.
'''
@wraps(fn)
def wrapper(
@ -111,3 +128,164 @@ def tractor_test(fn):
return trio.run(main)
return wrapper
def pytest_addoption(
parser: pytest.Parser,
):
# parser.addoption(
# "--ll",
# action="store",
# dest='loglevel',
# default='ERROR', help="logging level to set when testing"
# )
parser.addoption(
"--spawn-backend",
action="store",
dest='spawn_backend',
default='trio',
help="Processing spawning backend to use for test run",
)
parser.addoption(
"--tpdb",
"--debug-mode",
action="store_true",
dest='tractor_debug_mode',
# default=False,
help=(
'Enable a flag that can be used by tests to set the '
'`debug_mode: bool` for engaging the internal '
'multi-proc debugger sys.'
),
)
# provide which IPC transport protocols opting-in test suites
# should accumulatively run against.
parser.addoption(
"--tpt-proto",
nargs='+', # accumulate-multiple-args
action="store",
dest='tpt_protos',
default=['tcp'],
help="Transport protocol to use under the `tractor.ipc.Channel`",
)
def pytest_configure(config):
backend = config.option.spawn_backend
tractor._spawn.try_set_start_method(backend)
@pytest.fixture(scope='session')
def debug_mode(request) -> bool:
'''
Flag state for whether `--tpdb` (for `tractor`-py-debugger)
was passed to the test run.
Normally tests should pass this directly to `.open_root_actor()`
to allow the user to opt into suite-wide crash handling.
'''
debug_mode: bool = request.config.option.tractor_debug_mode
return debug_mode
@pytest.fixture(scope='session')
def spawn_backend(request) -> str:
return request.config.option.spawn_backend
@pytest.fixture(scope='session')
def tpt_protos(request) -> list[str]:
# allow quoting on CLI
proto_keys: list[str] = [
proto_key.replace('"', '').replace("'", "")
for proto_key in request.config.option.tpt_protos
]
# ?TODO, eventually support multiple protos per test-sesh?
if len(proto_keys) > 1:
pytest.fail(
'We only support one `--tpt-proto <key>` atm!\n'
)
# XXX ensure we support the protocol by name via lookup!
for proto_key in proto_keys:
addr_type = tractor._addr._address_types[proto_key]
assert addr_type.proto_key == proto_key
yield proto_keys
@pytest.fixture(
scope='session',
autouse=True,
)
def tpt_proto(
tpt_protos: list[str],
) -> str:
proto_key: str = tpt_protos[0]
from tractor import _state
if _state._def_tpt_proto != proto_key:
_state._def_tpt_proto = proto_key
yield proto_key
@pytest.fixture(scope='session')
def reg_addr(
tpt_proto: str,
) -> tuple[str, int|str]:
'''
Deliver a test-session-unique registry address such that each
run (of tests which use this fixture) will have no
conflicts/cross-talk when running simultaneously, nor will it
interfere with other live `tractor` apps active on the same
network-host (namespace).
'''
from tractor._testing.addr import get_rando_addr
return get_rando_addr(
tpt_proto=tpt_proto,
)
def pytest_generate_tests(
metafunc: pytest.Metafunc,
):
spawn_backend: str = metafunc.config.option.spawn_backend
if not spawn_backend:
# XXX some weird windows bug with `pytest`?
spawn_backend = 'trio'
# TODO: maybe just use the literal `._spawn.SpawnMethodKey`?
assert spawn_backend in (
'mp_spawn',
'mp_forkserver',
'trio',
)
# NOTE: this used to be used to dynamically parametrize tests for when
# you just passed --spawn-backend=`mp` on the cli, but now we expect
# that cli input to be manually specified, BUT, maybe we'll do
# something like this again in the future?
if 'start_method' in metafunc.fixturenames:
metafunc.parametrize(
"start_method",
[spawn_backend],
scope='module',
)
# TODO, parametrize any `tpt_proto: str` declaring tests!
# proto_tpts: list[str] = metafunc.config.option.proto_tpts
# if 'tpt_proto' in metafunc.fixturenames:
# metafunc.parametrize(
# 'tpt_proto',
# proto_tpts, # TODO, double check this list usage!
# scope='module',
# )
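
Tying the plugin bits together, a hedged sketch of a test consuming the session fixtures above via the `tractor_test` decorator; the child-actor details are illustrative and the `_testing` re-export is assumed.

    import tractor
    from tractor._testing import tractor_test   # re-export assumed

    @tractor_test
    async def test_single_child(
        reg_addr: tuple[str, int|str],
        debug_mode: bool,
    ):
        # `reg_addr` is also forwarded to `open_root_actor()` by the
        # decorator per its docstring.
        async with tractor.open_nursery(
            debug_mode=debug_mode,
        ) as an:
            portal = await an.start_actor(
                'child',
                enable_modules=[__name__],
            )
            await portal.cancel_actor()
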

View File

@ -0,0 +1,35 @@
import os
import random
def generate_sample_messages(
amount: int,
rand_min: int = 0,
rand_max: int = 0,
silent: bool = False
) -> tuple[list[bytes], int]:
msgs = []
size = 0
if not silent:
print(f'\ngenerating {amount} messages...')
for i in range(amount):
msg = f'[{i:08}]'.encode('utf-8')
if rand_max > 0:
msg += os.urandom(
random.randint(rand_min, rand_max))
size += len(msg)
msgs.append(msg)
if not silent and i and i % 10_000 == 0:
print(f'{i} generated')
if not silent:
print(f'done, {size:,} bytes in total')
return msgs, size
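
A quick usage sketch, assuming `generate_sample_messages()` is imported from the module above:

    msgs, total_size = generate_sample_messages(
        amount=1_000,
        rand_min=16,
        rand_max=256,
        silent=True,
    )
    assert len(msgs) == 1_000
    assert total_size == sum(len(m) for m in msgs)
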

View File

@ -20,7 +20,7 @@ Runtime "developer experience" utils and addons to aid our
and working with/on the actor runtime.
"""
from ._debug import (
from .debug import (
maybe_wait_for_debugger as maybe_wait_for_debugger,
acquire_debug_lock as acquire_debug_lock,
breakpoint as breakpoint,

File diff suppressed because it is too large Load Diff

View File

@ -20,13 +20,18 @@ as it pertains to improving the grok-ability of our runtime!
'''
from __future__ import annotations
from contextlib import (
_GeneratorContextManager,
_AsyncGeneratorContextManager,
)
from functools import partial
import inspect
import textwrap
from types import (
FrameType,
FunctionType,
MethodType,
# CodeType,
CodeType,
)
from typing import (
Any,
@ -34,6 +39,9 @@ from typing import (
Type,
)
import pdbp
from tractor.log import get_logger
import trio
from tractor.msg import (
pretty_struct,
NamespacePath,
@ -41,6 +49,8 @@ from tractor.msg import (
import wrapt
log = get_logger(__name__)
# TODO: yeah, i don't love this and we should prolly just
# write a decorator that actually keeps a stupid ref to the func
# obj..
@ -301,3 +311,70 @@ def api_frame(
# error_set: set[BaseException],
# ) -> TracebackType:
# ...
def hide_runtime_frames() -> dict[FunctionType, CodeType]:
'''
Hide call-stack frames for various std-lib and `trio`-API primitives
such that the tracebacks presented from our runtime are as minimized
as possible, particularly from inside a `PdbREPL`.
'''
# XXX HACKZONE XXX
# hide exit stack frames on nurseries and cancel-scopes!
# |_ so avoid seeing it when the `pdbp` REPL is first engaged from
# inside a `trio.open_nursery()` scope (with no line after it
# before the block end??).
#
# TODO: FINALLY got this workin originally with
# `@pdbp.hideframe` around the `wrapper()` def embedded inside
# `_ki_protection_decorator()`.. which is in the module:
# /home/goodboy/.virtualenvs/tractor311/lib/python3.11/site-packages/trio/_core/_ki.py
#
# -[ ] make an issue and patch for `trio` core? maybe linked
# to the long outstanding `pdb` one below?
# |_ it's funny that there's frame hiding throughout `._run.py`
# but not where it matters on the below exit funcs..
#
# -[ ] provide a patchset for the longstanding
# |_ https://github.com/python-trio/trio/issues/1155
#
# -[ ] make a linked issue to ^ and propose allowing all the
# `._core._run` code to have their `__tracebackhide__` value
# configurable by a `RunVar` to allow getting scheduler frames
# if desired through configuration?
#
# -[ ] maybe dig into the core `pdb` issue why the extra frame is shown
# at all?
#
funcs: list[FunctionType] = [
trio._core._run.NurseryManager.__aexit__,
trio._core._run.CancelScope.__exit__,
_GeneratorContextManager.__exit__,
_AsyncGeneratorContextManager.__aexit__,
_AsyncGeneratorContextManager.__aenter__,
trio.Event.wait,
]
func_list_str: str = textwrap.indent(
"\n".join(f.__qualname__ for f in funcs),
prefix=' |_ ',
)
log.devx(
'Hiding the following runtime frames by default:\n'
f'{func_list_str}\n'
)
codes: dict[FunctionType, CodeType] = {}
for ref in funcs:
# stash a pre-modified version of each ref's code-obj
# so it can be reverted later if needed.
codes[ref] = ref.__code__
pdbp.hideframe(ref)
#
# pdbp.hideframe(trio._core._run.NurseryManager.__aexit__)
# pdbp.hideframe(trio._core._run.CancelScope.__exit__)
# pdbp.hideframe(_GeneratorContextManager.__exit__)
# pdbp.hideframe(_AsyncGeneratorContextManager.__aexit__)
# pdbp.hideframe(_AsyncGeneratorContextManager.__aenter__)
# pdbp.hideframe(trio.Event.wait)
return codes
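
A hedged usage sketch; the import path is assumed (the filename is not shown in this diff) and the revert loop follows the "stash a pre-modified version" note in the code above.

    from types import CodeType, FunctionType
    from tractor.devx import _frame_stack   # module path assumed

    saved: dict[FunctionType, CodeType] = _frame_stack.hide_runtime_frames()

    # ... run the debug/REPL session ...

    # revert later if ever needed (sketch only):
    for fn, code in saved.items():
        fn.__code__ = code
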

View File

@ -49,7 +49,7 @@ from tractor import (
_state,
log as logmod,
)
from tractor.devx import _debug
from tractor.devx import debug
log = logmod.get_logger(__name__)
@ -82,7 +82,7 @@ def dump_task_tree() -> None:
if (
current_sigint_handler
is not
_debug.DebugStatus._trio_handler
debug.DebugStatus._trio_handler
):
sigint_handler_report: str = (
'The default `trio` SIGINT handler was replaced?!'
@ -238,7 +238,8 @@ def enable_stack_on_sig(
import stackscope
except ImportError:
log.warning(
'`stackscope` not installed for use in debug mode!'
'The `stackscope` lib is not installed!\n'
'Ignoring `enable_stack_on_sig()` call!\n'
)
return None
@ -255,8 +256,8 @@ def enable_stack_on_sig(
dump_tree_on_sig,
)
log.devx(
'Enabling trace-trees on `SIGUSR1` '
'since `stackscope` is installed @ \n'
f'Enabling trace-trees on `SIGUSR1` '
f'since `stackscope` is installed @ \n'
f'{stackscope!r}\n\n'
f'With `SIGUSR1` handler\n'
f'|_{dump_tree_on_sig}\n'

View File

@ -0,0 +1,100 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Affero General Public License
# as published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public
# License along with this program. If not, see
# <https://www.gnu.org/licenses/>.
'''
Multi-actor debugging for da peeps!
'''
from __future__ import annotations
from tractor.log import get_logger
from ._repl import (
PdbREPL as PdbREPL,
mk_pdb as mk_pdb,
TractorConfig as TractorConfig,
)
from ._tty_lock import (
DebugStatus as DebugStatus,
DebugStateError as DebugStateError,
)
from ._trace import (
Lock as Lock,
_pause_msg as _pause_msg,
_repl_fail_msg as _repl_fail_msg,
_set_trace as _set_trace,
_sync_pause_from_builtin as _sync_pause_from_builtin,
breakpoint as breakpoint,
maybe_init_greenback as maybe_init_greenback,
maybe_import_greenback as maybe_import_greenback,
pause as pause,
pause_from_sync as pause_from_sync,
)
from ._post_mortem import (
BoxedMaybeException as BoxedMaybeException,
maybe_open_crash_handler as maybe_open_crash_handler,
open_crash_handler as open_crash_handler,
post_mortem as post_mortem,
_crash_msg as _crash_msg,
_maybe_enter_pm as _maybe_enter_pm,
)
from ._sync import (
maybe_wait_for_debugger as maybe_wait_for_debugger,
acquire_debug_lock as acquire_debug_lock,
)
from ._sigint import (
sigint_shield as sigint_shield,
_ctlc_ignore_header as _ctlc_ignore_header
)
log = get_logger(__name__)
# ----------------
# XXX PKG TODO XXX
# ----------------
# refine the internal impl and APIs!
#
# -[ ] rework `._pause()` and it's branch-cases for root vs.
# subactor:
# -[ ] `._pause_from_root()` + `_pause_from_subactor()`?
# -[ ] do the de-factor based on bg-thread usage in
# `.pause_from_sync()` & `_pause_from_bg_root_thread()`.
# -[ ] drop `debug_func == None` case which is confusing af..
# -[ ] factor out `_enter_repl_sync()` into a util func for calling
# the `_set_trace()` / `_post_mortem()` APIs?
#
# -[ ] figure out if we need `acquire_debug_lock()` and/or re-implement
# it as part of the `.pause_from_sync()` rework per above?
#
# -[ ] pair the `._pause_from_subactor()` impl with a "debug nursery"
# that's dynamically allocated inside the `._rpc` task thus
# avoiding the `._service_n.start()` usage for the IPC request?
# -[ ] see the TODO inside `._rpc._errors_relayed_via_ipc()`
#
# -[ ] impl a `open_debug_request()` which encaps all
# `request_root_stdio_lock()` task scheduling deats
# + `DebugStatus` state mgmt; which should prolly be re-branded as
# a `DebugRequest` type anyway AND with support for bg-thread
# (from root actor) usage?
#
# -[ ] handle the `xonsh` case for bg-root-threads in the SIGINT
# handler!
# -[ ] do we need to do the same for subactors?
# -[ ] make the failing tests finally pass XD
#
# -[ ] simplify `maybe_wait_for_debugger()` to be a root-task only
# API?
# -[ ] currently it's implemented as that so might as well make it
# formal?

View File

@ -0,0 +1,412 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Affero General Public License
# as published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public
# License along with this program. If not, see
# <https://www.gnu.org/licenses/>.
'''
Post-mortem debugging APIs and surrounding machinery for both
sync and async contexts.
Generally we maintain the same semantics as `pdb.post_mortem()` but
with actor-tree-wide sync/cooperation around any (sub)actor's use of
the root's TTY.
'''
from __future__ import annotations
import bdb
from contextlib import (
AbstractContextManager,
contextmanager as cm,
nullcontext,
)
from functools import (
partial,
)
import inspect
import sys
import traceback
from typing import (
Callable,
Sequence,
Type,
TYPE_CHECKING,
)
from types import (
TracebackType,
FrameType,
)
from msgspec import Struct
import trio
from tractor._exceptions import (
NoRuntime,
)
from tractor import _state
from tractor._state import (
current_actor,
debug_mode,
)
from tractor.log import get_logger
from tractor.trionics import (
is_multi_cancelled,
)
from ._trace import (
_pause,
)
from ._tty_lock import (
DebugStatus,
)
from ._repl import (
PdbREPL,
mk_pdb,
TractorConfig as TractorConfig,
)
if TYPE_CHECKING:
from trio.lowlevel import Task
from tractor._runtime import (
Actor,
)
_crash_msg: str = (
'Opening a pdb REPL in crashed actor'
)
log = get_logger(__package__)
class BoxedMaybeException(Struct):
'''
Box a maybe-exception for post-crash introspection usage
from the body of a `open_crash_handler()` scope.
'''
value: BaseException|None = None
# handler can suppress crashes dynamically
raise_on_exit: bool|Sequence[Type[BaseException]] = True
def pformat(self) -> str:
'''
Repr the boxed `.value` error in more-than-string
repr form.
'''
if not self.value:
return f'<{type(self).__name__}( .value=None )>'
return (
f'<{type(self.value).__name__}(\n'
f' |_.value = {self.value}\n'
f')>\n'
)
__repr__ = pformat
def _post_mortem(
repl: PdbREPL, # normally passed by `_pause()`
# XXX all `partial`-ed in by `post_mortem()` below!
tb: TracebackType,
api_frame: FrameType,
shield: bool = False,
hide_tb: bool = True,
# maybe pre/post REPL entry
repl_fixture: (
AbstractContextManager[bool]
|None
) = None,
boxed_maybe_exc: BoxedMaybeException|None = None,
) -> None:
'''
Enter the ``pdbpp`` post-mortem entrypoint using our custom
debugger instance.
'''
__tracebackhide__: bool = hide_tb
# maybe enter any user fixture
enter_repl: bool = DebugStatus.maybe_enter_repl_fixture(
repl=repl,
repl_fixture=repl_fixture,
boxed_maybe_exc=boxed_maybe_exc,
)
try:
if not enter_repl:
# XXX, trigger `.release()` below immediately!
return
try:
actor: Actor = current_actor()
actor_repr: str = str(actor.uid)
# ^TODO, instead a nice runtime-info + maddr + uid?
# -[ ] impl an `Actor.__repr__()`??
# |_ <task>:<thread> @ <actor>
except NoRuntime:
actor_repr: str = '<no-actor-runtime?>'
try:
task_repr: Task = trio.lowlevel.current_task()
except RuntimeError:
task_repr: str = '<unknown-Task>'
# TODO: print the actor supervision tree up to the root
# here! Bo
log.pdb(
f'{_crash_msg}\n'
f'x>(\n'
f' |_ {task_repr} @ {actor_repr}\n'
)
# XXX NOTE(s) on `pdbp.xpm()` version..
#
# - seems to lose the up-stack tb-info?
# - currently we're (only) replacing this from `pdbp.xpm()`
# to add the `end=''` to the print XD
#
print(traceback.format_exc(), end='')
caller_frame: FrameType = api_frame.f_back
# NOTE, see the impl details of these in the lib to
# understand usage:
# - `pdbp.post_mortem()`
# - `pdbp.xps()`
# - `bdb.interaction()`
repl.reset()
repl.interaction(
frame=caller_frame,
# frame=None,
traceback=tb,
)
finally:
# XXX NOTE XXX: this is abs required to avoid hangs!
#
# Since we presume the post-mortem was engaged due to
# a task-ending error, we MUST release the local REPL request
# so that no other local task nor the root remains blocked!
DebugStatus.release()
async def post_mortem(
*,
tb: TracebackType|None = None,
api_frame: FrameType|None = None,
hide_tb: bool = False,
# TODO: support shield here just like in `pause()`?
# shield: bool = False,
**_pause_kwargs,
) -> None:
'''
Our builtin async equivalent of `pdb.post_mortem()` which can be
used inside exception handlers.
It's also used for the crash handler when `debug_mode == True` ;)
'''
__tracebackhide__: bool = hide_tb
tb: TracebackType = tb or sys.exc_info()[2]
# TODO: do upward stack scan for highest @api_frame and
# use its parent frame as the expected user-app code
# interact point.
api_frame: FrameType = api_frame or inspect.currentframe()
# TODO, move to submod `._pausing` or ._api? _trace
await _pause(
debug_func=partial(
_post_mortem,
api_frame=api_frame,
tb=tb,
),
hide_tb=hide_tb,
**_pause_kwargs
)
async def _maybe_enter_pm(
err: BaseException,
*,
tb: TracebackType|None = None,
api_frame: FrameType|None = None,
hide_tb: bool = True,
# only enter debugger REPL when returns `True`
debug_filter: Callable[
[BaseException|BaseExceptionGroup],
bool,
] = lambda err: not is_multi_cancelled(err),
**_pause_kws,
):
if (
debug_mode()
# NOTE: don't enter debug mode recursively after quitting pdb
# Iow, don't re-enter the repl if the `quit` command was issued
# by the user.
and not isinstance(err, bdb.BdbQuit)
# XXX: if the error is the likely result of runtime-wide
# cancellation, we don't want to enter the debugger since
# there's races between when the parent actor has killed all
# comms and when the child tries to contact said parent to
# acquire the tty lock.
# Really we just want to mostly avoid catching KBIs here so there
# might be a simpler check we can do?
and
debug_filter(err)
):
api_frame: FrameType = api_frame or inspect.currentframe()
tb: TracebackType = tb or sys.exc_info()[2]
await post_mortem(
api_frame=api_frame,
tb=tb,
**_pause_kws,
)
return True
else:
return False
# TODO: better naming and what additionals?
# - [ ] optional runtime plugging?
# - [ ] detection for sync vs. async code?
# - [ ] specialized REPL entry when in distributed mode?
# -[x] hide tb by def
# - [x] allow ignoring kbi Bo
@cm
def open_crash_handler(
catch: set[Type[BaseException]] = {
BaseException,
},
ignore: set[Type[BaseException]] = {
KeyboardInterrupt,
trio.Cancelled,
},
hide_tb: bool = True,
repl_fixture: (
AbstractContextManager[bool] # pre/post REPL entry
|None
) = None,
raise_on_exit: bool|Sequence[Type[BaseException]] = True,
):
'''
Generic "post mortem" crash handler using `pdbp` REPL debugger.
We expose this as a CLI-framework addon for both `click` and
`typer` users so they can quickly wrap cmd endpoints to engage
the runtime's `debug_mode: bool` and a `pdbp.pm()`-style REPL
around any PRE-runtime-entry code, i.e. any sync code which runs
BEFORE the main call to `trio.run()`.
'''
__tracebackhide__: bool = hide_tb
# TODO, yield a `outcome.Error`-like boxed type?
# -[~] use `outcome.Value/Error` X-> frozen!
# -[x] write our own..?
# -[ ] consider just wtv is used by `pytest.raises()`?
#
boxed_maybe_exc = BoxedMaybeException(
raise_on_exit=raise_on_exit,
)
err: BaseException
try:
yield boxed_maybe_exc
except tuple(catch) as err:
boxed_maybe_exc.value = err
if (
type(err) not in ignore
and
not is_multi_cancelled(
err,
ignore_nested=ignore
)
):
try:
# use our re-impl-ed version of `pdbp.xpm()`
_post_mortem(
repl=mk_pdb(),
tb=sys.exc_info()[2],
api_frame=inspect.currentframe().f_back,
hide_tb=hide_tb,
repl_fixture=repl_fixture,
boxed_maybe_exc=boxed_maybe_exc,
)
except bdb.BdbQuit:
__tracebackhide__: bool = False
raise err
if (
raise_on_exit is True
or (
raise_on_exit is not False
and (
set(raise_on_exit)
and
type(err) in raise_on_exit
)
)
and
boxed_maybe_exc.raise_on_exit == raise_on_exit
):
raise err
@cm
def maybe_open_crash_handler(
pdb: bool|None = None,
hide_tb: bool = True,
**kwargs,
):
'''
Same as `open_crash_handler()` but with bool input flag
to allow conditional handling.
Normally this is used with CLI endpoints such that if the `--pdb`
flag is passed the pdb REPL is engaged on any crashes B)
'''
__tracebackhide__: bool = hide_tb
if pdb is None:
pdb: bool = _state.is_debug_mode()
rtctx = nullcontext(
enter_result=BoxedMaybeException()
)
if pdb:
rtctx = open_crash_handler(
hide_tb=hide_tb,
**kwargs,
)
with rtctx as boxed_maybe_exc:
yield boxed_maybe_exc
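# Usage sketch (not from this diff): wrapping sync, pre-`trio.run()` setup
# code so a crash engages the REPL; `raise_on_exit=False` keeps the boxed
# error for inspection instead of re-raising it after the REPL exits.
def _crash_handler_usage_example() -> None:
    with open_crash_handler(raise_on_exit=False) as boxed:
        int('not-an-int')  # any raising setup code -> REPL
    # the caught error (if any) remains available on the box afterwards
    print(boxed.value)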

View File

@ -0,0 +1,207 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Affero General Public License
# as published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public
# License along with this program. If not, see
# <https://www.gnu.org/licenses/>.
'''
`pdbp.Pdb` extensions/customization and other delegate usage.
'''
from functools import (
cached_property,
)
import os
import pdbp
from tractor._state import (
is_root_process,
)
from ._tty_lock import (
Lock,
DebugStatus,
)
class TractorConfig(pdbp.DefaultConfig):
'''
Custom `pdbp` config which tries to use the best tradeoff
between pretty and minimal.
'''
use_pygments: bool = True
sticky_by_default: bool = False
enable_hidden_frames: bool = True
# much thanks @mdmintz for the hot tip!
# fixes line spacing issue when resizing terminal B)
truncate_long_lines: bool = False
# ------ - ------
# our own custom config vars mostly
# for syncing with the actor tree's singleton
# TTY `Lock`.
class PdbREPL(pdbp.Pdb):
'''
Add teardown hooks and local state describing any
ongoing TTY `Lock` request dialog.
'''
# override the pdbp config with our coolio one
# NOTE: this is only loaded when no `~/.pdbrc` exists
# so we should prolly pass it into the .__init__() instead?
# i dunno, see the `DefaultFactory` and `pdb.Pdb` impls.
DefaultConfig = TractorConfig
status = DebugStatus
# NOTE: see details in stdlib's `bdb.py`
# def user_exception(self, frame, exc_info):
# '''
# Called when we stop on an exception.
# '''
# log.warning(
# 'Exception during REPL sesh\n\n'
# f'{frame}\n\n'
# f'{exc_info}\n\n'
# )
# NOTE: this actually hooks but i don't see anyway to detect
# if an error was caught.. this is why currently we just always
# call `DebugStatus.release` inside `_post_mortem()`.
# def preloop(self):
# print('IN PRELOOP')
# super().preloop()
# TODO: cleaner re-wrapping of all this?
# -[ ] figure out how to disallow recursive .set_trace() entry
# since that'll cause deadlock for us.
# -[ ] maybe a `@cm` to call `super().<same_meth_name>()`?
# -[ ] look at hooking into the `pp` hook specially with our
# own set of pretty-printers?
# * `.pretty_struct.Struct.pformat()`
# * `.pformat(MsgType.pld)`
# * `.pformat(Error.tb_str)`?
# * .. maybe more?
#
def set_continue(self):
try:
super().set_continue()
finally:
# NOTE: for subactors the stdio lock is released via the
# allocated RPC locker task, so for root we have to do it
# manually.
if (
is_root_process()
and
Lock._debug_lock.locked()
and
DebugStatus.is_main_trio_thread()
):
# Lock.release(raise_on_thread=False)
Lock.release()
# XXX AFTER `Lock.release()` for root local repl usage
DebugStatus.release()
def set_quit(self):
try:
super().set_quit()
finally:
if (
is_root_process()
and
Lock._debug_lock.locked()
and
DebugStatus.is_main_trio_thread()
):
# Lock.release(raise_on_thread=False)
Lock.release()
# XXX after `Lock.release()` for root local repl usage
DebugStatus.release()
# XXX NOTE: we only override this because apparently the stdlib pdb
# bois likes to touch the SIGINT handler as much as i like to touch
# my d$%&.
def _cmdloop(self):
self.cmdloop()
@cached_property
def shname(self) -> str | None:
'''
Attempt to return the login shell name with a special check for
the infamous `xonsh` since it seems to have some issues much
different from std shells when it comes to flushing the prompt?
'''
# SUPER HACKY and only really works if `xonsh` is not used
# before spawning further sub-shells..
shpath = os.getenv('SHELL', None)
if shpath:
if (
os.getenv('XONSH_LOGIN', default=False)
or 'xonsh' in shpath
):
return 'xonsh'
return os.path.basename(shpath)
return None
def mk_pdb() -> PdbREPL:
'''
Deliver a new `PdbREPL`: a multi-process safe `pdbp.Pdb`-variant
using the magic of `tractor`'s SC-safe IPC.
B)
Our `pdb.Pdb` subtype accomplishes multi-process safe debugging
by:
- mutexing access to the root process' std-streams (& thus parent
process TTY) via an IPC managed `Lock` singleton per
actor-process tree.
- temporarily overriding any subactor's SIGINT handler to shield
during live REPL sessions in sub-actors such that cancellation
is never (mistakenly) triggered by a ctrl-c and instead only by
explicit runtime API requests or after the
`pdb.Pdb.interaction()` call has returned.
FURTHER, the `pdbp.Pdb` instance is configured to be `trio`
"compatible" from a SIGINT handling perspective; we mask out
the default `pdb` handler and instead apply `trio`s default
which mostly addresses all issues described in:
- https://github.com/python-trio/trio/issues/1155
The instance returned from this factory should always be
preferred over the default `pdb[p].set_trace()` whenever using
a `pdb` REPL inside a `trio` based runtime.
'''
pdb = PdbREPL()
# XXX: These are the important flags mentioned in
# https://github.com/python-trio/trio/issues/1155
# which resolve the traceback spews to console.
pdb.allow_kbdint = True
pdb.nosigint = True
return pdb
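# Usage sketch (not from this diff): a manual post-mortem using the factory,
# mirroring what the higher-level crash handlers do internally.
def _mk_pdb_usage_example() -> None:
    import sys
    try:
        int('not-an-int')
    except ValueError:
        repl = mk_pdb()  # trio/SIGINT-friendly `pdbp.Pdb` variant
        repl.reset()
        repl.interaction(None, sys.exc_info()[2])  # REPL on the active tb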

View File

@ -0,0 +1,333 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Affero General Public License
# as published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public
# License along with this program. If not, see
# <https://www.gnu.org/licenses/>.
'''
A custom SIGINT handler which mainly shields actor (task)
cancellation during REPL interaction.
'''
from __future__ import annotations
from typing import (
TYPE_CHECKING,
)
import trio
from tractor.log import get_logger
from tractor._state import (
current_actor,
is_root_process,
)
from ._repl import (
PdbREPL,
)
from ._tty_lock import (
any_connected_locker_child,
DebugStatus,
Lock,
)
if TYPE_CHECKING:
from tractor.ipc import (
Channel,
)
from tractor._runtime import (
Actor,
)
log = get_logger(__name__)
_ctlc_ignore_header: str = (
'Ignoring SIGINT while debug REPL in use'
)
def sigint_shield(
signum: int,
frame: 'frame', # type: ignore # noqa
*args,
) -> None:
'''
Specialized, debugger-aware SIGINT handler.
In children we always ignore/shield for SIGINT to avoid
deadlocks since cancellation should always be managed by the
supervising parent actor. The root actor-process is always
cancelled on ctrl-c.
'''
__tracebackhide__: bool = True
actor: Actor = current_actor()
def do_cancel():
# If we haven't tried to cancel the runtime then do that instead
# of raising a KBI (which may non-gracefully destroy
# a ``trio.run()``).
if not actor._cancel_called:
actor.cancel_soon()
# If the runtime is already cancelled it likely means the user
# hit ctrl-c again because teardown didn't fully take place in
# which case we do the "hard" raising of a local KBI.
else:
raise KeyboardInterrupt
# only set in the actor actually running the REPL
repl: PdbREPL|None = DebugStatus.repl
# TODO: maybe we should flatten out all these cases using
# a match/case?
#
# root actor branch that reports whether or not a child
# has locked debugger.
if is_root_process():
# log.warning(
log.devx(
'Handling SIGINT in root actor\n'
f'{Lock.repr()}'
f'{DebugStatus.repr()}\n'
)
# try to see if the supposed (sub)actor in debug still
# has an active connection to *this* actor, and if not
# it's likely they aren't using the TTY lock / debugger
# and we should propagate SIGINT normally.
any_connected: bool = any_connected_locker_child()
problem = (
f'root {actor.uid} handling SIGINT\n'
f'any_connected: {any_connected}\n\n'
f'{Lock.repr()}\n'
)
if (
(ctx := Lock.ctx_in_debug)
and
(uid_in_debug := ctx.chan.uid) # "someone" is (ostensibly) using debug `Lock`
):
name_in_debug: str = uid_in_debug[0]
assert not repl
# if not repl: # but it's NOT us, the root actor.
# sanity: since no repl ref is set, we def shouldn't
# be the lock owner!
assert name_in_debug != 'root'
# IDEAL CASE: child has REPL as expected
if any_connected: # there are subactors we can contact
# XXX: only if there is an existing connection to the
# (sub-)actor in debug do we ignore SIGINT in this
# parent! Otherwise we may hang waiting for an actor
# which has already terminated to unlock.
#
# NOTE: don't emit this with `.pdb()` level in
# root without a higher level.
log.runtime(
_ctlc_ignore_header
+
f' by child '
f'{uid_in_debug}\n'
)
problem = None
else:
problem += (
'\n'
f'A `pdb` REPL is SUPPOSEDLY in use by child {uid_in_debug}\n'
f'BUT, no child actors are IPC contactable!?!?\n'
)
# IDEAL CASE: root has REPL as expected
else:
# root actor still has this SIGINT handler active without
# an actor using the `Lock` (a bug state) ??
# => so immediately cancel any stale lock cs and revert
# the handler!
if not DebugStatus.repl:
# TODO: WHEN should we revert back to ``trio``
# handler if this one is stale?
# -[ ] maybe after a certain count of ctl-c mashes?
# -[ ] use a state var like `stale_handler: bool`?
problem += (
'No subactor is using a `pdb` REPL according to `Lock.ctx_in_debug`?\n'
'BUT, the root should be using it, WHY this handler ??\n\n'
'So either..\n'
'- some root-thread is using it but has no `.repl` set?, OR\n'
'- something else weird is going on outside the runtime!?\n'
)
else:
# NOTE: since we emit this msg on ctl-c, we should
# also always re-print the prompt in the tail block!
log.pdb(
_ctlc_ignore_header
+
f' by root actor..\n'
f'{DebugStatus.repl_task}\n'
f' |_{repl}\n'
)
problem = None
# XXX if one is set it means we ARE NOT operating an ideal
# case where a child subactor or us (the root) has the
# lock without any other detected problems.
if problem:
# detect, report and maybe clear a stale lock request
# cancel scope.
lock_cs: trio.CancelScope = Lock.get_locking_task_cs()
maybe_stale_lock_cs: bool = (
lock_cs is not None
and not lock_cs.cancel_called
)
if maybe_stale_lock_cs:
problem += (
'\n'
'Stale `Lock.ctx_in_debug._scope: CancelScope` detected?\n'
f'{Lock.ctx_in_debug}\n\n'
'-> Calling ctx._scope.cancel()!\n'
)
lock_cs.cancel()
# TODO: when do we actually want/need this, see above.
# DebugStatus.unshield_sigint()
log.warning(problem)
# child actor that has locked the debugger
elif not is_root_process():
log.debug(
f'Subactor {actor.uid} handling SIGINT\n\n'
f'{Lock.repr()}\n'
)
rent_chan: Channel = actor._parent_chan
if (
rent_chan is None
or
not rent_chan.connected()
):
log.warning(
'This sub-actor thinks it is debugging '
'but it has no connection to its parent ??\n'
f'{actor.uid}\n'
'Allowing SIGINT propagation..'
)
DebugStatus.unshield_sigint()
repl_task: str|None = DebugStatus.repl_task
req_task: str|None = DebugStatus.req_task
if (
repl_task
and
repl
):
log.pdb(
_ctlc_ignore_header
+
f' by local task\n\n'
f'{repl_task}\n'
f' |_{repl}\n'
)
elif req_task:
log.debug(
_ctlc_ignore_header
+
f' by local request-task and either,\n'
f'- someone else already has the REPL and the `Lock`, or\n'
f'- some other local task is already REPL-ing?\n\n'
f'{req_task}\n'
)
# TODO can we remove this now?
# -[ ] does this path ever get hit any more?
else:
msg: str = (
'SIGINT shield handler still active BUT, \n\n'
)
if repl_task is None:
msg += (
'- No local task claims to be in debug?\n'
)
if repl is None:
msg += (
'- No local REPL is currently active?\n'
)
if req_task is None:
msg += (
'- No debug request task is active?\n'
)
log.warning(
msg
+
'Reverting handler to `trio` default!\n'
)
DebugStatus.unshield_sigint()
# XXX ensure that the reverted-to-handler actually is
# able to rx what should have been **this** KBI ;)
do_cancel()
# TODO: how to handle the case of an intermediary-child actor
# that **is not** marked in debug mode? See outstanding issue:
# https://github.com/goodboy/tractor/issues/320
# elif debug_mode():
# maybe redraw/print last REPL output to console since
# we want to alert the user that more input is expected since
# nothing has been done due to ignoring sigint.
if (
DebugStatus.repl # only when current actor has a REPL engaged
):
flush_status: str = (
'Flushing stdout to ensure new prompt line!\n'
)
# XXX: yah, mega hack, but how else do we catch this madness XD
if (
repl.shname == 'xonsh'
):
flush_status += (
'-> ALSO re-flushing due to `xonsh`..\n'
)
repl.stdout.write(repl.prompt)
# log.warning(
log.devx(
flush_status
)
repl.stdout.flush()
# TODO: better console UX to match the current "mode":
# -[ ] for example if in sticky mode where if there is output
# detected as written to the tty we redraw this part underneath
# and erase the past draw of this same bit above?
# repl.sticky = True
# repl._print_if_sticky()
# also see these links for an approach from `ptk`:
# https://github.com/goodboy/tractor/issues/130#issuecomment-663752040
# https://github.com/prompt-toolkit/python-prompt-toolkit/blob/c2c6af8a0308f9e5d7c0e28cb8a02963fe0ce07a/prompt_toolkit/patch_stdout.py
else:
log.devx(
# log.warning(
'Not flushing stdout since not needed?\n'
f'|_{repl}\n'
)
# XXX only for tracing this handler
log.devx('exiting SIGINT')
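# Installation sketch (not from this diff): one way such a handler can be
# swapped in around a REPL session using the stdlib `signal` API; the
# actual call-site inside `tractor`'s runtime is not shown in this diff.
def _install_sigint_shield_example() -> None:
    import signal
    prior = signal.signal(signal.SIGINT, sigint_shield)
    try:
        ...  # REPL interaction happens while the shield is active
    finally:
        signal.signal(signal.SIGINT, prior)  # revert to the prior handler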

View File

@ -0,0 +1,220 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Affero General Public License
# as published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public
# License along with this program. If not, see
# <https://www.gnu.org/licenses/>.
'''
Debugger synchronization APIs to ensure orderly access and
non-TTY-clobbering graceful teardown.
'''
from __future__ import annotations
from contextlib import (
asynccontextmanager as acm,
)
from functools import (
partial,
)
from typing import (
AsyncGenerator,
Callable,
)
from tractor.log import get_logger
import trio
from trio.lowlevel import (
current_task,
Task,
)
from tractor._context import Context
from tractor._state import (
current_actor,
debug_mode,
is_root_process,
)
from ._repl import (
TractorConfig as TractorConfig,
)
from ._tty_lock import (
Lock,
request_root_stdio_lock,
any_connected_locker_child,
)
from ._sigint import (
sigint_shield as sigint_shield,
_ctlc_ignore_header as _ctlc_ignore_header
)
log = get_logger(__package__)
async def maybe_wait_for_debugger(
poll_steps: int = 2,
poll_delay: float = 0.1,
child_in_debug: bool = False,
header_msg: str = '',
_ll: str = 'devx',
) -> bool: # was locked and we polled?
if (
not debug_mode()
and
not child_in_debug
):
return False
logmeth: Callable = getattr(log, _ll)
msg: str = header_msg
if (
is_root_process()
):
# If we error in the root but the debugger is
# engaged we don't want to prematurely kill (and
# thus clobber access to) the local tty since it
# will make the pdb repl unusable.
# Instead try to wait for pdb to be released before
# tearing down.
ctx_in_debug: Context|None = Lock.ctx_in_debug
in_debug: tuple[str, str]|None = (
ctx_in_debug.chan.uid
if ctx_in_debug
else None
)
if in_debug == current_actor().uid:
log.debug(
msg
+
'Root already owns the TTY LOCK'
)
return True
elif in_debug:
msg += (
f'Debug `Lock` in use by subactor\n|\n|_{in_debug}\n'
)
# TODO: could this make things more deterministic?
# wait to see if a sub-actor task will be
# scheduled and grab the tty lock on the next
# tick?
# XXX => but it doesn't seem to work..
# await trio.testing.wait_all_tasks_blocked(cushion=0)
else:
logmeth(
msg
+
'Root immediately acquired debug TTY LOCK'
)
return False
for istep in range(poll_steps):
if (
Lock.req_handler_finished is not None
and not Lock.req_handler_finished.is_set()
and in_debug is not None
):
# caller_frame_info: str = pformat_caller_frame()
logmeth(
msg
+
'\n^^ Root is waiting on tty lock release.. ^^\n'
# f'{caller_frame_info}\n'
)
if not any_connected_locker_child():
Lock.get_locking_task_cs().cancel()
with trio.CancelScope(shield=True):
await Lock.req_handler_finished.wait()
log.devx(
f'Subactor released debug lock\n'
f'|_{in_debug}\n'
)
break
# is no subactor locking debugger currently?
if (
in_debug is None
and (
Lock.req_handler_finished is None
or Lock.req_handler_finished.is_set()
)
):
logmeth(
msg
+
'Root acquired tty lock!'
)
break
else:
logmeth(
'Root polling for debug:\n'
f'poll step: {istep}\n'
f'poll delay: {poll_delay}\n\n'
f'{Lock.repr()}\n'
)
with trio.CancelScope(shield=True):
await trio.sleep(poll_delay)
continue
return True
# else:
# # TODO: non-root call for #320?
# this_uid: tuple[str, str] = current_actor().uid
# async with acquire_debug_lock(
# subactor_uid=this_uid,
# ):
# pass
return False
@acm
async def acquire_debug_lock(
subactor_uid: tuple[str, str],
) -> AsyncGenerator[
trio.CancelScope|None,
tuple,
]:
'''
Request to acquire the TTY `Lock` in the root actor, release on
exit.
This helper is for actors that don't actually need to acquire
the debugger but want to wait until the lock is free in the
process-tree root such that they don't clobber an ongoing pdb
REPL session in some peer or child!
'''
if not debug_mode():
yield None
return
task: Task = current_task()
async with trio.open_nursery() as n:
ctx: Context = await n.start(
partial(
request_root_stdio_lock,
actor_uid=subactor_uid,
task_uid=(task.name, id(task)),
)
)
yield ctx
ctx.cancel()
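# Usage sketch (not from this diff): a subactor that wants to tear down
# without clobbering an ongoing REPL elsewhere in the tree; the uid arg and
# the teardown body are illustrative.
async def _acquire_debug_lock_usage_example(
    subactor_uid: tuple[str, str],
) -> None:
    # blocks until the root's TTY lock is free (no-op when not in debug mode)
    async with acquire_debug_lock(subactor_uid):
        ...  # safe to emit teardown output/logging here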

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -15,10 +15,13 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Pretty formatters for use throughout the code base.
Mostly handy for logging and exception message content.
Pretty formatters for use throughout our internals.
Handy for logging and exception message content but also for `repr()`
in REPL(s).
'''
import sys
import textwrap
import traceback
@ -115,6 +118,85 @@ def pformat_boxed_tb(
)
def pformat_exc(
exc: Exception,
header: str = '',
message: str = '',
body: str = '',
with_type_header: bool = True,
) -> str:
# XXX when the currently raised exception is this instance,
# we do not ever use the "type header" style repr.
is_being_raised: bool = False
if (
(curr_exc := sys.exception())
and
curr_exc is exc
):
is_being_raised: bool = True
with_type_header: bool = (
with_type_header
and
not is_being_raised
)
# <RemoteActorError( .. )> style
if (
with_type_header
and
not header
):
header: str = f'<{type(exc).__name__}('
message: str = (
message
or
exc.message
)
if message:
# split off the first line so, if needed, it isn't
# indented the same like the "boxed content" which
# since there is no `.tb_str` is just the `.message`.
lines: list[str] = message.splitlines()
first: str = lines[0]
message: str = message.removeprefix(first)
# with a type-style header we,
# - have no special message "first line" extraction/handling
# - place the message a space in from the header:
# `MsgTypeError( <message> ..`
# ^-here
# - indent the `.message` inside the type body.
if with_type_header:
first = f' {first} )>'
message: str = textwrap.indent(
message,
prefix=' '*2,
)
message: str = first + message
tail: str = ''
if (
with_type_header
and
not message
):
tail: str = '>'
return (
header
+
message
+
f'{body}'
+
tail
)
def pformat_caller_frame(
stack_limit: int = 1,
box_tb: bool = True,
@ -144,8 +226,8 @@ def pformat_cs(
field_prefix: str = ' |_',
) -> str:
'''
Pretty format info about a `trio.CancelScope` including most
of its public state and `._cancel_status`.
Pretty format info about a `trio.CancelScope` including most of
its public state and `._cancel_status`.
The output can be modified to show a "var name" for the
instance as a field prefix, just a simple str before each
@ -167,3 +249,279 @@ def pformat_cs(
+
fields
)
def nest_from_op(
input_op: str, # TODO, Literal of all op-"symbols" from below?
text: str,
prefix_op: bool = True, # unset is to suffix the first line
# optionally suffix `text`, by def on a newline
op_suffix='\n',
nest_prefix: str = '|_',
nest_indent: int|None = None,
# XXX indent `next_prefix` "to-the-right-of" `input_op`
# by this count of whitespaces (' ').
rm_from_first_ln: str|None = None,
) -> str:
'''
Depth-increment the provided (presumably hierarchy/supervision)
"tree string" below the given `input_op` execution operator,
i.e. inject a `"\n|_{input_op}\n"` and indent the `text` so the
nested content aligns with the op's last char.
'''
# `sclang` "structurred-concurrency-language": an ascii-encoded
# symbolic alphabet to describe concurrent systems.
#
# ?TODO? a more formal idea for a syntax for the state of
# concurrent systems as a "3-domain" (execution, scope, storage)
# model and using a minimal ascii/utf-8 operator-set.
#
# try not to take any of this seriously yet XD
#
# > is a "play operator" indicating (CPU bound)
# exec/work/ops required at the "lowest level computing"
#
# execution primitives (tasks, threads, actors..) denote their
# lifetime with '(' and ')' since parentheses normally are used
# in many langs to denote function calls.
#
# starting = (
# >( opening/starting; beginning of the thread-of-exec (toe?)
# (> opened/started, (finished spawning toe)
# |_<Task: blah blah..> repr of toe, in py these look like <objs>
#
# >) closing/exiting/stopping,
# )> closed/exited/stopped,
# |_<Task: blah blah..>
# [OR <), )< ?? ]
#
# ending = )
# >c) cancelling to close/exit
# c)> cancelled (caused close), OR?
# |_<Actor: ..>
# OR maybe "<c)" which better indicates the cancel being
# "delivered/returned" / returned" to LHS?
#
# >x) erroring to eventually exit
# x)> errored and terminated
# |_<Actor: ...>
#
# scopes: supers/nurseries, IPC-ctxs, sessions, perms, etc.
# >{ opening
# {> opened
# }> closed
# >} closing
#
# storage: like queues, shm-buffers, files, etc..
# >[ opening
# [> opened
# |_<FileObj: ..>
#
# >] closing
# ]> closed
# IPC ops: channels, transports, msging
# => req msg
# <= resp msg
# <=> 2-way streaming (of msgs)
# <- recv 1 msg
# -> send 1 msg
#
# TODO: still not sure on R/L-HS approach..?
# =>( send-req to exec start (task, actor, thread..)
# (<= recv-req to ^
#
# (<= recv-req ^
# <=( recv-resp opened remote exec primitive
# <=) recv-resp closed
#
# )<=c req to stop due to cancel
# c=>) req to stop due to cancel
#
# =>{ recv-req to open
# <={ send-status that it closed
#
if (
nest_prefix
and
nest_indent != 0
):
if nest_indent is not None:
nest_prefix: str = textwrap.indent(
nest_prefix,
prefix=nest_indent*' ',
)
nest_indent: int = len(nest_prefix)
# determine body-text indent either by,
# - using wtv explicit indent value is provided,
# OR
# - auto-calcing the indent to embed `text` under
# the `nest_prefix` if provided, **IFF** `nest_indent=None`.
tree_str_indent: int = 0
if nest_indent not in {0, None}:
tree_str_indent = nest_indent
elif (
nest_prefix
and
nest_indent != 0
):
tree_str_indent = len(nest_prefix)
indented_tree_str: str = text
if tree_str_indent:
indented_tree_str: str = textwrap.indent(
text,
prefix=' '*tree_str_indent,
)
# inject any provided nesting-prefix chars
# into the head of the first line.
if nest_prefix:
indented_tree_str: str = (
f'{nest_prefix}{indented_tree_str[tree_str_indent:]}'
)
if (
not prefix_op
or
rm_from_first_ln
):
tree_lns: list[str] = indented_tree_str.splitlines()
first: str = tree_lns[0]
if rm_from_first_ln:
first = first.strip().replace(
rm_from_first_ln,
'',
)
indented_tree_str: str = '\n'.join(tree_lns[1:])
if prefix_op:
indented_tree_str = (
f'{first}\n'
f'{indented_tree_str}'
)
if prefix_op:
return (
f'{input_op}{op_suffix}'
f'{indented_tree_str}'
)
else:
return (
f'{first}{input_op}{op_suffix}'
f'{indented_tree_str}'
)
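# Output sketch (not from this diff): the task/scope reprs are illustrative
# and the exact indentation follows the `nest_prefix`/`nest_indent` handling
# above.
def _nest_from_op_usage_example() -> None:
    print(nest_from_op(
        input_op='(>',  # "opened/started" execution op from the legend above
        text=(
            '<Task: spawn_worker>\n'
            '|_cs: CancelScope(...)\n'
        ),
        nest_indent=1,
    ))
    # roughly renders as,
    # (>
    #  |_<Task: spawn_worker>
    #    |_cs: CancelScope(...)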
# ------ modden.repr ------
# XXX originally taken verbatim from `modden.repr`
'''
More "multi-line" representation then the stdlib's `pprint` equivs.
'''
from inspect import (
FrameInfo,
stack,
)
import pprint
import reprlib
from typing import (
Callable,
)
def mk_repr(
**repr_kws,
) -> Callable[[str], str]:
'''
Allocate and deliver a `reprlib.Repr` instance with provided input
settings using the std-lib's `reprlib` mod,
* https://docs.python.org/3/library/reprlib.html
------ Ex. ------
An up to 6-layer-nested `dict` as multi-line:
- https://stackoverflow.com/a/79102479
- https://docs.python.org/3/library/reprlib.html#reprlib.Repr.maxlevel
'''
def_kws: dict[str, int] = dict(
indent=3, # indent used for repr of recursive objects
maxlevel=616, # recursion levels
maxdict=616, # max items shown for `dict`
maxlist=616, # max items shown for a `list`
maxstring=616, # match editor line-len limit
maxtuple=616, # match editor line-len limit
maxother=616, # match editor line-len limit
)
def_kws |= repr_kws
reprr = reprlib.Repr(**def_kws)
return reprr.repr
def ppfmt(
obj: object,
do_print: bool = False,
) -> str:
'''
The `pprint.pformat()` version of `pprint.pp()`, namely
a default `sort_dicts=False`.. (which i think should be
the normal default in the stdlib).
'''
pprepr: Callable = mk_repr()
repr_str: str = pprepr(obj)
if do_print:
return pprint.pp(repr_str)
return repr_str
pformat = ppfmt
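# Usage sketch (not from this diff): multi-line, insertion-ordered repr of
# a nested container via the `reprlib.Repr` wrapper above; assumes a Python
# version where `reprlib.Repr` accepts these constructor kwargs.
def _ppfmt_usage_example() -> None:
    data = {'z': {'y': {'x': [1, 2, 3]}}}
    print(ppfmt(data))  # indented and NOT key-sorted, unlike `pprint.pformat()`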
def pfmt_frame_info(fi: FrameInfo) -> str:
'''
Like a std `inspect.FrameInfo.__repr__()` but multi-line..
'''
return (
'FrameInfo(\n'
' frame={!r},\n'
' filename={!r},\n'
' lineno={!r},\n'
' function={!r},\n'
' code_context={!r},\n'
' index={!r},\n'
' positions={!r})'
).format(
fi.frame,
fi.filename,
fi.lineno,
fi.function,
fi.code_context,
fi.index,
fi.positions
)
def pfmt_callstack(frames: int = 1) -> str:
'''
Generate a string of nested `inspect.FrameInfo` objects returned
from a `inspect.stack()` call such that only the `.frame` field
for each layer is pprinted.
'''
caller_frames: list[FrameInfo] = stack()[1:1+frames]
frames_str: str = ''
for i, frame_info in enumerate(caller_frames):
frames_str += textwrap.indent(
f'{frame_info.frame!r}\n',
prefix=' '*i,
)
return frames_str

View File

@ -45,6 +45,8 @@ __all__ = ['pub']
log = get_logger('messaging')
# TODO! this needs to reworked to use the modern
# `Context`/`MsgStream` APIs!!
async def fan_out_to_ctxs(
pub_async_gen_func: typing.Callable, # it's an async gen ... gd mypy
topics2ctxs: dict[str, list],

View File

@ -0,0 +1,26 @@
# tractor: structured concurrent "actors".
# Copyright 2024-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
High level design patterns, APIs and runtime extensions built on top
of the `tractor` runtime core.
'''
from ._service import (
open_service_mngr as open_service_mngr,
get_service_mngr as get_service_mngr,
ServiceMngr as ServiceMngr,
)

View File

@ -0,0 +1,592 @@
# tractor: structured concurrent "actors".
# Copyright 2024-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Daemon subactor as service(s) management and supervision primitives
and API.
'''
from __future__ import annotations
from contextlib import (
asynccontextmanager as acm,
# contextmanager as cm,
)
from collections import defaultdict
from dataclasses import (
dataclass,
field,
)
import functools
import inspect
from typing import (
Callable,
Any,
)
import tractor
import trio
from trio import TaskStatus
from tractor import (
log,
ActorNursery,
current_actor,
ContextCancelled,
Context,
Portal,
)
log = log.get_logger('tractor')
# TODO: implement a `@singleton` deco-API for wrapping the below
# factory's impl for general actor-singleton use?
#
# -[ ] go through the options peeps on SO did?
# * https://stackoverflow.com/questions/6760685/what-is-the-best-way-of-implementing-singleton-in-python
# * including @mikenerone's answer
# |_https://stackoverflow.com/questions/6760685/what-is-the-best-way-of-implementing-singleton-in-python/39186313#39186313
#
# -[ ] put it in `tractor.lowlevel._globals` ?
# * fits with our oustanding actor-local/global feat req?
# |_ https://github.com/goodboy/tractor/issues/55
# * how can it relate to the `Actor.lifetime_stack` that was
# silently patched in?
# |_ we could implicitly call both of these in the same
# spot in the runtime using the lifetime stack?
# - `open_singleton_cm().__exit__()`
# -`del_singleton()`
# |_ gives SC fixtue semantics to sync code oriented around
# sub-process lifetime?
# * what about with `trio.RunVar`?
# |_https://trio.readthedocs.io/en/stable/reference-lowlevel.html#trio.lowlevel.RunVar
# - which we'll need for no-GIL cpython (right?) presuming
# multiple `trio.run()` calls in process?
#
#
# @singleton
# async def open_service_mngr(
# **init_kwargs,
# ) -> ServiceMngr:
# '''
# Note this function body is invoked IFF no existing singleton instance already
# exists in this proc's memory.
# '''
# # setup
# yield ServiceMngr(**init_kwargs)
# # teardown
# a deletion API for explicit instance de-allocation?
# @open_service_mngr.deleter
# def del_service_mngr() -> None:
# mngr = open_service_mngr._singleton[0]
# open_service_mngr._singleton[0] = None
# del mngr
# TODO: singleton factory API instead of a class API
@acm
async def open_service_mngr(
*,
debug_mode: bool = False,
# NOTE; since default values for keyword-args are effectively
# module-vars/globals as per the note from,
# https://docs.python.org/3/tutorial/controlflow.html#default-argument-values
#
# > "The default value is evaluated only once. This makes
# a difference when the default is a mutable object such as
# a list, dictionary, or instances of most classes"
#
_singleton: list[ServiceMngr|None] = [None],
**init_kwargs,
) -> ServiceMngr:
'''
Open an actor-global "service-manager" for supervising a tree
of subactors and/or actor-global tasks.
The delivered `ServiceMngr` is a singleton instance for each
actor-process, that is, allocated on first open and never
de-allocated unless explicitly deleted by a call to
`del_service_mngr()`.
'''
# TODO: factor this an allocation into
# a `._mngr.open_service_mngr()` and put in the
# once-n-only-once setup/`.__aenter__()` part!
# -[ ] how to make this only happen on the `mngr == None` case?
# |_ use `.trionics.maybe_open_context()` (for generic
# async-with-style-only-once of the factory impl, though
# what do we do for the allocation case?
# / `.maybe_open_nursery()` (since for this specific case
# it's simpler?) to activate
async with (
tractor.open_nursery() as an,
trio.open_nursery() as tn,
):
# impl specific obvi..
init_kwargs.update({
'an': an,
'tn': tn,
})
mngr: ServiceMngr|None
if (mngr := _singleton[0]) is None:
log.info('Allocating a new service mngr!')
mngr = _singleton[0] = ServiceMngr(**init_kwargs)
# TODO: put into `.__aenter__()` section of
# eventual `@singleton_acm` API wrapper.
#
# assign globally for future daemon/task creation
mngr.an = an
mngr.tn = tn
else:
assert (mngr.an and mngr.tn)
log.info(
'Using extant service mngr!\n\n'
f'{mngr!r}\n' # it has a nice `.__repr__()` of services state
)
try:
# NOTE: this is a singleton factory impl specific detail
# which should be supported in the condensed
# `@singleton_acm` API?
mngr.debug_mode = debug_mode
yield mngr
finally:
# TODO: is this more clever/efficient?
# if 'samplerd' in mngr.service_ctxs:
# await mngr.cancel_service('samplerd')
tn.cancel_scope.cancel()
def get_service_mngr() -> ServiceMngr:
'''
Try to get the singleton service-mngr for this actor presuming it
has already been allocated using,
.. code:: python
async with open_<@singleton_acm(func)>() as mngr:
... this block kept open ...
If not yet allocated raise a `ServiceError`.
'''
# https://stackoverflow.com/a/12627202
# https://docs.python.org/3/library/inspect.html#inspect.Signature
maybe_mngr: ServiceMngr|None = inspect.signature(
open_service_mngr
).parameters['_singleton'].default[0]
if maybe_mngr is None:
raise RuntimeError(
'Someone must allocate a `ServiceMngr` using\n\n'
'`async with open_service_mngr()` beforehand!!\n'
)
return maybe_mngr
async def _open_and_supervise_service_ctx(
serman: ServiceMngr,
name: str,
ctx_fn: Callable, # TODO, type for `@tractor.context` requirement
portal: Portal,
allow_overruns: bool = False,
task_status: TaskStatus[
tuple[
trio.CancelScope,
Context,
trio.Event,
Any,
]
] = trio.TASK_STATUS_IGNORED,
**ctx_kwargs,
) -> Any:
'''
Open a remote IPC-context defined by `ctx_fn` in the
(service) actor accessed via `portal` and supervise the
(local) parent task to termination at which point the remote
actor runtime is cancelled alongside it.
The main application is for allocating long-running
"sub-services" in a main daemon and explicitly controlling
their lifetimes from an actor-global singleton.
'''
# TODO: use the ctx._scope directly here instead?
# -[ ] actually what semantics do we expect for this
# usage!?
with trio.CancelScope() as cs:
try:
async with portal.open_context(
ctx_fn,
allow_overruns=allow_overruns,
**ctx_kwargs,
) as (ctx, started):
# unblock once the remote context has started
complete = trio.Event()
task_status.started((
cs,
ctx,
complete,
started,
))
log.info(
f'`pikerd` service {name} started with value {started}'
)
# wait on any context's return value
# and any final portal result from the
# sub-actor.
ctx_res: Any = await ctx.wait_for_result()
# NOTE: blocks indefinitely until cancelled
# either by error from the target context
# function or by being cancelled here by the
# surrounding cancel scope.
return (
await portal.wait_for_result(),
ctx_res,
)
except ContextCancelled as ctxe:
canceller: tuple[str, str] = ctxe.canceller
our_uid: tuple[str, str] = current_actor().uid
if (
canceller != portal.chan.uid
and
canceller != our_uid
):
log.cancel(
f'Actor-service `{name}` was remotely cancelled by a peer?\n'
# TODO: this would be a good spot to use
# a respawn feature Bo
f'-> Keeping `pikerd` service manager alive despite this inter-peer cancel\n\n'
f'cancellee: {portal.chan.uid}\n'
f'canceller: {canceller}\n'
)
else:
raise
finally:
# NOTE: the ctx MUST be cancelled first if we
# don't want the above `ctx.wait_for_result()` to
# raise a self-ctxc. WHY, well since from the ctx's
# perspective the cancel request will have
# arrived out-of-band at the `Actor.cancel()`
# level, thus `Context.cancel_called == False`,
# meaning `ctx._is_self_cancelled() == False`.
# with trio.CancelScope(shield=True):
# await ctx.cancel()
await portal.cancel_actor() # terminate (remote) sub-actor
complete.set() # signal caller this task is done
serman.service_ctxs.pop(name) # remove mngr entry
# TODO: we need remote wrapping and a general soln:
# - factor this into a ``tractor.highlevel`` extension pack for the
# library.
# - wrap a "remote api" wherein you can get a method proxy
# to the pikerd actor for starting services remotely!
# - prolly rename this to ActorServicesNursery since it spawns
# new actors and supervises them to completion?
@dataclass
class ServiceMngr:
'''
A multi-subactor-as-service manager.
Spawn, supervise and monitor service/daemon subactors in a SC
process tree.
'''
an: ActorNursery
tn: trio.Nursery
debug_mode: bool = False # tractor sub-actor debug mode flag
service_tasks: dict[
str,
tuple[
trio.CancelScope,
trio.Event,
]
] = field(default_factory=dict)
service_ctxs: dict[
str,
tuple[
trio.CancelScope,
Context,
Portal,
trio.Event,
]
] = field(default_factory=dict)
# internal per-service task mutexs
_locks = defaultdict(trio.Lock)
# TODO, unify this interface with our `TaskManager` PR!
#
#
async def start_service_task(
self,
name: str,
# TODO: typevar for the return type of the target and then
# use it below for `ctx_res`?
fn: Callable,
allow_overruns: bool = False,
**ctx_kwargs,
) -> tuple[
trio.CancelScope,
Any,
trio.Event,
]:
async def _task_manager_start(
task_status: TaskStatus[
tuple[
trio.CancelScope,
trio.Event,
]
] = trio.TASK_STATUS_IGNORED,
) -> Any:
task_cs = trio.CancelScope()
task_complete = trio.Event()
with task_cs as cs:
task_status.started((
cs,
task_complete,
))
try:
await fn()
except trio.Cancelled as taskc:
log.cancel(
f'Service task for `{name}` was cancelled!\n'
# TODO: this would be a good spot to use
# a respawn feature Bo
)
raise taskc
finally:
task_complete.set()
(
cs,
complete,
) = await self.tn.start(_task_manager_start)
# store the cancel scope and portal for later cancellation or
# restart if needed.
self.service_tasks[name] = (
cs,
complete,
)
return (
cs,
complete,
)
async def cancel_service_task(
self,
name: str,
) -> Any:
log.info(f'Cancelling `pikerd` service {name}')
cs, complete = self.service_tasks[name]
cs.cancel()
await complete.wait()
# TODO, if we use the `TaskMngr` from #346
# we can also get the return value from the task!
if name in self.service_tasks:
# TODO: custom err?
# raise ServiceError(
raise RuntimeError(
f'Service task {name!r} not terminated!?\n'
)
async def start_service_ctx(
self,
name: str,
portal: Portal,
# TODO: typevar for the return type of the target and then
# use it below for `ctx_res`?
ctx_fn: Callable,
**ctx_kwargs,
) -> tuple[
trio.CancelScope,
Context,
Any,
]:
'''
Start a remote IPC-context defined by `ctx_fn` in a background
task and immediately return supervision primitives to manage it:
- a `cs: CancelScope` for the newly allocated bg task
- the `ipc_ctx: Context` to manage the remotely scheduled
`trio.Task`.
- the `started: Any` value returned by the remote endpoint
task's `Context.started(<value>)` call.
The bg task supervises the ctx such that when it terminates the supporting
actor runtime is also cancelled, see `_open_and_supervise_service_ctx()`
for details.
'''
cs, ipc_ctx, complete, started = await self.tn.start(
functools.partial(
_open_and_supervise_service_ctx,
serman=self,
name=name,
ctx_fn=ctx_fn,
portal=portal,
**ctx_kwargs,
)
)
# store the cancel scope and portal for later cancellation or
# restart if needed.
self.service_ctxs[name] = (cs, ipc_ctx, portal, complete)
return (
cs,
ipc_ctx,
started,
)
async def start_service(
self,
daemon_name: str,
ctx_ep: Callable, # kwargs must `partial`-ed in!
# ^TODO, type for `@tractor.context` deco-ed funcs!
debug_mode: bool = False,
**start_actor_kwargs,
) -> Context:
'''
Start a new subactor and schedule a supervising "service task"
in it which explicitly defines the sub's lifetime.
"Service daemon subactors" are cancelled (and thus
terminated) using the paired `.cancel_service()`.
Effectively this API can be used to manage "service daemons"
spawned under a single parent actor with supervision
semantics equivalent to a one-cancels-one style actor-nursery
or "(subactor) task manager" where each subprocess's (and
thus its embedded actor runtime) lifetime is synced to that
of the remotely spawned task defined by `ctx_ep`.
The functionality can be likened to a "daemonized" version of
`.hilevel.worker.run_in_actor()` but with supervision
controls offered by `tractor.Context` where the main/root
remotely scheduled `trio.Task` invoking `ctx_ep` determines
the underlying subactor's lifetime.
'''
entry: tuple|None = self.service_ctxs.get(daemon_name)
if entry:
(cs, sub_ctx, portal, complete) = entry
return sub_ctx
if daemon_name not in self.service_ctxs:
portal: Portal = await self.an.start_actor(
daemon_name,
debug_mode=( # maybe set globally during allocate
debug_mode
or
self.debug_mode
),
**start_actor_kwargs,
)
ctx_kwargs: dict[str, Any] = {}
if isinstance(ctx_ep, functools.partial):
ctx_kwargs: dict[str, Any] = ctx_ep.keywords
ctx_ep: Callable = ctx_ep.func
(
cs,
sub_ctx,
started,
) = await self.start_service_ctx(
name=daemon_name,
portal=portal,
ctx_fn=ctx_ep,
**ctx_kwargs,
)
return sub_ctx
async def cancel_service(
self,
name: str,
) -> Any:
'''
Cancel the service task and actor for the given ``name``.
'''
log.info(f'Cancelling `pikerd` service {name}')
cs, sub_ctx, portal, complete = self.service_ctxs[name]
# cs.cancel()
await sub_ctx.cancel()
await complete.wait()
if name in self.service_ctxs:
# TODO: custom err?
# raise ServiceError(
raise RuntimeError(
f'Service actor for {name} not terminated and/or unknown?'
)
# assert name not in self.service_ctxs, \
# f'Service task for {name} not terminated?'
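# Usage sketch (not from this diff): open the per-actor service manager and
# spawn one "service daemon" subactor whose lifetime is tied to the remotely
# scheduled endpoint; the endpoint and daemon name are illustrative.
@tractor.context
async def _heartbeat(ctx: Context) -> None:
    await ctx.started()
    await trio.sleep_forever()


async def _service_mngr_usage_example() -> None:
    async with open_service_mngr(debug_mode=True) as mngr:
        await mngr.start_service(
            daemon_name='heartbeatd',
            ctx_ep=_heartbeat,
        )
        ...  # do work while the service runs
        await mngr.cancel_service('heartbeatd')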

View File

@ -0,0 +1,24 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
A modular IPC layer supporting the power of cross-process SC!
'''
from ._chan import (
_connect_chan as _connect_chan,
Channel as Channel
)

View File

@ -0,0 +1,503 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
"""
Inter-process comms abstractions
"""
from __future__ import annotations
from collections.abc import AsyncGenerator
from contextlib import (
asynccontextmanager as acm,
contextmanager as cm,
)
import platform
from pprint import pformat
import typing
from typing import (
Any,
TYPE_CHECKING,
)
import warnings
import trio
from ._types import (
transport_from_addr,
transport_from_stream,
)
from tractor._addr import (
is_wrapped_addr,
wrap_address,
Address,
UnwrappedAddress,
)
from tractor.log import get_logger
from tractor._exceptions import (
MsgTypeError,
pack_from_raise,
TransportClosed,
)
from tractor.msg import (
Aid,
MsgCodec,
)
if TYPE_CHECKING:
from ._transport import MsgTransport
log = get_logger(__name__)
_is_windows = platform.system() == 'Windows'
class Channel:
'''
An inter-process channel for communication between (remote) actors.
Wraps a ``MsgStream``: transport + encoding IPC connection.
Currently we only support ``trio.SocketStream`` for transport
(aka TCP) and the ``msgpack`` interchange format via the ``msgspec``
codec library.
'''
def __init__(
self,
transport: MsgTransport|None = None,
# TODO: optional reconnection support?
# auto_reconnect: bool = False,
# on_reconnect: typing.Callable[..., typing.Awaitable] = None,
) -> None:
# self._recon_seq = on_reconnect
# self._autorecon = auto_reconnect
# Either created in ``.connect()`` or passed in by
# user in ``.from_stream()``.
self._transport: MsgTransport|None = transport
# set after handshake - always info from peer end
self.aid: Aid|None = None
self._aiter_msgs = self._iter_msgs()
self._exc: Exception|None = None
# ^XXX! ONLY set if a remote actor sends an `Error`-msg
self._closed: bool = False
# flag set by `Portal.cancel_actor()` indicating remote
# (possibly peer) cancellation of the far end actor runtime.
self._cancel_called: bool = False
@property
def closed(self) -> bool:
'''
Was `.aclose()` successfully called?
'''
return self._closed
@property
def cancel_called(self) -> bool:
'''
Set when `Portal.cancel_actor()` is called on a portal which
wraps this IPC channel.
'''
return self._cancel_called
@property
def uid(self) -> tuple[str, str]:
'''
Peer actor's unique id.
'''
msg: str = (
f'`{type(self).__name__}.uid` is now deprecated.\n'
'Use the new `.aid: tractor.msg.Aid` (struct) instead '
'which also provides additional named (optional) fields '
'beyond just the `.name` and `.uuid`.'
)
warnings.warn(
msg,
DeprecationWarning,
stacklevel=2,
)
peer_aid: Aid = self.aid
return (
peer_aid.name,
peer_aid.uuid,
)
@property
def stream(self) -> trio.abc.Stream | None:
return self._transport.stream if self._transport else None
@property
def msgstream(self) -> MsgTransport:
log.info(
'`Channel.msgstream` is an old name, use `._transport`'
)
return self._transport
@property
def transport(self) -> MsgTransport:
return self._transport
@classmethod
def from_stream(
cls,
stream: trio.abc.Stream,
) -> Channel:
transport_cls = transport_from_stream(stream)
return Channel(
transport=transport_cls(stream)
)
@classmethod
async def from_addr(
cls,
addr: UnwrappedAddress,
**kwargs
) -> Channel:
if not is_wrapped_addr(addr):
addr: Address = wrap_address(addr)
transport_cls = transport_from_addr(addr)
transport = await transport_cls.connect_to(
addr,
**kwargs,
)
# XXX, for UDS *no!* since we recv the peer-pid and build out
# a new addr..
# assert transport.raddr == addr
chan = Channel(transport=transport)
# ?TODO, compact this into adapter level-methods?
# -[ ] would avoid extra repr-calcs if level not active?
# |_ how would the `calc_if_level` look though? func?
if log.at_least_level('runtime'):
from tractor.devx import (
pformat as _pformat,
)
chan_repr: str = _pformat.nest_from_op(
input_op='[>',
text=chan.pformat(),
nest_indent=1,
)
log.runtime(
f'Connected channel IPC transport\n'
f'{chan_repr}'
)
return chan
@cm
def apply_codec(
self,
codec: MsgCodec,
) -> None:
'''
Temporarily override the underlying IPC msg codec for
dynamic enforcement of messaging schema.
'''
orig: MsgCodec = self._transport.codec
try:
self._transport.codec = codec
yield
finally:
self._transport.codec = orig
# TODO: do a .src/.dst: str for maddrs?
def pformat(
self,
privates: bool = False,
) -> str:
if not self._transport:
return '<Channel( with inactive transport? )>'
tpt: MsgTransport = self._transport
tpt_name: str = type(tpt).__name__
tpt_status: str = (
'connected' if self.connected()
else 'closed'
)
repr_str: str = (
f'<Channel(\n'
f' |_status: {tpt_status!r}\n'
) + (
f' _closed={self._closed}\n'
f' _cancel_called={self._cancel_called}\n'
if privates else ''
) + ( # peer-actor (process) section
f' |_peer: {self.aid.reprol()!r}\n'
if self.aid else ' |_peer: <unknown>\n'
) + (
f' |_msgstream: {tpt_name}\n'
f' maddr: {tpt.maddr!r}\n'
f' proto: {tpt.laddr.proto_key!r}\n'
f' layer: {tpt.layer_key!r}\n'
f' codec: {tpt.codec_key!r}\n'
f' .laddr={tpt.laddr}\n'
f' .raddr={tpt.raddr}\n'
) + (
f' ._transport.stream={tpt.stream}\n'
f' ._transport.drained={tpt.drained}\n'
if privates else ''
) + (
f' _send_lock={tpt._send_lock.statistics()}\n'
if privates else ''
) + (
')>\n'
)
return repr_str
# NOTE: making this return a value that can be passed to
# `eval()` is entirely **optional** FYI!
# https://docs.python.org/3/library/functions.html#repr
# https://docs.python.org/3/reference/datamodel.html#object.__repr__
#
# Currently we target **readability** from a (console)
# logging perspective over `eval()`-ability since we do NOT
# target serializing non-struct instances!
# def __repr__(self) -> str:
__str__ = pformat
__repr__ = pformat
@property
def laddr(self) -> Address|None:
return self._transport.laddr if self._transport else None
@property
def raddr(self) -> Address|None:
return self._transport.raddr if self._transport else None
@property
def maddr(self) -> str:
return self._transport.maddr if self._transport else '<no-tpt>'
# TODO: something like,
# `pdbp.hideframe_on(errors=[MsgTypeError])`
# instead of the `try/except` hack we have rn..
# seems like a pretty useful thing to have in general
# along with being able to filter certain stack frame(s / sets)
# possibly based on the current log-level?
async def send(
self,
payload: Any,
hide_tb: bool = False,
) -> None:
'''
Send a coded msg-blob over the transport.
'''
__tracebackhide__: bool = hide_tb
try:
log.transport(
'=> send IPC msg:\n\n'
f'{pformat(payload)}\n'
)
# assert self._transport # but why typing?
await self._transport.send(
payload,
hide_tb=hide_tb,
)
except (
BaseException,
MsgTypeError,
TransportClosed,
) as _err:
err = _err # bind for introspection
match err:
case MsgTypeError():
try:
assert err.cid
except KeyError:
raise err
case TransportClosed():
log.transport(
f'Transport stream closed due to\n'
f'{err.repr_src_exc()}\n'
)
case _:
# never suppress non-tpt sources
__tracebackhide__: bool = False
raise
async def recv(self) -> Any:
assert self._transport
return await self._transport.recv()
# TODO: auto-reconnect features like 0mq/nanomsg?
# -[ ] implement it manually with nods to SC prot
# possibly on multiple transport backends?
# -> seems like that might be re-inventing scalability
# prots tho no?
# try:
# return await self._transport.recv()
# except trio.BrokenResourceError:
# if self._autorecon:
# await self._reconnect()
# return await self.recv()
# raise
async def aclose(self) -> None:
log.transport(
f'Closing channel to {self.aid} '
f'{self.laddr} -> {self.raddr}'
)
assert self._transport
await self._transport.stream.aclose()
self._closed = True
async def __aenter__(self):
await self.connect()
return self
async def __aexit__(self, *args):
await self.aclose(*args)
def __aiter__(self):
return self._aiter_msgs
# ?TODO? run any reconnection sequence?
# -[ ] prolly should be impl-ed as deco-API?
#
# async def _reconnect(self) -> None:
# """Handle connection failures by polling until a reconnect can be
# established.
# """
# down = False
# while True:
# try:
# with trio.move_on_after(3) as cancel_scope:
# await self.connect()
# cancelled = cancel_scope.cancelled_caught
# if cancelled:
# log.transport(
# "Reconnect timed out after 3 seconds, retrying...")
# continue
# else:
# log.transport("Stream connection re-established!")
# # on_recon = self._recon_seq
# # if on_recon:
# # await on_recon(self)
# break
# except (OSError, ConnectionRefusedError):
# if not down:
# down = True
# log.transport(
# f"Connection to {self.raddr} went down, waiting"
# " for re-establishment")
# await trio.sleep(1)
async def _iter_msgs(
self
) -> AsyncGenerator[Any, None]:
'''
Yield `MsgType` IPC msgs decoded and delivered from
an underlying `MsgTransport` protocol.
This is a streaming routine also implemented as an async-gen
func (same as `MsgTransport._iter_pkts()`); its generator instance
gets allocated by a call inside `.__init__()` where it is assigned
to the `._aiter_msgs` attr.
'''
assert self._transport
while True:
try:
async for msg in self._transport:
match msg:
# NOTE: if transport/interchange delivers
# a type error, we pack it with the far
# end peer `Actor.uid` and relay the
# `Error`-msg upward to the `._rpc` stack
# for normal RAE handling.
case MsgTypeError():
yield pack_from_raise(
local_err=msg,
cid=msg.cid,
# XXX we pack it here bc lower
# layers have no notion of an
# actor-id ;)
src_uid=self.uid,
)
case _:
yield msg
except trio.BrokenResourceError:
# if not self._autorecon:
raise
await self.aclose()
# if self._autorecon: # attempt reconnect
# await self._reconnect()
# continue
def connected(self) -> bool:
return self._transport.connected() if self._transport else False
async def _do_handshake(
self,
aid: Aid,
) -> Aid:
'''
Exchange `(name, UUIDs)` identifiers as the first
communication step with any (peer) remote `Actor`.
These are essentially the "mailbox addresses" found in
"actor model" parlance.
'''
await self.send(aid)
peer_aid: Aid = await self.recv()
log.runtime(
f'Received handshake with peer\n'
f'<= {peer_aid.reprol(sin_uuid=False)}\n'
)
# NOTE, we always are referencing the remote peer!
self.aid = peer_aid
return peer_aid
@acm
async def _connect_chan(
addr: UnwrappedAddress
) -> typing.AsyncGenerator[Channel, None]:
'''
Create and connect a channel with disconnect on context manager
teardown.
'''
chan = await Channel.from_addr(addr)
yield chan
with trio.CancelScope(shield=True):
await chan.aclose()
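# Minimal usage sketch for the `@acm` above, assuming a peer is already
# listening at `addr`; the probe function name is illustrative only.
async def _probe_peer(addr: UnwrappedAddress) -> bool:
    async with _connect_chan(addr) as chan:
        return chan.connected()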


@ -0,0 +1,163 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
File-descriptor-sharing on `linux` by "wilhelm_of_bohemia".
'''
from __future__ import annotations
import os
import array
import socket
import tempfile
from pathlib import Path
from contextlib import ExitStack
import trio
import tractor
from tractor.ipc import RBToken
actor_name = 'ringd'
_rings: dict[str, dict] = {}
async def _attach_to_ring(
ring_name: str
) -> tuple[int, int, int]:
actor = tractor.current_actor()
fd_amount = 3
sock_path = (
Path(tempfile.gettempdir())
/
f'{os.getpid()}-pass-ring-fds-{ring_name}-to-{actor.name}.sock'
)
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.bind(sock_path)
sock.listen(1)
async with (
tractor.find_actor(actor_name) as ringd,
ringd.open_context(
_pass_fds,
name=ring_name,
sock_path=sock_path
) as (ctx, _sent)
):
# prepare array to receive FD
fds = array.array("i", [0] * fd_amount)
conn, _ = sock.accept()
# receive FD
msg, ancdata, flags, addr = conn.recvmsg(
1024,
socket.CMSG_LEN(fds.itemsize * fd_amount)
)
for (
cmsg_level,
cmsg_type,
cmsg_data,
) in ancdata:
if (
cmsg_level == socket.SOL_SOCKET
and
cmsg_type == socket.SCM_RIGHTS
):
fds.frombytes(cmsg_data[:fds.itemsize * fd_amount])
break
else:
raise RuntimeError("Receiver: No FDs received")
conn.close()
sock.close()
sock_path.unlink()
return RBToken.from_msg(
await ctx.wait_for_result()
)
@tractor.context
async def _pass_fds(
ctx: tractor.Context,
name: str,
sock_path: str
) -> RBToken:
global _rings
token = _rings[name]
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
await ctx.started()
fds = array.array('i', token.fds)
client.sendmsg([b'FDs'], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])
client.close()
return token
@tractor.context
async def _open_ringbuf(
ctx: tractor.Context,
name: str,
buf_size: int
) -> RBToken:
global _rings
is_owner = False
if name not in _rings:
stack = ExitStack()
token = stack.enter_context(
tractor.open_ringbuf(
name,
buf_size=buf_size
)
)
_rings[name] = {
'token': token,
'stack': stack,
}
is_owner = True
ring = _rings[name]
await ctx.started()
try:
await trio.sleep_forever()
except tractor.ContextCancelled:
...
finally:
if is_owner:
ring['stack'].close()
async def open_ringbuf(
name: str,
buf_size: int
) -> RBToken:
async with (
tractor.find_actor(actor_name) as ringd,
ringd.open_context(
_open_ringbuf,
name=name,
buf_size=buf_size
) as (rd_ctx, _)
):
yield await _attach_to_ring(name)
await rd_ctx.cancel()
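# Hedged sketch of the same SCM_RIGHTS fd-passing pattern using only the
# stdlib (no actor machinery); the socket path and fd values are
# illustrative.
import array
import socket

def send_fds_over_uds(sock_path: str, fds: list[int]) -> None:
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        buf = array.array('i', fds)
        client.sendmsg(
            [b'FDs'],
            [(socket.SOL_SOCKET, socket.SCM_RIGHTS, buf)],
        )

def recv_fds_over_uds(conn: socket.socket, fd_amount: int) -> list[int]:
    fds = array.array('i')
    _msg, ancdata, _flags, _addr = conn.recvmsg(
        1024,
        socket.CMSG_LEN(fds.itemsize * fd_amount),
    )
    for cmsg_level, cmsg_type, cmsg_data in ancdata:
        if (
            cmsg_level == socket.SOL_SOCKET
            and cmsg_type == socket.SCM_RIGHTS
        ):
            fds.frombytes(cmsg_data[:fds.itemsize * fd_amount])
    return list(fds)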


@ -0,0 +1,153 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Linux specifics, for now we are only exposing EventFD
'''
import os
import errno
import cffi
import trio
ffi = cffi.FFI()
# Declare the C functions and types we plan to use.
# - eventfd: for creating the event file descriptor
# - write: for writing to the file descriptor
# - read: for reading from the file descriptor
# - close: for closing the file descriptor
ffi.cdef(
'''
int eventfd(unsigned int initval, int flags);
ssize_t write(int fd, const void *buf, size_t count);
ssize_t read(int fd, void *buf, size_t count);
int close(int fd);
'''
)
# Open the default dynamic library (essentially 'libc' in most cases)
C = ffi.dlopen(None)
# Constants from <sys/eventfd.h>, if needed.
EFD_SEMAPHORE = 1
EFD_CLOEXEC = 0o2000000
EFD_NONBLOCK = 0o4000
def open_eventfd(initval: int = 0, flags: int = 0) -> int:
'''
Open an eventfd with the given initial value and flags.
Returns the file descriptor on success, otherwise raises OSError.
'''
fd = C.eventfd(initval, flags)
if fd < 0:
raise OSError(errno.errorcode[ffi.errno], 'eventfd failed')
return fd
def write_eventfd(fd: int, value: int) -> int:
'''
Write a 64-bit integer (uint64_t) to the eventfd's counter.
'''
# Create a uint64_t* in C, store `value`
data_ptr = ffi.new('uint64_t *', value)
# Call write(fd, data_ptr, 8)
# We expect to write exactly 8 bytes (sizeof(uint64_t))
ret = C.write(fd, data_ptr, 8)
if ret < 0:
raise OSError(errno.errorcode[ffi.errno], 'write to eventfd failed')
return ret
def read_eventfd(fd: int) -> int:
'''
Read a 64-bit integer (uint64_t) from the eventfd, returning the value.
Reading resets the counter to 0 (unless using EFD_SEMAPHORE).
'''
# Allocate an 8-byte buffer in C for reading
buf = ffi.new('char[]', 8)
ret = C.read(fd, buf, 8)
if ret < 0:
raise OSError(errno.errorcode[ffi.errno], 'read from eventfd failed')
# Convert the 8 bytes we read into a Python integer
data_bytes = ffi.unpack(buf, 8) # returns a Python bytes object of length 8
value = int.from_bytes(data_bytes, byteorder='little', signed=False)
return value
def close_eventfd(fd: int) -> int:
'''
Close the eventfd.
'''
ret = C.close(fd)
if ret < 0:
raise OSError(errno.errorcode[ffi.errno], 'close failed')
return ret
class EventFD:
'''
Use a previously opened eventfd(2), meant to be used in
sub-actors after root actor opens the eventfds then passes
them through pass_fds
'''
def __init__(
self,
fd: int,
omode: str
):
self._fd: int = fd
self._omode: str = omode
self._fobj = None
@property
def fd(self) -> int | None:
return self._fd
def write(self, value: int) -> int:
return write_eventfd(self._fd, value)
async def read(self) -> int:
return await trio.to_thread.run_sync(
read_eventfd, self._fd,
abandon_on_cancel=True
)
def open(self):
self._fobj = os.fdopen(self._fd, self._omode)
def close(self):
if self._fobj:
self._fobj.close()
def __enter__(self):
self.open()
return self
def __exit__(self, exc_type, exc_value, traceback):
self.close()
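# Minimal usage sketch for the helpers above (Linux only); the counter
# value is illustrative and error handling is elided.
async def _eventfd_roundtrip() -> int:
    fd = open_eventfd(initval=0, flags=0)
    with EventFD(fd, 'r') as ev:
        ev.write(3)             # bump the counter by 3
        return await ev.read()  # read (and reset) the counter -> 3

# trio.run(_eventfd_roundtrip)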


@ -0,0 +1,75 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Utils to tame mp non-SC madness
'''
import platform
def disable_mantracker():
'''
Disable all `multiprocessing` "resource tracking" machinery since
it's an absolute multi-threaded mess of non-SC madness.
'''
from multiprocessing.shared_memory import SharedMemory
# 3.13+ only.. can pass `track=False` to disable
# all the resource tracker bs.
# https://docs.python.org/3/library/multiprocessing.shared_memory.html
if (_py_313 := (
platform.python_version_tuple()[:-1]
>=
('3', '13')
)
):
from functools import partial
return partial(
SharedMemory,
track=False,
)
# !TODO, once we drop 3.12- we can obvi remove all this!
else:
from multiprocessing import (
resource_tracker as mantracker,
)
# Tell the "resource tracker" thing to fuck off.
class ManTracker(mantracker.ResourceTracker):
def register(self, name, rtype):
pass
def unregister(self, name, rtype):
pass
def ensure_running(self):
pass
# "know your land and know your prey"
# https://www.dailymotion.com/video/x6ozzco
mantracker._resource_tracker = ManTracker()
mantracker.register = mantracker._resource_tracker.register
mantracker.ensure_running = mantracker._resource_tracker.ensure_running
mantracker.unregister = mantracker._resource_tracker.unregister
mantracker.getfd = mantracker._resource_tracker.getfd
# use std type verbatim
shmT = SharedMemory
return shmT
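# Minimal call-site sketch: rebind the returned type so later allocations
# skip the resource tracker (the shm module further below follows the same
# pattern); the size and payload are illustrative.
SharedMemory = disable_mantracker()
shm = SharedMemory(create=True, size=4096)
try:
    shm.buf[:4] = b'ping'
finally:
    shm.close()
    shm.unlink()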


@ -0,0 +1,253 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
IPC Reliable RingBuffer implementation
'''
from __future__ import annotations
from contextlib import contextmanager as cm
from multiprocessing.shared_memory import SharedMemory
import trio
from msgspec import (
Struct,
to_builtins
)
from ._linux import (
EFD_NONBLOCK,
open_eventfd,
EventFD
)
from ._mp_bs import disable_mantracker
disable_mantracker()
class RBToken(Struct, frozen=True):
'''
RingBuffer token contains necessary info to open the two
eventfds and the shared memory
'''
shm_name: str
write_eventfd: int
wrap_eventfd: int
buf_size: int
def as_msg(self):
return to_builtins(self)
@classmethod
def from_msg(cls, msg: dict) -> RBToken:
if isinstance(msg, RBToken):
return msg
return RBToken(**msg)
@cm
def open_ringbuf(
shm_name: str,
buf_size: int = 10 * 1024,
write_efd_flags: int = 0,
wrap_efd_flags: int = 0
) -> RBToken:
shm = SharedMemory(
name=shm_name,
size=buf_size,
create=True
)
try:
token = RBToken(
shm_name=shm_name,
write_eventfd=open_eventfd(flags=write_efd_flags),
wrap_eventfd=open_eventfd(flags=wrap_efd_flags),
buf_size=buf_size
)
yield token
finally:
shm.unlink()
class RingBuffSender(trio.abc.SendStream):
'''
IPC Reliable Ring Buffer sender side implementation
`eventfd(2)` is used for wrap around sync, and also to signal
writes to the reader.
'''
def __init__(
self,
token: RBToken,
start_ptr: int = 0,
):
token = RBToken.from_msg(token)
self._shm = SharedMemory(
name=token.shm_name,
size=token.buf_size,
create=False
)
self._write_event = EventFD(token.write_eventfd, 'w')
self._wrap_event = EventFD(token.wrap_eventfd, 'r')
self._ptr = start_ptr
@property
def key(self) -> str:
return self._shm.name
@property
def size(self) -> int:
return self._shm.size
@property
def ptr(self) -> int:
return self._ptr
@property
def write_fd(self) -> int:
return self._write_event.fd
@property
def wrap_fd(self) -> int:
return self._wrap_event.fd
async def send_all(self, data: bytes | bytearray | memoryview):
# while data is larger than the remaining buf
target_ptr = self.ptr + len(data)
while target_ptr > self.size:
# write all bytes that fit
remaining = self.size - self.ptr
self._shm.buf[self.ptr:] = data[:remaining]
# signal write and wait for reader wrap around
self._write_event.write(remaining)
await self._wrap_event.read()
# wrap around and trim already written bytes
self._ptr = 0
data = data[remaining:]
target_ptr = self._ptr + len(data)
# remaining data fits on buffer
self._shm.buf[self.ptr:target_ptr] = data
self._write_event.write(len(data))
self._ptr = target_ptr
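# Worked example of the wrap-around arithmetic above (numbers are
# illustrative): with `size == 16`, `ptr == 12` and `len(data) == 10`,
# `target_ptr == 22 > 16`, so the first 4 bytes land in `buf[12:16]`,
# the reader is signalled with 4, we wait on the wrap event, reset
# `ptr` to 0, and the remaining 6 bytes land in `buf[0:6]`, leaving
# `ptr == 6`.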
async def wait_send_all_might_not_block(self):
raise NotImplementedError
async def aclose(self):
self._write_event.close()
self._wrap_event.close()
self._shm.close()
async def __aenter__(self):
self._write_event.open()
self._wrap_event.open()
return self
class RingBuffReceiver(trio.abc.ReceiveStream):
'''
IPC Reliable Ring Buffer receiver side implementation
`eventfd(2)` is used for wrap around sync, and also to signal
writes to the reader.
'''
def __init__(
self,
token: RBToken,
start_ptr: int = 0,
flags: int = 0
):
token = RBToken.from_msg(token)
self._shm = SharedMemory(
name=token.shm_name,
size=token.buf_size,
create=False
)
self._write_event = EventFD(token.write_eventfd, 'w')
self._wrap_event = EventFD(token.wrap_eventfd, 'r')
self._ptr = start_ptr
self._flags = flags
@property
def key(self) -> str:
return self._shm.name
@property
def size(self) -> int:
return self._shm.size
@property
def ptr(self) -> int:
return self._ptr
@property
def write_fd(self) -> int:
return self._write_event.fd
@property
def wrap_fd(self) -> int:
return self._wrap_event.fd
async def receive_some(
self,
max_bytes: int | None = None,
nb_timeout: float = 0.1
) -> memoryview:
# if non blocking eventfd enabled, do polling
# until next write, this allows signal handling
if self._flags & EFD_NONBLOCK:
delta = None
while delta is None:
try:
delta = await self._write_event.read()
except OSError as e:
if e.errno == 'EAGAIN':
continue
raise e
else:
delta = await self._write_event.read()
# fetch next segment and advance ptr
next_ptr = self._ptr + delta
segment = self._shm.buf[self._ptr:next_ptr]
self._ptr = next_ptr
if self.ptr == self.size:
# reached the end, signal wrap around
self._ptr = 0
self._wrap_event.write(1)
return segment
async def aclose(self):
self._write_event.close()
self._wrap_event.close()
self._shm.close()
async def __aenter__(self):
self._write_event.open()
self._wrap_event.open()
return self
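# End-to-end sketch wiring both halves in a single process; normally the
# token travels to another actor via the fd-passing helpers above, which
# is emulated here by `os.dup()`-ing the eventfds for the receiver side.
# The ring name, buffer size and payload are illustrative.
import os

async def _ringbuf_roundtrip() -> bytes:
    with open_ringbuf('demo-ring', buf_size=64) as token:
        rx_token = RBToken(
            shm_name=token.shm_name,
            write_eventfd=os.dup(token.write_eventfd),
            wrap_eventfd=os.dup(token.wrap_eventfd),
            buf_size=token.buf_size,
        )
        async with (
            RingBuffSender(token) as tx,
            RingBuffReceiver(rx_token) as rx,
        ):
            await tx.send_all(b'hello ring')
            return bytes(await rx.receive_some())

# trio.run(_ringbuf_roundtrip)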

File diff suppressed because it is too large


@ -23,19 +23,24 @@ considered optional within the context of this runtime-library.
"""
from __future__ import annotations
from multiprocessing import shared_memory as shm
from multiprocessing.shared_memory import (
# SharedMemory,
ShareableList,
)
import platform
from sys import byteorder
import time
from typing import Optional
from multiprocessing import shared_memory as shm
from multiprocessing.shared_memory import (
SharedMemory,
ShareableList,
)
from msgspec import Struct
from msgspec import (
Struct,
to_builtins
)
import tractor
from .log import get_logger
from tractor.ipc._mp_bs import disable_mantracker
from tractor.log import get_logger
_USE_POSIX = getattr(shm, '_USE_POSIX', False)
@ -46,7 +51,10 @@ if _USE_POSIX:
try:
import numpy as np
from numpy.lib import recfunctions as rfn
import nptyping
# TODO ruff complains with,
# warning| F401: `nptyping` imported but unused; consider using
# `importlib.util.find_spec` to test for availability
import nptyping # noqa
except ImportError:
pass
@ -54,35 +62,7 @@ except ImportError:
log = get_logger(__name__)
def disable_mantracker():
'''
Disable all ``multiprocessing``` "resource tracking" machinery since
it's an absolute multi-threaded mess of non-SC madness.
'''
from multiprocessing import resource_tracker as mantracker
# Tell the "resource tracker" thing to fuck off.
class ManTracker(mantracker.ResourceTracker):
def register(self, name, rtype):
pass
def unregister(self, name, rtype):
pass
def ensure_running(self):
pass
# "know your land and know your prey"
# https://www.dailymotion.com/video/x6ozzco
mantracker._resource_tracker = ManTracker()
mantracker.register = mantracker._resource_tracker.register
mantracker.ensure_running = mantracker._resource_tracker.ensure_running
mantracker.unregister = mantracker._resource_tracker.unregister
mantracker.getfd = mantracker._resource_tracker.getfd
disable_mantracker()
SharedMemory = disable_mantracker()
class SharedInt:
@ -142,7 +122,7 @@ class NDToken(Struct, frozen=True):
).descr
def as_msg(self):
return self.to_dict()
return to_builtins(self)
@classmethod
def from_msg(cls, msg: dict) -> NDToken:
@ -810,11 +790,23 @@ def open_shm_list(
readonly=readonly,
)
# TODO, factor into a @actor_fixture acm-API?
# -[ ] also `@maybe_actor_fixture()` which inludes
# the .current_actor() convenience check?
# |_ orr can that just be in the sin-maybe-version?
#
# "close" attached shm on actor teardown
try:
actor = tractor.current_actor()
actor.lifetime_stack.callback(shml.shm.close)
actor.lifetime_stack.callback(shml.shm.unlink)
# XXX on 3.13+ we don't need to call this?
# -> bc we pass `track=False` for `SharedMemeory` orr?
if (
platform.python_version_tuple()[:-1] < ('3', '13')
):
actor.lifetime_stack.callback(shml.shm.unlink)
except RuntimeError:
log.warning('tractor runtime not active, skipping teardown steps')

254
tractor/ipc/_tcp.py 100644

@ -0,0 +1,254 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
TCP implementation of tractor.ipc._transport.MsgTransport protocol
'''
from __future__ import annotations
import ipaddress
from typing import (
ClassVar,
)
# from contextlib import (
# asynccontextmanager as acm,
# )
import msgspec
import trio
from trio import (
SocketListener,
open_tcp_listeners,
)
from tractor.msg import MsgCodec
from tractor.log import get_logger
from tractor.ipc._transport import (
MsgTransport,
MsgpackTransport,
)
log = get_logger(__name__)
class TCPAddress(
msgspec.Struct,
frozen=True,
):
_host: str
_port: int
def __post_init__(self):
try:
ipaddress.ip_address(self._host)
except ValueError as valerr:
raise ValueError(
f'Invalid {type(self).__name__}._host = {self._host!r}\n'
) from valerr
proto_key: ClassVar[str] = 'tcp'
unwrapped_type: ClassVar[type] = tuple[str, int]
def_bindspace: ClassVar[str] = '127.0.0.1'
# ?TODO, actually validate ipv4/6 with stdlib's `ipaddress`
@property
def is_valid(self) -> bool:
'''
Predicate to ensure a valid socket-address pair.
'''
return (
self._port != 0
and
(ipaddr := ipaddress.ip_address(self._host))
and not (
ipaddr.is_reserved
or
ipaddr.is_unspecified
or
ipaddr.is_link_local
or
ipaddr.is_multicast
or
ipaddr.is_global
)
)
# ^XXX^ see various properties of invalid addrs here,
# https://docs.python.org/3/library/ipaddress.html#ipaddress.IPv4Address
@property
def bindspace(self) -> str:
return self._host
@property
def domain(self) -> str:
return self._host
@classmethod
def from_addr(
cls,
addr: tuple[str, int]
) -> TCPAddress:
match addr:
case (str(), int()):
return TCPAddress(addr[0], addr[1])
case _:
raise ValueError(
f'Invalid unwrapped address for {cls}\n'
f'{addr}\n'
)
def unwrap(self) -> tuple[str, int]:
return (
self._host,
self._port,
)
@classmethod
def get_random(
cls,
bindspace: str = def_bindspace,
) -> TCPAddress:
return TCPAddress(bindspace, 0)
@classmethod
def get_root(cls) -> TCPAddress:
return TCPAddress(
'127.0.0.1',
1616,
)
def __repr__(self) -> str:
return (
f'{type(self).__name__}[{self.unwrap()}]'
)
@classmethod
def get_transport(
cls,
codec: str = 'msgpack',
) -> MsgTransport:
match codec:
case 'msgpack':
return MsgpackTCPStream
case _:
raise ValueError(
f'No IPC transport with {codec!r} supported !'
)
async def start_listener(
addr: TCPAddress,
**kwargs,
) -> SocketListener:
'''
Start a TCP socket listener on the given `TCPAddress`.
'''
log.runtime(
f'Trying socket bind\n'
f'>[ {addr}\n'
)
# ?TODO, maybe we should just change the lower-level call this is
# using internally per-listener?
listeners: list[SocketListener] = await open_tcp_listeners(
host=addr._host,
port=addr._port,
**kwargs
)
# NOTE, for now we don't expect non-singleton-resolving
# domain-addresses/multi-homed-hosts.
# (though it is supported by `open_tcp_listeners()`)
assert len(listeners) == 1
listener = listeners[0]
host, port = listener.socket.getsockname()[:2]
bound_addr: TCPAddress = type(addr).from_addr((host, port))
log.info(
f'Listening on TCP socket\n'
f'[> {bound_addr}\n'
)
return listener
# TODO: typing oddity.. not sure why we have to inherit here, but it
# seems to be an issue with `get_msg_transport()` returning
# a `Type[Protocol]`; probably should make a `mypy` issue?
class MsgpackTCPStream(MsgpackTransport):
'''
A ``trio.SocketStream`` delivering ``msgpack`` formatted data
using the ``msgspec`` codec lib.
'''
address_type = TCPAddress
layer_key: int = 4
@property
def maddr(self) -> str:
host, port = self.raddr.unwrap()
return (
# TODO, use `ipaddress` from stdlib to handle
# first detecting which of `ipv4/6` before
# choosing the routing prefix part.
f'/ipv4/{host}'
f'/{self.address_type.proto_key}/{port}'
# f'/{self.chan.uid[0]}'
# f'/{self.cid}'
# f'/cid={cid_head}..{cid_tail}'
# TODO: ? not use this ^ right ?
)
def connected(self) -> bool:
return self.stream.socket.fileno() != -1
@classmethod
async def connect_to(
cls,
destaddr: TCPAddress,
prefix_size: int = 4,
codec: MsgCodec|None = None,
**kwargs
) -> MsgpackTCPStream:
stream = await trio.open_tcp_stream(
*destaddr.unwrap(),
**kwargs
)
return MsgpackTCPStream(
stream,
prefix_size=prefix_size,
codec=codec
)
@classmethod
def get_stream_addrs(
cls,
stream: trio.SocketStream
) -> tuple[
TCPAddress,
TCPAddress,
]:
# TODO, what types are these?
lsockname = stream.socket.getsockname()
l_sockaddr: tuple[str, int] = tuple(lsockname[:2])
rsockname = stream.socket.getpeername()
r_sockaddr: tuple[str, int] = tuple(rsockname[:2])
return (
TCPAddress.from_addr(l_sockaddr),
TCPAddress.from_addr(r_sockaddr),
)
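# Minimal sketch: bind a listener from a wrapped address (port 0 lets the
# OS pick) then dial it back with the msgpack transport type; the helper
# name is illustrative only.
async def _tcp_bind_and_dial() -> None:
    lstnr = await start_listener(TCPAddress.from_addr(('127.0.0.1', 0)))
    bound = TCPAddress.from_addr(lstnr.socket.getsockname()[:2])
    tpt = await MsgpackTCPStream.connect_to(bound)
    assert tpt.connected()
    await tpt.stream.aclose()
    await lstnr.aclose()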


@ -0,0 +1,536 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
typing.Protocol based generic msg API, implement this class to add
backends for tractor.ipc.Channel
'''
from __future__ import annotations
from typing import (
runtime_checkable,
Type,
Protocol,
# TypeVar,
ClassVar,
TYPE_CHECKING,
)
from collections.abc import (
AsyncGenerator,
AsyncIterator,
)
import struct
import trio
import msgspec
from tricycle import BufferedReceiveStream
from tractor.log import get_logger
from tractor._exceptions import (
MsgTypeError,
TransportClosed,
_mk_send_mte,
_mk_recv_mte,
)
from tractor.msg import (
_ctxvar_MsgCodec,
# _codec, XXX see `self._codec` sanity/debug checks
MsgCodec,
MsgType,
types as msgtypes,
pretty_struct,
)
if TYPE_CHECKING:
from tractor._addr import Address
log = get_logger(__name__)
# (codec, transport)
MsgTransportKey = tuple[str, str]
# from tractor.msg.types import MsgType
# ?TODO? this should be our `Union[*msgtypes.__spec__]` alias now right..?
# => BLEH, except can't bc prots must inherit typevar or param-spec
# vars..
# MsgType = TypeVar('MsgType')
@runtime_checkable
class MsgTransport(Protocol):
#
# class MsgTransport(Protocol[MsgType]):
# ^-TODO-^ consider using a generic def and indexing with our
# eventual msg definition/types?
# - https://docs.python.org/3/library/typing.html#typing.Protocol
stream: trio.SocketStream
drained: list[MsgType]
address_type: ClassVar[Type[Address]]
codec_key: ClassVar[str]
# XXX: should this instead be called `.sendall()`?
async def send(self, msg: MsgType) -> None:
...
async def recv(self) -> MsgType:
...
def __aiter__(self) -> MsgType:
...
def connected(self) -> bool:
...
# defining this sync otherwise it causes a mypy error because it
# can't figure out it's a generator i guess?..?
def drain(self) -> AsyncIterator[dict]:
...
@classmethod
def key(cls) -> MsgTransportKey:
return (
cls.codec_key,
cls.address_type.proto_key,
)
@property
def laddr(self) -> Address:
...
@property
def raddr(self) -> Address:
...
@property
def maddr(self) -> str:
...
@classmethod
async def connect_to(
cls,
addr: Address,
**kwargs
) -> MsgTransport:
...
@classmethod
def get_stream_addrs(
cls,
stream: trio.abc.Stream
) -> tuple[
Address, # local
Address # remote
]:
'''
Return the transport protocol's address pair for the local
and remote-peer side.
'''
...
# TODO, such that all `.raddr`s for each `SocketStream` are
# delivered?
# -[ ] move `.open_listener()` here and internally track the
# listener set, per address?
# def get_peers(
# self,
# ) -> list[Address]:
# ...
class MsgpackTransport(MsgTransport):
# TODO: better naming for this?
# -[ ] check how libp2p does naming for such things?
codec_key: str = 'msgpack'
def __init__(
self,
stream: trio.abc.Stream,
prefix_size: int = 4,
# XXX optionally provided codec pair for `msgspec`:
# https://jcristharif.com/msgspec/extending.html#mapping-to-from-native-types
#
# TODO: define this as a `Codec` struct which can be
# overriden dynamically by the application/runtime?
codec: MsgCodec = None,
) -> None:
self.stream = stream
(
self._laddr,
self._raddr,
) = self.get_stream_addrs(stream)
# create read loop instance
self._aiter_pkts = self._iter_packets()
self._send_lock = trio.StrictFIFOLock()
# public i guess?
self.drained: list[dict] = []
self.recv_stream = BufferedReceiveStream(
transport_stream=stream
)
self.prefix_size = prefix_size
# allow for custom IPC msg interchange format
# dynamic override Bo
self._task = trio.lowlevel.current_task()
# XXX for ctxvar debug only!
# self._codec: MsgCodec = (
# codec
# or
# _codec._ctxvar_MsgCodec.get()
# )
async def _iter_packets(self) -> AsyncGenerator[dict, None]:
'''
Yield `bytes`-blob decoded packets from the underlying TCP
stream using the current task's `MsgCodec`.
This is a streaming routine implemented as an async generator
func (which was the original design, but could be changed?)
and is allocated by a `.__call__()` inside `.__init__()` where
it is assigned to the `._aiter_pkts` attr.
'''
decodes_failed: int = 0
tpt_name: str = f'{type(self).__name__!r}'
while True:
try:
header: bytes = await self.recv_stream.receive_exactly(4)
except (
ValueError,
ConnectionResetError,
# not sure entirely why we need this but without it we
# seem to be getting racy failures here on
# arbiter/registry name subs..
trio.BrokenResourceError,
) as trans_err:
loglevel = 'transport'
match trans_err:
# case (
# ConnectionResetError()
# ):
# loglevel = 'transport'
# peer actor (graceful??) TCP EOF but `tricycle`
# seems to raise a 0-bytes-read?
case ValueError() if (
'unclean EOF' in trans_err.args[0]
):
pass
# peer actor (task) prolly shutdown quickly due
# to cancellation
case trio.BrokenResourceError() if (
'Connection reset by peer' in trans_err.args[0]
):
pass
# unless the disconnect condition falls under "a
# normal operation breakage" we usually console warn
# about it.
case _:
loglevel: str = 'warning'
raise TransportClosed(
message=(
f'{tpt_name} already closed by peer\n'
),
src_exc=trans_err,
loglevel=loglevel,
) from trans_err
# XXX definitely can happen if transport is closed
# manually by another `trio.lowlevel.Task` in the
# same actor; we use this in some simulated fault
# testing for ex, but generally should never happen
# under normal operation!
#
# NOTE: as such we always re-raise this error from the
# RPC msg loop!
except trio.ClosedResourceError as cre:
closure_err = cre
raise TransportClosed(
message=(
f'{tpt_name} was already closed locally ?\n'
),
src_exc=closure_err,
loglevel='error',
raise_on_report=(
'another task closed this fd' in closure_err.args
),
) from closure_err
# graceful TCP EOF disconnect
if header == b'':
raise TransportClosed(
message=(
f'{tpt_name} already gracefully closed\n'
),
loglevel='transport',
)
size: int
size, = struct.unpack("<I", header)
log.transport(f'received header {size}') # type: ignore
msg_bytes: bytes = await self.recv_stream.receive_exactly(size)
log.transport(f"received {msg_bytes}") # type: ignore
try:
# NOTE: lookup the `trio.Task.context`'s var for
# the current `MsgCodec`.
codec: MsgCodec = _ctxvar_MsgCodec.get()
# XXX for ctxvar debug only!
# if self._codec.pld_spec != codec.pld_spec:
# assert (
# task := trio.lowlevel.current_task()
# ) is not self._task
# self._task = task
# self._codec = codec
# log.runtime(
# f'Using new codec in {self}.recv()\n'
# f'codec: {self._codec}\n\n'
# f'msg_bytes: {msg_bytes}\n'
# )
yield codec.decode(msg_bytes)
# XXX NOTE: since the below error derives from
# `DecodeError` we need to catch it specially
# and always raise such that spec violations
# are never allowed to be caught silently!
except msgspec.ValidationError as verr:
msgtyperr: MsgTypeError = _mk_recv_mte(
msg=msg_bytes,
codec=codec,
src_validation_error=verr,
)
# XXX deliver up to `Channel.recv()` where
# a re-raise and `Error`-pack can inject the far
# end actor `.uid`.
yield msgtyperr
except (
msgspec.DecodeError,
UnicodeDecodeError,
):
if decodes_failed < 4:
# ignore decoding errors for now and assume they have to
# do with a channel drop - hope that receiving from the
# channel will raise an expected error and bubble up.
try:
msg_str: str|bytes = msg_bytes.decode()
except UnicodeDecodeError:
msg_str = msg_bytes
log.exception(
'Failed to decode msg?\n'
f'{codec}\n\n'
'Rxed bytes from wire:\n\n'
f'{msg_str!r}\n'
)
decodes_failed += 1
else:
raise
async def send(
self,
msg: msgtypes.MsgType,
strict_types: bool = True,
hide_tb: bool = True,
) -> None:
'''
Send a msgpack encoded py-object-blob-as-msg over TCP.
If `strict_types == True` then a `MsgTypeError` will be raised on any
invalid msg type
'''
__tracebackhide__: bool = hide_tb
# XXX see `trio._sync.AsyncContextManagerMixin` for details
# on the `.acquire()`/`.release()` sequencing..
async with self._send_lock:
# NOTE: lookup the `trio.Task.context`'s var for
# the current `MsgCodec`.
codec: MsgCodec = _ctxvar_MsgCodec.get()
# XXX for ctxvar debug only!
# if self._codec.pld_spec != codec.pld_spec:
# self._codec = codec
# log.runtime(
# f'Using new codec in {self}.send()\n'
# f'codec: {self._codec}\n\n'
# f'msg: {msg}\n'
# )
if type(msg) not in msgtypes.__msg_types__:
if strict_types:
raise _mk_send_mte(
msg,
codec=codec,
)
else:
log.warning(
'Sending non-`Msg`-spec msg?\n\n'
f'{msg}\n'
)
try:
bytes_data: bytes = codec.encode(msg)
except TypeError as _err:
typerr = _err
msgtyperr: MsgTypeError = _mk_send_mte(
msg,
codec=codec,
message=(
f'IPC-msg-spec violation in\n\n'
f'{pretty_struct.Struct.pformat(msg)}'
),
src_type_error=typerr,
)
raise msgtyperr from typerr
# supposedly the fastest says,
# https://stackoverflow.com/a/54027962
size: bytes = struct.pack("<I", len(bytes_data))
try:
return await self.stream.send_all(size + bytes_data)
except (
trio.BrokenResourceError,
trio.ClosedResourceError,
) as _re:
trans_err = _re
tpt_name: str = f'{type(self).__name__!r}'
match trans_err:
# XXX, specific to UDS transport and its,
# well, "speediness".. XD
# |_ likely todo with races related to how fast
# the socket is setup/torn-down on linux
# as it pertains to rando pings from the
# `.discovery` subsys and protos.
case trio.BrokenResourceError() if (
'[Errno 32] Broken pipe'
in
trans_err.args[0]
):
tpt_closed = TransportClosed.from_src_exc(
message=(
f'{tpt_name} already closed by peer\n'
),
body=f'{self}\n',
src_exc=trans_err,
raise_on_report=True,
loglevel='transport',
)
raise tpt_closed from trans_err
# case trio.ClosedResourceError() if (
# 'this socket was already closed'
# in
# trans_err.args[0]
# ):
# tpt_closed = TransportClosed.from_src_exc(
# message=(
# f'{tpt_name} already closed by peer\n'
# ),
# body=f'{self}\n',
# src_exc=trans_err,
# raise_on_report=True,
# loglevel='transport',
# )
# raise tpt_closed from trans_err
# unless the disconnect condition falls under "a
# normal operation breakage" we usually console warn
# about it.
case _:
log.exception(
f'{tpt_name} layer failed pre-send ??\n'
)
raise trans_err
# ?TODO? does it help ever to dynamically show this
# frame?
# try:
# <the-above_code>
# except BaseException as _err:
# err = _err
# if not isinstance(err, MsgTypeError):
# __tracebackhide__: bool = False
# raise
async def recv(self) -> msgtypes.MsgType:
return await self._aiter_pkts.asend(None)
async def drain(self) -> AsyncIterator[dict]:
'''
Drain the stream's remaining messages sent from
the far end until the connection is closed by
the peer.
'''
try:
async for msg in self._iter_packets():
self.drained.append(msg)
except TransportClosed:
for msg in self.drained:
yield msg
def __aiter__(self):
return self._aiter_pkts
@property
def laddr(self) -> Address:
return self._laddr
@property
def raddr(self) -> Address:
return self._raddr
def pformat(self) -> str:
return (
f'<{type(self).__name__}(\n'
f' |_peers: 1\n'
f' laddr: {self._laddr}\n'
f' raddr: {self._raddr}\n'
# f'\n'
f' |_task: {self._task}\n'
f')>\n'
)
__repr__ = __str__ = pformat


@ -0,0 +1,123 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
IPC subsys type-lookup helpers?
'''
from typing import (
Type,
# TYPE_CHECKING,
)
import trio
import socket
from tractor.ipc._transport import (
MsgTransportKey,
MsgTransport
)
from tractor.ipc._tcp import (
TCPAddress,
MsgpackTCPStream,
)
from tractor.ipc._uds import (
UDSAddress,
MsgpackUDSStream,
)
# if TYPE_CHECKING:
# from tractor._addr import Address
Address = TCPAddress|UDSAddress
# manually updated list of all supported msg transport types
_msg_transports = [
MsgpackTCPStream,
MsgpackUDSStream
]
# convert a MsgTransportKey to the corresponding transport type
_key_to_transport: dict[
MsgTransportKey,
Type[MsgTransport],
] = {
('msgpack', 'tcp'): MsgpackTCPStream,
('msgpack', 'uds'): MsgpackUDSStream,
}
# convert an Address wrapper to its corresponding transport type
_addr_to_transport: dict[
Type[TCPAddress|UDSAddress],
Type[MsgTransport]
] = {
TCPAddress: MsgpackTCPStream,
UDSAddress: MsgpackUDSStream,
}
def transport_from_addr(
addr: Address,
codec_key: str = 'msgpack',
) -> Type[MsgTransport]:
'''
Given a destination address and a desired codec, find the
corresponding `MsgTransport` type.
'''
try:
return _addr_to_transport[type(addr)]
except KeyError:
raise NotImplementedError(
f'No known transport for address {repr(addr)}'
)
def transport_from_stream(
stream: trio.abc.Stream,
codec_key: str = 'msgpack'
) -> Type[MsgTransport]:
'''
Given an arbitrary `trio.abc.Stream` and a desired codec,
find the corresponding `MsgTransport` type.
'''
transport = None
if isinstance(stream, trio.SocketStream):
sock: socket.socket = stream.socket
match sock.family:
case socket.AF_INET | socket.AF_INET6:
transport = 'tcp'
case socket.AF_UNIX:
transport = 'uds'
case _:
raise NotImplementedError(
f'Unsupported socket family: {sock.family}'
)
if not transport:
raise NotImplementedError(
f'Could not figure out transport type for stream type {type(stream)}'
)
key = (codec_key, transport)
return _key_to_transport[key]
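# Lookup sketch: resolve the concrete transport type for a wrapped address
# before dialing; the address values are illustrative.
tcp_tpt = transport_from_addr(TCPAddress.from_addr(('127.0.0.1', 1616)))
assert tcp_tpt is MsgpackTCPStream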

458
tractor/ipc/_uds.py 100644

@ -0,0 +1,458 @@
# tractor: structured concurrent "actors".
# Copyright 2018-eternity Tyler Goodlet.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
Unix Domain Socket implementation of tractor.ipc._transport.MsgTransport protocol
'''
from __future__ import annotations
from contextlib import (
contextmanager as cm,
)
from pathlib import Path
import os
from socket import (
AF_UNIX,
SOCK_STREAM,
SO_PASSCRED,
SO_PEERCRED,
SOL_SOCKET,
)
import struct
from typing import (
Type,
TYPE_CHECKING,
ClassVar,
)
import msgspec
import trio
from trio import (
socket,
SocketListener,
)
from trio._highlevel_open_unix_stream import (
close_on_error,
has_unix,
)
from tractor.msg import MsgCodec
from tractor.log import get_logger
from tractor.ipc._transport import (
MsgpackTransport,
)
from .._state import (
get_rt_dir,
current_actor,
is_root_process,
)
if TYPE_CHECKING:
from ._runtime import Actor
log = get_logger(__name__)
def unwrap_sockpath(
sockpath: Path,
) -> tuple[Path, Path]:
return (
sockpath.parent,
sockpath.name,
)
class UDSAddress(
msgspec.Struct,
frozen=True,
):
filedir: str|Path|None
filename: str|Path
maybe_pid: int|None = None
# TODO, maybe we should use better field and value
# -[x] really this is a `.protocol_key` not a "name" of anything.
# -[ ] consider a 'unix' proto-key instead?
# -[ ] need to check what other mult-transport frameworks do
# like zmq, nng, uri-spec et al!
proto_key: ClassVar[str] = 'uds'
unwrapped_type: ClassVar[type] = tuple[str, int]
def_bindspace: ClassVar[Path] = get_rt_dir()
@property
def bindspace(self) -> Path:
'''
We replicate the "ip-set-of-hosts" part of a UDS socket as
just the sub-directory in which we allocate socket files.
'''
return (
self.filedir
or
self.def_bindspace
)
@property
def sockpath(self) -> Path:
return self.bindspace / self.filename
@property
def is_valid(self) -> bool:
'''
We block socket files not allocated under the runtime subdir.
'''
return self.bindspace in self.sockpath.parents
@classmethod
def from_addr(
cls,
addr: (
tuple[Path|str, Path|str]|Path|str
),
) -> UDSAddress:
match addr:
case tuple()|list():
filedir = Path(addr[0])
filename = Path(addr[1])
return UDSAddress(
filedir=filedir,
filename=filename,
# maybe_pid=pid,
)
# NOTE, in case we ever decide to just `.unwrap()`
# to a `Path|str`?
case str()|Path():
sockpath: Path = Path(addr)
return UDSAddress(*unwrap_sockpath(sockpath))
case _:
# import pdbp; pdbp.set_trace()
raise TypeError(
f'Bad unwrapped-address for {cls} !\n'
f'{addr!r}\n'
)
def unwrap(self) -> tuple[str, int]:
# XXX NOTE, since this gets passed DIRECTLY to
# `.ipc._uds.open_unix_socket_w_passcred()`
return (
str(self.filedir),
str(self.filename),
)
@classmethod
def get_random(
cls,
bindspace: Path|None = None, # default netns
) -> UDSAddress:
filedir: Path = bindspace or cls.def_bindspace
pid: int = os.getpid()
actor: Actor|None = current_actor(
err_on_no_runtime=False,
)
if actor:
sockname: str = '::'.join(actor.uid) + f'@{pid}'
else:
prefix: str = '<unknown-actor>'
if is_root_process():
prefix: str = 'root'
sockname: str = f'{prefix}@{pid}'
sockpath: Path = Path(f'{sockname}.sock')
return UDSAddress(
filedir=filedir,
filename=sockpath,
maybe_pid=pid,
)
@classmethod
def get_root(cls) -> UDSAddress:
def_uds_filename: Path = 'registry@1616.sock'
return UDSAddress(
filedir=cls.def_bindspace,
filename=def_uds_filename,
# maybe_pid=1616,
)
# ?TODO, maybe we should just use our `.msg.pretty_struct.Struct` for
# this instead?
# -[ ] is it too "multi-line"y tho?
# the compact tuple/.unwrapped() form is simple enough?
#
def __repr__(self) -> str:
if not (pid := self.maybe_pid):
pid: str = '<unknown-peer-pid>'
body: str = (
f'({self.filedir}, {self.filename}, {pid})'
)
return (
f'{type(self).__name__}'
f'['
f'{body}'
f']'
)
@cm
def _reraise_as_connerr(
src_excs: tuple[Type[Exception]],
addr: UDSAddress,
):
try:
yield
except src_excs as src_exc:
raise ConnectionError(
f'Bad UDS socket-filepath-as-address ??\n'
f'{addr}\n'
f' |_sockpath: {addr.sockpath}\n'
f'\n'
f'from src: {src_exc!r}\n'
) from src_exc
async def start_listener(
addr: UDSAddress,
**kwargs,
) -> SocketListener:
'''
Start listening for inbound connections via
a `trio.SocketListener` (task) which `socket.bind()`s on `addr`.
Note, if the `UDSAddress.bindspace: Path` directory does not exist it is
implicitly created.
'''
sock = socket.socket(
socket.AF_UNIX,
socket.SOCK_STREAM
)
log.info(
f'Attempting to bind UDS socket\n'
f'>[\n'
f'|_{addr}\n'
)
# ?TODO? should we use the `actor.lifetime_stack`
# to rm on shutdown?
bindpath: Path = addr.sockpath
if not (bs := addr.bindspace).is_dir():
log.info(
'Creating bindspace dir in file-sys\n'
f'>{{\n'
f'|_{bs!r}\n'
)
bs.mkdir()
with _reraise_as_connerr(
src_excs=(
FileNotFoundError,
OSError,
),
addr=addr
):
await sock.bind(str(bindpath))
sock.listen(1)
log.info(
f'Listening on UDS socket\n'
f'[>\n'
f' |_{addr}\n'
)
return SocketListener(sock)
def close_listener(
addr: UDSAddress,
lstnr: SocketListener,
) -> None:
'''
Close and remove the listening unix socket's path.
'''
lstnr.socket.close()
os.unlink(addr.sockpath)
async def open_unix_socket_w_passcred(
filename: str|bytes|os.PathLike[str]|os.PathLike[bytes],
) -> trio.SocketStream:
'''
Literally the exact same as `trio.open_unix_socket()` except we set the additional
`socket.SO_PASSCRED` option to ensure the server side (the process calling `accept()`)
can extract the connecting peer's credentials, namely OS specific process
related IDs.
See this SO for "why" the extra opts,
- https://stackoverflow.com/a/7982749
'''
if not has_unix:
raise RuntimeError("Unix sockets are not supported on this platform")
# much more simplified logic vs tcp sockets - one socket type and only one
# possible location to connect to
sock = trio.socket.socket(AF_UNIX, SOCK_STREAM)
sock.setsockopt(SOL_SOCKET, SO_PASSCRED, 1)
with close_on_error(sock):
await sock.connect(os.fspath(filename))
return trio.SocketStream(sock)
def get_peer_info(sock: trio.socket.socket) -> tuple[
int, # pid
int, # uid
int, # gid
]:
'''
Deliver the connecting peer's "credentials"-info as defined in
a very Linux specific way..
For more deats see,
- `man accept`,
- `man unix`,
this great online guide to all things sockets,
- https://beej.us/guide/bgnet/html/split-wide/man-pages.html#setsockoptman
AND this **wonderful SO answer**
- https://stackoverflow.com/a/7982749
'''
creds: bytes = sock.getsockopt(
SOL_SOCKET,
SO_PEERCRED,
struct.calcsize('3i')
)
# i.e a tuple of the fields,
# pid: int, "process"
# uid: int, "user"
# gid: int, "group"
return struct.unpack('3i', creds)
class MsgpackUDSStream(MsgpackTransport):
'''
A `trio.SocketStream` around a Unix-Domain-Socket transport
delivering `msgpack` encoded msgs using the `msgspec` codec lib.
'''
address_type = UDSAddress
layer_key: int = 4
@property
def maddr(self) -> str:
if not self.raddr:
return '<unknown-peer>'
filepath: Path = Path(self.raddr.unwrap()[0])
return (
f'/{self.address_type.proto_key}/{filepath}'
# f'/{self.chan.uid[0]}'
# f'/{self.cid}'
# f'/cid={cid_head}..{cid_tail}'
# TODO: ? not use this ^ right ?
)
def connected(self) -> bool:
return self.stream.socket.fileno() != -1
@classmethod
async def connect_to(
cls,
addr: UDSAddress,
prefix_size: int = 4,
codec: MsgCodec|None = None,
**kwargs
) -> MsgpackUDSStream:
sockpath: Path = addr.sockpath
#
# ^XXX NOTE, we don't provide any out-of-band `.pid` info
# (like, over the socket as extra msgs) since the (augmented)
# `.setsockopt()` call tells the OS provide it; the client
# pid can then be read on server/listen() side via
# `get_peer_info()` above.
with _reraise_as_connerr(
src_excs=(
FileNotFoundError,
),
addr=addr
):
stream = await open_unix_socket_w_passcred(
str(sockpath),
**kwargs
)
tpt_stream = MsgpackUDSStream(
stream,
prefix_size=prefix_size,
codec=codec
)
# XXX assign from new addrs after peer-PID extract!
(
tpt_stream._laddr,
tpt_stream._raddr,
) = cls.get_stream_addrs(stream)
return tpt_stream
@classmethod
def get_stream_addrs(
cls,
stream: trio.SocketStream
) -> tuple[
Path,
int,
]:
sock: trio.socket.socket = stream.socket
# NOTE XXX, it's unclear why one or the other ends up being
# `bytes` versus the socket-file-path, i presume it's
# something to do with who is the server (called `.listen()`)?
# maybe could be better implemented using another info-query
# on the socket like,
# https://beej.us/guide/bgnet/html/split-wide/system-calls-or-bust.html#gethostnamewho-am-i
sockname: str|bytes = sock.getsockname()
# https://beej.us/guide/bgnet/html/split-wide/system-calls-or-bust.html#getpeernamewho-are-you
peername: str|bytes = sock.getpeername()
match (peername, sockname):
case (str(), bytes()):
sock_path: Path = Path(peername)
case (bytes(), str()):
sock_path: Path = Path(sockname)
(
peer_pid,
_,
_,
) = get_peer_info(sock)
filedir, filename = unwrap_sockpath(sock_path)
laddr = UDSAddress(
filedir=filedir,
filename=filename,
maybe_pid=os.getpid(),
)
raddr = UDSAddress(
filedir=filedir,
filename=filename,
maybe_pid=peer_pid
)
return (laddr, raddr)
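# Sketch of the address wrap/unwrap round-trip; the socket filename is
# illustrative (real sockets live under the runtime dir by default).
uds = UDSAddress.from_addr('/tmp/demo@1234.sock')
assert uds.unwrap() == ('/tmp', 'demo@1234.sock')
assert uds.sockpath == Path('/tmp/demo@1234.sock')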


@ -81,10 +81,35 @@ BOLD_PALETTE = {
}
def at_least_level(
log: Logger|LoggerAdapter,
level: int|str,
) -> bool:
'''
Predicate to test if a given level is active.
'''
if isinstance(level, str):
level: int = CUSTOM_LEVELS[level.upper()]
if log.getEffectiveLevel() <= level:
return True
return False
# TODO: this isn't showing the correct '{filename}'
# as it did before..
class StackLevelAdapter(LoggerAdapter):
def at_least_level(
self,
level: str,
) -> bool:
return at_least_level(
log=self,
level=level,
)
def transport(
self,
msg: str,
@ -92,7 +117,7 @@ class StackLevelAdapter(LoggerAdapter):
) -> None:
'''
IPC transport level msg IO; generally anything below
`._ipc.Channel` and friends.
`.ipc.Channel` and friends.
'''
return self.log(5, msg)
@ -270,7 +295,9 @@ def get_logger(
subsys_spec: str|None = None,
) -> StackLevelAdapter:
'''Return the package log or a sub-logger for ``name`` if provided.
'''
Return the `tractor`-library root logger or a sub-logger for
`name` if provided.
'''
log: Logger
@ -282,10 +309,10 @@ def get_logger(
name != _proj_name
):
# NOTE: for handling for modules that use ``get_logger(__name__)``
# NOTE: for handling for modules that use `get_logger(__name__)`
# we make the following stylistic choice:
# - always avoid duplicate project-package token
# in msg output: i.e. tractor.tractor _ipc.py in header
# in msg output: i.e. tractor.tractor.ipc._chan.py in header
# looks ridiculous XD
# - never show the leaf module name in the {name} part
# since in python the {filename} is always this same
@ -331,7 +358,7 @@ def get_logger(
def get_console_log(
level: str|None = None,
logger: Logger|None = None,
logger: Logger|StackLevelAdapter|None = None,
**kwargs,
) -> LoggerAdapter:
@ -344,12 +371,23 @@ def get_console_log(
Yeah yeah, i know we can use `logging.config.dictConfig()`. You do it.
'''
log = get_logger(
logger=logger,
**kwargs
) # set a root logger
logger: Logger = log.logger
# get/create a stack-aware-adapter
if (
logger
and
isinstance(logger, StackLevelAdapter)
):
# XXX, for ex. when passed in by a caller wrapping some
# other lib's logger instance with our level-adapter.
log = logger
else:
log: StackLevelAdapter = get_logger(
logger=logger,
**kwargs
)
logger: Logger|StackLevelAdapter = log.logger
if not level:
return log
@ -367,10 +405,7 @@ def get_console_log(
None,
)
):
fmt = LOG_FORMAT
# if logger:
# fmt = None
fmt: str = LOG_FORMAT # always apply our format?
handler = StreamHandler()
formatter = colorlog.ColoredFormatter(
fmt=fmt,
@ -391,19 +426,3 @@ def get_loglevel() -> str:
# global module logger for tractor itself
log: StackLevelAdapter = get_logger('tractor')
def at_least_level(
log: Logger|LoggerAdapter,
level: int|str,
) -> bool:
'''
Predicate to test if a given level is active.
'''
if isinstance(level, str):
level: int = CUSTOM_LEVELS[level.upper()]
if log.getEffectiveLevel() <= level:
return True
return False
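# Usage sketch for the level predicate: guard expensive log-msg
# construction behind the adapter's check (the module-level `log`
# adapter defined above is assumed).
if log.at_least_level('transport'):
    log.transport('tracing IPC frames..')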


@ -210,12 +210,14 @@ class PldRx(Struct):
match msg:
case Return()|Error():
log.runtime(
f'Rxed final outcome msg\n'
f'Rxed final-outcome msg\n'
f'\n'
f'{msg}\n'
)
case Stop():
log.runtime(
f'Rxed stream stopped msg\n'
f'\n'
f'{msg}\n'
)
if passthrough_non_pld_msgs:
@ -261,8 +263,9 @@ class PldRx(Struct):
if (
type(msg) is Return
):
log.info(
log.runtime(
f'Rxed final result msg\n'
f'\n'
f'{msg}\n'
)
return self.decode_pld(
@ -304,10 +307,13 @@ class PldRx(Struct):
try:
pld: PayloadT = self._pld_dec.decode(pld)
log.runtime(
'Decoded msg payload\n\n'
f'Decoded payload for\n'
# f'\n'
f'{msg}\n'
f'where payload decoded as\n'
f'|_pld={pld!r}\n'
# ^TODO?, ideally just render with `,
# pld={decode}` in the `msg.pformat()`??
f'where, '
f'{type(msg).__name__}.pld={pld!r}\n'
)
return pld
except TypeError as typerr:
@ -494,7 +500,8 @@ def limit_plds(
finally:
log.runtime(
'Reverted to previous payload-decoder\n\n'
f'Reverted to previous payload-decoder\n'
f'\n'
f'{orig_pldec}\n'
)
# sanity on orig settings
@ -608,7 +615,7 @@ async def drain_to_final_msg(
#
# -[ ] make sure pause points work here for REPLing
# the runtime itself; i.e. ensure there's no hangs!
# |_from tractor.devx._debug import pause
# |_from tractor.devx.debug import pause
# await pause()
# NOTE: we get here if the far end was
@ -629,7 +636,8 @@ async def drain_to_final_msg(
(local_cs := rent_n.cancel_scope).cancel_called
):
log.cancel(
'RPC-ctx cancelled by local-parent scope during drain!\n\n'
f'RPC-ctx cancelled by local-parent scope during drain!\n'
f'\n'
f'c}}>\n'
f' |_{rent_n}\n'
f' |_.cancel_scope = {local_cs}\n'
@ -663,7 +671,8 @@ async def drain_to_final_msg(
# final result arrived!
case Return():
log.runtime(
'Context delivered final draining msg:\n'
f'Context delivered final draining msg\n'
f'\n'
f'{pretty_struct.pformat(msg)}'
)
ctx._result: Any = pld
@ -697,12 +706,14 @@ async def drain_to_final_msg(
):
log.cancel(
'Cancelling `MsgStream` drain since '
f'{reason}\n\n'
f'{reason}\n'
f'\n'
f'<= {ctx.chan.uid}\n'
f' |_{ctx._nsf}()\n\n'
f' |_{ctx._nsf}()\n'
f'\n'
f'=> {ctx._task}\n'
f' |_{ctx._stream}\n\n'
f' |_{ctx._stream}\n'
f'\n'
f'{pretty_struct.pformat(msg)}\n'
)
break
@ -739,7 +750,8 @@ async def drain_to_final_msg(
case Stop():
pre_result_drained.append(msg)
log.runtime( # normal/expected shutdown transaction
'Remote stream terminated due to "stop" msg:\n\n'
f'Remote stream terminated due to "stop" msg\n'
f'\n'
f'{pretty_struct.pformat(msg)}\n'
)
continue
@ -814,7 +826,8 @@ async def drain_to_final_msg(
else:
log.cancel(
'Skipping `MsgStream` drain since final outcome is set\n\n'
f'Skipping `MsgStream` drain since final outcome is set\n'
f'\n'
f'{ctx.outcome}\n'
)


@ -20,6 +20,7 @@ Prettified version of `msgspec.Struct` for easier console grokin.
'''
from __future__ import annotations
from collections import UserList
import textwrap
from typing import (
Any,
Iterator,
@ -105,27 +106,11 @@ def iter_fields(struct: Struct) -> Iterator[
)
def pformat(
def iter_struct_ppfmt_lines(
struct: Struct,
field_indent: int = 2,
indent: int = 0,
field_indent: int = 0,
) -> Iterator[tuple[str, str]]:
) -> str:
'''
Recursion-safe `pprint.pformat()` style formatting of
a `msgspec.Struct` for sane reading by a human using a REPL.
'''
# global whitespace indent
ws: str = ' '*indent
# field whitespace indent
field_ws: str = ' '*(field_indent + indent)
# qtn: str = ws + struct.__class__.__qualname__
qtn: str = struct.__class__.__qualname__
obj_str: str = '' # accumulator
fi: structs.FieldInfo
k: str
v: Any
@ -135,15 +120,18 @@ def pformat(
# ..]` over .__name__ == `Literal` but still get only the
# latter for simple types like `str | int | None` etc..?
ft: type = fi.type
typ_name: str = getattr(ft, '__name__', str(ft))
typ_name: str = getattr(
ft,
'__name__',
str(ft)
).replace(' ', '')
# recurse to get sub-struct's `.pformat()` output Bo
if isinstance(v, Struct):
val_str: str = v.pformat(
indent=field_indent + indent,
field_indent=indent + field_indent,
yield from iter_struct_ppfmt_lines(
struct=v,
field_indent=field_indent+field_indent,
)
else:
val_str: str = repr(v)
@ -161,8 +149,39 @@ def pformat(
# raise
# return _Struct.__repr__(struct)
# TODO: LOLOL use `textwrap.indent()` instead dawwwwwg!
obj_str += (field_ws + f'{k}: {typ_name} = {val_str},\n')
yield (
' '*field_indent, # indented ws prefix
f'{k}: {typ_name} = {val_str},', # field's repr line content
)
def pformat(
struct: Struct,
field_indent: int = 2,
indent: int = 0,
) -> str:
'''
Recursion-safe `pprint.pformat()` style formatting of
a `msgspec.Struct` for sane reading by a human using a REPL.
'''
obj_str: str = '' # accumulator
for prefix, field_repr, in iter_struct_ppfmt_lines(
struct,
field_indent=field_indent,
):
obj_str += f'{prefix}{field_repr}\n'
# global whitespace indent
ws: str = ' '*indent
if indent:
obj_str: str = textwrap.indent(
text=obj_str,
prefix=ws,
)
# qtn: str = ws + struct.__class__.__qualname__
qtn: str = struct.__class__.__qualname__
return (
f'{qtn}(\n'


@ -31,6 +31,7 @@ from typing import (
Type,
TypeVar,
TypeAlias,
# TYPE_CHECKING,
Union,
)
@ -47,6 +48,7 @@ from tractor.msg import (
pretty_struct,
)
from tractor.log import get_logger
# from tractor._addr import UnwrappedAddress
log = get_logger('tractor.msgspec')
@ -141,9 +143,49 @@ class Aid(
'''
name: str
uuid: str
# TODO: use built-in support for UUIDs?
# -[ ] `uuid.UUID` which has multi-protocol support
# https://jcristharif.com/msgspec/supported-types.html#uuid
pid: int|None = None
# TODO? can/should we extend this field set?
# -[ ] use built-in support for UUIDs? `uuid.UUID` which has
# multi-protocol support
# https://jcristharif.com/msgspec/supported-types.html#uuid
#
# -[ ] as per the `.ipc._uds` / `._addr` comments, maybe we
# should also include at least `.pid` (equiv to port for tcp)
# and/or host-part always?
@property
def uid(self) -> tuple[str, str]:
'''
Legacy actor "unique-id" pair format.
'''
return (
self.name,
self.uuid,
)
def reprol(
self,
sin_uuid: bool = True,
) -> str:
if not sin_uuid:
return (
f'{self.name}[{self.uuid[:6]}]@{self.pid!r}'
)
return (
f'{self.name}@{self.pid!r}'
)
# mk hashable via `.uuid`
def __hash__(self) -> int:
return hash(self.uuid)
def __eq__(self, other: Aid) -> bool:
return self.uuid == other.uuid
# use pretty fmt since often repr-ed for console/log
__repr__ = pretty_struct.Struct.__repr__
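# Sketch of the identity semantics added above: equality and hashing key
# off `.uuid` only, so the same peer de-dupes across differing pids; the
# field values are illustrative.
a1 = Aid(name='svc', uuid='deadbeef-0000', pid=111)
a2 = Aid(name='svc', uuid='deadbeef-0000', pid=222)
assert a1 == a2 and len({a1, a2}) == 1
assert a1.uid == ('svc', 'deadbeef-0000')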
class SpawnSpec(
@ -161,14 +203,15 @@ class SpawnSpec(
# a hard `Struct` def for all of these fields!
_parent_main_data: dict
_runtime_vars: dict[str, Any]
# ^NOTE see `._state._runtime_vars: dict`
# module import capability
enable_modules: dict[str, str]
# TODO: not just sockaddr pairs?
# -[ ] abstract into a `TransportAddr` type?
reg_addrs: list[tuple[str, int]]
bind_addrs: list[tuple[str, int]]
reg_addrs: list[tuple[str, str|int]]
bind_addrs: list[tuple[str, str|int]]|None
# TODO: caps based RPC support in the payload?


@ -38,7 +38,6 @@ from typing import (
import tractor
from tractor._exceptions import (
InternalError,
is_multi_cancelled,
TrioTaskExited,
TrioCancelled,
AsyncioTaskExited,
@ -49,7 +48,7 @@ from tractor._state import (
_runtime_vars,
)
from tractor._context import Unresolved
from tractor.devx import _debug
from tractor.devx import debug
from tractor.log import (
get_logger,
StackLevelAdapter,
@ -59,6 +58,9 @@ from tractor.log import (
# from tractor.msg import (
# pretty_struct,
# )
from tractor.trionics import (
is_multi_cancelled,
)
from tractor.trionics._broadcast import (
broadcast_receiver,
BroadcastReceiver,
@ -128,6 +130,7 @@ class LinkedTaskChannel(
_trio_task: trio.Task
_aio_task_complete: trio.Event
_closed_by_aio_task: bool = False
_suppress_graceful_exits: bool = True
_trio_err: BaseException|None = None
@ -206,10 +209,15 @@ class LinkedTaskChannel(
async def aclose(self) -> None:
await self._from_aio.aclose()
def started(
# ?TODO? async version of this?
def started_nowait(
self,
val: Any = None,
) -> None:
'''
Synchronize aio-side with its trio-parent.
'''
self._aio_started_val = val
return self._to_trio.send_nowait(val)
@ -240,6 +248,7 @@ class LinkedTaskChannel(
# cycle on the trio side?
# await trio.lowlevel.checkpoint()
return await self._from_aio.receive()
except BaseException as err:
async with translate_aio_errors(
chan=self,
@ -317,7 +326,7 @@ def _run_asyncio_task(
qsize: int = 1,
provide_channels: bool = False,
suppress_graceful_exits: bool = True,
hide_tb: bool = False,
hide_tb: bool = True,
**kwargs,
) -> LinkedTaskChannel:
@ -345,18 +354,6 @@ def _run_asyncio_task(
# value otherwise it would just return ;P
assert qsize > 1
if provide_channels:
assert 'to_trio' in args
# allow target func to accept/stream results manually by name
if 'to_trio' in args:
kwargs['to_trio'] = to_trio
if 'from_trio' in args:
kwargs['from_trio'] = from_trio
coro = func(**kwargs)
trio_task: trio.Task = trio.lowlevel.current_task()
trio_cs = trio.CancelScope()
aio_task_complete = trio.Event()
@ -371,6 +368,25 @@ def _run_asyncio_task(
_suppress_graceful_exits=suppress_graceful_exits,
)
# allow target func to accept/stream results manually by name
if 'to_trio' in args:
kwargs['to_trio'] = to_trio
if 'from_trio' in args:
kwargs['from_trio'] = from_trio
if 'chan' in args:
kwargs['chan'] = chan
if provide_channels:
assert (
'to_trio' in args
or
'chan' in args
)
coro = func(**kwargs)
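
To illustrate the by-name kwarg injection above, a child `asyncio` task targeting this API might look roughly like the following sketch (the function name and payloads are hypothetical); whichever of `to_trio`, `from_trio` or `chan` appear in its signature get filled in before the coro is created:

import asyncio
import trio

async def aio_streamer(
    from_trio: asyncio.Queue,
    to_trio: trio.MemorySendChannel,
) -> None:
    # the first value sent synchronizes with the trio parent and
    # becomes the `first` delivered by `open_channel_from()`.
    to_trio.send_nowait('started')
    while True:
        msg = await from_trio.get()
        to_trio.send_nowait(msg * 2)
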
async def wait_on_coro_final_result(
to_trio: trio.MemorySendChannel,
coro: Awaitable,
@ -443,9 +459,23 @@ def _run_asyncio_task(
f'Task exited with final result: {result!r}\n'
)
# only close the sender side which will relay
# a `trio.EndOfChannel` to the trio (consumer) side.
# XXX ALWAYS close the child-`asyncio`-task-side's
# `to_trio` handle which will in turn relay
# a `trio.EndOfChannel` to the `trio`-parent.
# Consequently the parent `trio` task MUST ALWAYS
# check for any `chan._aio_err` to be raised when it
# receives an EoC.
#
# NOTE, there are 2 EoC cases,
# - normal/graceful EoC due to the aio-side actually
# terminating its "streaming", but the task did not
# error and is not yet complete.
#
# - the aio-task terminated and we specially mark the
# closure as due to the `asyncio.Task`'s exit.
#
to_trio.close()
chan._closed_by_aio_task = True
aio_task_complete.set()
log.runtime(
@ -479,12 +509,12 @@ def _run_asyncio_task(
if (
debug_mode()
and
(greenback := _debug.maybe_import_greenback(
(greenback := debug.maybe_import_greenback(
force_reload=True,
raise_not_found=False,
))
):
log.info(
log.devx(
f'Bestowing `greenback` portal for `asyncio`-task\n'
f'{task}\n'
)
@ -643,8 +673,9 @@ def _run_asyncio_task(
not trio_cs.cancel_called
):
log.cancel(
f'Cancelling `trio` side due to aio-side src exc\n'
f'{curr_aio_err}\n'
f'Cancelling trio-side due to aio-side src exc\n'
f'\n'
f'{curr_aio_err!r}\n'
f'\n'
f'(c>\n'
f' |_{trio_task}\n'
@ -756,6 +787,7 @@ async def translate_aio_errors(
aio_done_before_trio: bool = aio_task.done()
assert aio_task
trio_err: BaseException|None = None
eoc: trio.EndOfChannel|None = None
try:
yield # back to one of the cross-loop apis
except trio.Cancelled as taskc:
@ -787,12 +819,48 @@ async def translate_aio_errors(
# )
# raise
# XXX always passthrough EoC since this translator is often
# called from `LinkedTaskChannel.receive()` which we want
# passthrough and further we have no special meaning for it in
# terms of relaying errors or signals from the aio side!
except trio.EndOfChannel as eoc:
trio_err = chan._trio_err = eoc
# XXX EoC is a special SIGNAL from the aio-side here!
# There are 2 cases to handle:
# 1. the "EoC passthrough" case.
# - the aio-task actually closed the channel "gracefully" and
# the trio-task should unwind any ongoing channel
# iteration/receiving,
# |_this exc-translator wraps calls to `LinkedTaskChannel.receive()`
# in which case we want to relay the actual "end-of-chan" for
# iteration purposes.
#
# 2. relaying the "asyncio.Task termination" case.
# - if the aio-task terminates, maybe with an error, AND the
# `open_channel_from()` API was used, it will always signal
# that termination.
# |_`wait_on_coro_final_result()` always calls
# `to_trio.close()` when `provide_channels=True` so we need to
# always check if there is an aio-side exc which needs to be
# relayed to the parent trio side!
# |_in this case the special `chan._closed_by_aio_task` is
# ALWAYS set.
#
except trio.EndOfChannel as _eoc:
eoc = _eoc
if (
chan._closed_by_aio_task
and
aio_err
):
log.cancel(
f'The asyncio-child task terminated due to error\n'
f'{aio_err!r}\n'
)
chan._trio_to_raise = aio_err
trio_err = chan._trio_err = eoc
#
# ?TODO?, raise something like a,
# chan._trio_to_raise = AsyncioErrored()
# BUT, with the tb rewritten to reflect the underlying
# call stack?
else:
trio_err = chan._trio_err = eoc
raise eoc
# NOTE ALSO SEE the matching note in the `cancel_trio()` asyncio
@ -841,7 +909,7 @@ async def translate_aio_errors(
except BaseException as _trio_err:
trio_err = chan._trio_err = _trio_err
# await tractor.pause(shield=True) # workx!
entered: bool = await _debug._maybe_enter_pm(
entered: bool = await debug._maybe_enter_pm(
trio_err,
api_frame=inspect.currentframe(),
)
@ -1045,7 +1113,7 @@ async def translate_aio_errors(
#
if wait_on_aio_task:
await chan._aio_task_complete.wait()
log.info(
log.debug(
'asyncio-task is done and unblocked trio-side!\n'
)
@ -1062,11 +1130,17 @@ async def translate_aio_errors(
trio_to_raise: (
AsyncioCancelled|
AsyncioTaskExited|
Exception| # relayed from aio-task
None
) = chan._trio_to_raise
raise_from: Exception = (
trio_err if (aio_err is trio_to_raise)
else aio_err
)
if not suppress_graceful_exits:
raise trio_to_raise from (aio_err or trio_err)
raise trio_to_raise from raise_from
if trio_to_raise:
match (
@ -1099,7 +1173,7 @@ async def translate_aio_errors(
)
return
case _:
raise trio_to_raise from (aio_err or trio_err)
raise trio_to_raise from raise_from
# Check if the asyncio-side is the cause of the trio-side
# error.
@ -1165,7 +1239,6 @@ async def run_task(
@acm
async def open_channel_from(
target: Callable[..., Any],
suppress_graceful_exits: bool = True,
**target_kwargs,
@ -1199,7 +1272,6 @@ async def open_channel_from(
# deliver stream handle upward
yield first, chan
except trio.Cancelled as taskc:
# await tractor.pause(shield=True) # ya it worx ;)
if cs.cancel_called:
if isinstance(chan._trio_to_raise, AsyncioCancelled):
log.cancel(
@ -1406,7 +1478,7 @@ def run_as_asyncio_guest(
)
# XXX make it obvi we know this isn't supported yet!
assert 0
# await _debug.maybe_init_greenback(
# await debug.maybe_init_greenback(
# force_reload=True,
# )
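
On the `trio`-parent side, the hypothetical `aio_streamer` sketched earlier would be consumed roughly as below (only a sketch, not code from this diff; it assumes the parent runs in an actor spawned with `infect_asyncio=True` so an `asyncio` loop is active):

from tractor.to_asyncio import open_channel_from

async def parent():
    async with open_channel_from(
        aio_streamer,
    ) as (first, chan):
        assert first == 'started'
        await chan.send('ping')
        print(await chan.receive())  # -> 'pingping'
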

View File

@ -21,7 +21,6 @@ Sugary patterns for trio + tractor designs.
from ._mngrs import (
gather_contexts as gather_contexts,
maybe_open_context as maybe_open_context,
maybe_open_nursery as maybe_open_nursery,
)
from ._broadcast import (
AsyncReceiver as AsyncReceiver,
@ -31,4 +30,12 @@ from ._broadcast import (
)
from ._beg import (
collapse_eg as collapse_eg,
get_collapsed_eg as get_collapsed_eg,
is_multi_cancelled as is_multi_cancelled,
)
from ._taskc import (
maybe_raise_from_masking_exc as maybe_raise_from_masking_exc,
)
from ._tn import (
maybe_open_nursery as maybe_open_nursery,
)

View File

@ -15,31 +15,94 @@
# along with this program. If not, see <https://www.gnu.org/licenses/>.
'''
`BaseExceptionGroup` related utils and helpers pertaining to
first-class-`trio` from a historical perspective B)
`BaseExceptionGroup` utils and helpers pertaining to
first-class-`trio` from a "historical" perspective, like "loose
exception group" task-nurseries.
'''
from contextlib import (
asynccontextmanager as acm,
)
from typing import (
Literal,
Type,
)
import trio
# from trio._core._concat_tb import (
# concat_tb,
# )
def maybe_collapse_eg(
beg: BaseExceptionGroup,
# XXX NOTE
# taken verbatim from `trio._core._run` except,
# - remove the NONSTRICT_EXCEPTIONGROUP_NOTE deprecation-note
# guard-check; we know we want an explicit collapse.
# - mask out tb rewriting in collapse case, i don't think it really
# matters?
#
def collapse_exception_group(
excgroup: BaseExceptionGroup[BaseException],
) -> BaseException:
"""Recursively collapse any single-exception groups into that single contained
exception.
"""
exceptions = list(excgroup.exceptions)
modified = False
for i, exc in enumerate(exceptions):
if isinstance(exc, BaseExceptionGroup):
new_exc = collapse_exception_group(exc)
if new_exc is not exc:
modified = True
exceptions[i] = new_exc
if (
len(exceptions) == 1
and isinstance(excgroup, BaseExceptionGroup)
# XXX trio's loose-setting condition..
# and NONSTRICT_EXCEPTIONGROUP_NOTE in getattr(excgroup, "__notes__", ())
):
# exceptions[0].__traceback__ = concat_tb(
# excgroup.__traceback__,
# exceptions[0].__traceback__,
# )
return exceptions[0]
elif modified:
return excgroup.derive(exceptions)
else:
return excgroup
def get_collapsed_eg(
beg: BaseExceptionGroup,
) -> BaseException|None:
'''
If the input beg can collapse to a single non-eg sub-exception,
return it instead.
If the input beg can collapse to a single sub-exception which is
itself **not** an eg, return it.
'''
if len(excs := beg.exceptions) == 1:
return excs[0]
maybe_exc = collapse_exception_group(beg)
if maybe_exc is beg:
return None
return beg
return maybe_exc
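
For instance (a rough sketch), a group nesting exactly one leaf exception at every level collapses down to that leaf, while a group holding two peer errors does not:

beg = BaseExceptionGroup(
    'outer',
    [BaseExceptionGroup('inner', [ValueError('boom')])],
)
assert isinstance(get_collapsed_eg(beg), ValueError)

multi = BaseExceptionGroup('outer', [ValueError(), KeyError()])
assert get_collapsed_eg(multi) is None
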
@acm
async def collapse_eg():
async def collapse_eg(
hide_tb: bool = True,
# XXX, for ex. will always show begs containing single taskc
ignore: set[Type[BaseException]] = {
# trio.Cancelled,
},
add_notes: bool = True,
bp: bool = False,
):
'''
If `BaseExceptionGroup` raised in the body scope is
"collapse-able" (in the same way that
@ -47,12 +110,114 @@ async def collapse_eg():
only raise the lone embedded non-eg in its place.
'''
__tracebackhide__: bool = hide_tb
try:
yield
except* BaseException as beg:
if (
exc := maybe_collapse_eg(beg)
) is not beg:
raise exc
except BaseExceptionGroup as _beg:
beg = _beg
raise beg
if (
bp
and
len(beg.exceptions) > 1
):
import tractor
if tractor.current_actor(
err_on_no_runtime=False,
):
await tractor.pause(shield=True)
else:
breakpoint()
if (
(exc := get_collapsed_eg(beg))
and
type(exc) not in ignore
):
# TODO? report number of nested groups it was collapsed
# *from*?
if add_notes:
from_group_note: str = (
'( ^^^ this exc was collapsed from a group ^^^ )\n'
)
if (
from_group_note
not in
getattr(exc, "__notes__", ())
):
exc.add_note(from_group_note)
# raise exc
# ^^ this will leave the orig beg tb above with the
# "during the handling of <beg> the following.."
# So, instead do..
#
if cause := exc.__cause__:
raise exc from cause
else:
# suppress "during handling of <the beg>"
# output in tb/console.
raise exc from None
# keep original
raise # beg
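
A minimal sketch of the intended usage: stack `collapse_eg()` over a nursery so a lone child/body error surfaces as itself instead of as a single-member group:

import trio
from tractor.trionics import collapse_eg

async def main():
    try:
        async with (
            collapse_eg(),
            trio.open_nursery() as tn,
        ):
            tn.start_soon(trio.sleep_forever)
            raise ValueError('just one error')
    except ValueError as exc:
        # without `collapse_eg()` this would have surfaced as a
        # `BaseExceptionGroup` wrapping the single `ValueError`.
        print(f'collapsed: {exc!r}')

trio.run(main)
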
def is_multi_cancelled(
beg: BaseException|BaseExceptionGroup,
ignore_nested: set[Type[BaseException]] = set(),
) -> Literal[False]|BaseExceptionGroup:
'''
Predicate to determine if a `BaseExceptionGroup` only contains
some (maybe nested) set of sub-grouped exceptions (like only
`trio.Cancelled`s which get swallowed silently by default) and is
thus the result of "gracefully cancelling" a collection of
sub-tasks (or other conc primitives) and receiving a "cancelled
ACK" from each after termination.
Docs:
----
- https://docs.python.org/3/library/exceptions.html#exception-groups
- https://docs.python.org/3/library/exceptions.html#BaseExceptionGroup.subgroup
'''
if (
not ignore_nested
or
trio.Cancelled not in ignore_nested
# XXX always count-in `trio`'s native signal
):
ignore_nested.update({trio.Cancelled})
if isinstance(beg, BaseExceptionGroup):
# https://docs.python.org/3/library/exceptions.html#BaseExceptionGroup.subgroup
# |_ "The condition can be an exception type or tuple of
# exception types, in which case each exception is checked
# for a match using the same check that is used in an
# except clause. The condition can also be a callable
# (other than a type object) that accepts an exception as
# its single argument and returns true for the exceptions
# that should be in the subgroup."
matched_exc: BaseExceptionGroup|None = beg.subgroup(
tuple(ignore_nested),
# ??TODO, complain about why not allowed to use
# named arg style calling???
# XD .. wtf?
# condition=tuple(ignore_nested),
)
if matched_exc is not None:
return matched_exc
# NOTE, IFF no excs types match (throughout the error-tree)
# -> return `False`, OW return the matched sub-eg.
#
# IOW, for the inverse of ^ for the purpose of
# maybe-enter-REPL--logic: "only debug when the err-tree contains
# at least one exc-type NOT in `ignore_nested`" ; i.e. the case where
# we fallthrough and return `False` here.
return False
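
A quick sketch of the predicate's contract (`GracefulExit` is a made-up stand-in type, used since `trio.Cancelled` can't be constructed directly): if any leaf in the error-tree matches the ignore-set the matched subgroup is returned (truthy), otherwise the result is `False`:

from tractor.trionics import is_multi_cancelled

class GracefulExit(BaseException):
    ...

beg = BaseExceptionGroup('acked', [GracefulExit(), GracefulExit()])
assert is_multi_cancelled(beg, ignore_nested={GracefulExit})

crash = BaseExceptionGroup('crashed', [ValueError('boom')])
assert is_multi_cancelled(crash) is False
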

View File

@ -22,7 +22,7 @@ https://docs.rs/tokio/1.11.0/tokio/sync/broadcast/index.html
from __future__ import annotations
from abc import abstractmethod
from collections import deque
from contextlib import asynccontextmanager
from contextlib import asynccontextmanager as acm
from functools import partial
from operator import ne
from typing import (
@ -398,7 +398,7 @@ class BroadcastReceiver(ReceiveChannel):
return await self._receive_from_underlying(key, state)
@asynccontextmanager
@acm
async def subscribe(
self,
raise_on_lag: bool = True,

View File

@ -23,7 +23,6 @@ from contextlib import (
asynccontextmanager as acm,
)
import inspect
from types import ModuleType
from typing import (
Any,
AsyncContextManager,
@ -34,15 +33,17 @@ from typing import (
Optional,
Sequence,
TypeVar,
TYPE_CHECKING,
)
import trio
from tractor._state import current_actor
from tractor.log import get_logger
from ._tn import maybe_open_nursery
# from ._beg import collapse_eg
# from ._taskc import (
# maybe_raise_from_masking_exc,
# )
if TYPE_CHECKING:
from tractor import ActorNursery
log = get_logger(__name__)
@ -51,29 +52,6 @@ log = get_logger(__name__)
T = TypeVar("T")
@acm
async def maybe_open_nursery(
nursery: trio.Nursery|ActorNursery|None = None,
shield: bool = False,
lib: ModuleType = trio,
**kwargs, # proxy thru
) -> AsyncGenerator[trio.Nursery, Any]:
'''
Create a new nursery if None provided.
Blocks on exit as expected if no input nursery is provided.
'''
if nursery is not None:
yield nursery
else:
async with lib.open_nursery(**kwargs) as nursery:
nursery.cancel_scope.shield = shield
yield nursery
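
Since this helper is only being re-homed to `._tn` (see the `__init__.py` hunk above), its usage is unchanged; a rough sketch of both call shapes:

import trio
from tractor.trionics import maybe_open_nursery

async def do_work(tn: trio.Nursery|None = None):
    async with maybe_open_nursery(tn) as tn:
        tn.start_soon(trio.sleep, 0.1)

async def main():
    await do_work()  # self-managed nursery, blocks on exit
    async with trio.open_nursery() as tn:
        await do_work(tn)  # caller-provided nursery

trio.run(main)
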
async def _enter_and_wait(
mngr: AsyncContextManager[T],
unwrapped: dict[int, T],
@ -103,6 +81,9 @@ async def _enter_and_wait(
async def gather_contexts(
mngrs: Sequence[AsyncContextManager[T]],
# caller can provide their own scope
tn: trio.Nursery|None = None,
) -> AsyncGenerator[
tuple[
T | None,
@ -111,17 +92,19 @@ async def gather_contexts(
None,
]:
'''
Concurrently enter a sequence of async context managers (acms),
each from a separate `trio` task and deliver the unwrapped
`yield`-ed values in the same order once all managers have entered.
Concurrently enter a sequence of async context managers (`acm`s),
each scheduled in a separate `trio.Task`, and deliver their
unwrapped `yield`-ed values in the same order once every task's
`@acm` has entered.
On exit, all acms are subsequently and concurrently exited.
On exit, all `acm`s are subsequently and concurrently exited with
**no order guarantees**.
This function is somewhat similar to a batch of non-blocking
calls to `contextlib.AsyncExitStack.enter_async_context()`
(inside a loop) *in combo with* an `asyncio.gather()` to get the
`.__aenter__()`-ed values, except the managers are both
concurrently entered and exited and *cancellation just works*(R).
concurrently entered and exited and *cancellation-just-works*.
'''
seed: int = id(mngrs)
@ -141,37 +124,47 @@ async def gather_contexts(
if not mngrs:
raise ValueError(
'`.trionics.gather_contexts()` input mngrs is empty?\n'
'\n'
'Did you try to use inline generator syntax?\n'
'Use a non-lazy iterator or sequence type instead!'
'Check that list({mngrs}) works!\n'
# 'or sequence-type instead!\n'
# 'Use a non-lazy iterator or sequence-type instead!\n'
)
async with trio.open_nursery(
strict_exception_groups=False,
# ^XXX^ TODO? soo roll our own then ??
# -> since we kinda want the "if only one `.exception` then
# just raise that" interface?
) as tn:
for mngr in mngrs:
tn.start_soon(
_enter_and_wait,
mngr,
unwrapped,
all_entered,
parent_exit,
seed,
)
try:
async with (
#
# ?TODO, does including these (eg-collapsing,
# taskc-unmasking) improve tb noise-reduction/legibility?
#
# collapse_eg(),
maybe_open_nursery(
nursery=tn,
) as tn,
# maybe_raise_from_masking_exc(),
):
for mngr in mngrs:
tn.start_soon(
_enter_and_wait,
mngr,
unwrapped,
all_entered,
parent_exit,
seed,
)
# deliver control once all managers have started up
await all_entered.wait()
try:
# deliver control to caller once all ctx-managers have
# started (yielded back to us).
await all_entered.wait()
yield tuple(unwrapped.values())
finally:
# NOTE: this is ABSOLUTELY REQUIRED to avoid
# the following wacky bug:
# <tractorbugurlhere>
parent_exit.set()
finally:
# XXX NOTE: this is ABSOLUTELY REQUIRED to avoid
# the following wacky bug:
# <tractorbugurlhere>
parent_exit.set()
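
A rough usage sketch (resource names made up): materialize the acms into a real sequence first (no lazy generator syntax!), then receive the yielded values in matching order:

from contextlib import asynccontextmanager as acm
import trio
from tractor.trionics import gather_contexts

@acm
async def open_resource(name: str):
    yield f'{name}-handle'

async def main():
    mngrs = [open_resource(n) for n in ('db', 'cache', 'bus')]
    async with gather_contexts(mngrs) as handles:
        assert handles == ('db-handle', 'cache-handle', 'bus-handle')

trio.run(main)
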
# Per actor task caching helpers.
# Further potential examples of interest:
@ -183,7 +176,7 @@ class _Cache:
a kept-alive-while-in-use async resource.
'''
service_n: Optional[trio.Nursery] = None
service_tn: Optional[trio.Nursery] = None
locks: dict[Hashable, trio.Lock] = {}
users: int = 0
values: dict[Any, Any] = {}
@ -224,6 +217,9 @@ async def maybe_open_context(
kwargs: dict = {},
key: Hashable | Callable[..., Hashable] = None,
# caller can provide their own scope
tn: trio.Nursery|None = None,
) -> AsyncIterator[tuple[bool, T]]:
'''
Maybe open an async-context-manager (acm) if there is not already
@ -256,40 +252,94 @@ async def maybe_open_context(
# have it not be closed until all consumers have exited (which is
# currently difficult to implement any other way besides using our
# pre-allocated runtime instance..)
service_n: trio.Nursery = current_actor()._service_n
if tn:
# TODO, assert tn is eventual parent of this task!
task: trio.Task = trio.lowlevel.current_task()
task_tn: trio.Nursery = task.parent_nursery
if not tn._cancel_status.encloses(
task_tn._cancel_status
):
raise RuntimeError(
f'Mis-nesting of task under provided {tn} !?\n'
f'Current task is NOT a child(-ish)!!\n'
f'\n'
f'task: {task}\n'
f'task_tn: {task_tn}\n'
)
service_tn = tn
else:
service_tn: trio.Nursery = current_actor()._service_tn
# TODO: is there any way to allocate
# a 'stays-open-till-last-task-finished' nursery?
# service_n: trio.Nursery
# async with maybe_open_nursery(_Cache.service_n) as service_n:
# _Cache.service_n = service_n
# service_tn: trio.Nursery
# async with maybe_open_nursery(_Cache.service_tn) as service_tn:
# _Cache.service_tn = service_tn
cache_miss_ke: KeyError|None = None
maybe_taskc: trio.Cancelled|None = None
try:
# **critical section** that should prevent other tasks from
# checking the _Cache until complete otherwise the scheduler
# may switch and by accident we create more than one resource.
yielded = _Cache.values[ctx_key]
except KeyError:
log.debug(f'Allocating new {acm_func} for {ctx_key}')
mngr = acm_func(**kwargs)
resources = _Cache.resources
assert not resources.get(ctx_key), f'Resource exists? {ctx_key}'
resources[ctx_key] = (service_n, trio.Event())
except KeyError as _ke:
# XXX, stay mutexed up to cache-miss yield
try:
cache_miss_ke = _ke
log.debug(
f'Allocating new @acm-func entry\n'
f'ctx_key={ctx_key}\n'
f'acm_func={acm_func}\n'
)
mngr = acm_func(**kwargs)
resources = _Cache.resources
assert not resources.get(ctx_key), f'Resource exists? {ctx_key}'
resources[ctx_key] = (service_tn, trio.Event())
yielded: Any = await service_tn.start(
_Cache.run_ctx,
mngr,
ctx_key,
)
_Cache.users += 1
finally:
# XXX, since this runs from an `except` it's a checkpoint
# which can be `trio.Cancelled`-masked.
#
# NOTE, in that case the mutex is never released by the
# (first and) caching task and **we can't** simply shield
# bc that will inf-block on the `await
# no_more_users.wait()`.
#
# SO just always unlock!
lock.release()
# sync up to the mngr's yielded value
yielded = await service_n.start(
_Cache.run_ctx,
mngr,
ctx_key,
)
_Cache.users += 1
lock.release()
yield False, yielded
try:
yield (
False, # cache_hit = "no"
yielded,
)
except trio.Cancelled as taskc:
maybe_taskc = taskc
log.cancel(
f'Cancelled from cache-miss entry\n'
f'\n'
f'ctx_key: {ctx_key!r}\n'
f'mngr: {mngr!r}\n'
)
# XXX, always unset ke from cancelled context
# since we never consider it a masked exc case!
# - bc this can be called directly by `._rpc._invoke()`?
#
if maybe_taskc.__context__ is cache_miss_ke:
maybe_taskc.__context__ = None
raise taskc
else:
_Cache.users += 1
log.runtime(
log.debug(
f'Re-using cached resource for user {_Cache.users}\n\n'
f'{ctx_key!r} -> {type(yielded)}\n'
@ -299,9 +349,19 @@ async def maybe_open_context(
# f'{ctx_key!r} -> {yielded!r}\n'
)
lock.release()
yield True, yielded
yield (
True, # cache_hit = "yes"
yielded,
)
finally:
if lock.locked():
stats: trio.LockStatistics = lock.statistics()
log.error(
f'Lock left locked by last owner !?\n'
f'{stats}\n'
)
_Cache.users -= 1
if yielded is not None:
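
As a usage sketch (the acm and kwargs are made up, and a root-actor runtime is assumed so the per-actor service nursery exists): concurrent callers passing the same `acm_func`/`kwargs` share one entered instance, with the first caller getting `cache_hit=False` and later ones `True`:

from contextlib import asynccontextmanager as acm
import trio
import tractor
from tractor.trionics import maybe_open_context

@acm
async def open_client(host: str):
    yield {'host': host}  # stand-in for some real connection

async def user(task_status=trio.TASK_STATUS_IGNORED):
    async with maybe_open_context(
        acm_func=open_client,
        kwargs={'host': 'localhost'},
    ) as (cache_hit, client):
        task_status.started(cache_hit)
        await trio.sleep(0.1)

async def main():
    async with tractor.open_root_actor():
        async with trio.open_nursery() as tn:
            first = await tn.start(user)
            second = await tn.start(user)
            assert (first, second) == (False, True)

trio.run(main)
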

Some files were not shown because too many files have changed in this diff.