Compare commits

21 Commits

Author SHA1 Message Date
Tyler Goodlet 0cbf02bf2e Adjust `test_trio_prestarted_task_bubbles()` suite to expect non-eg raises 2025-08-16 17:20:36 -04:00
Tyler Goodlet 4a0b78d447 Styling tweaks to quadruple streaming test fn 2025-08-16 17:20:36 -04:00
Tyler Goodlet 718d0887c9 Resolve `test_cancel_while_childs_child_in_sync_sleep`
Was failing due to the `.fail_after()` timeout being *too short* and
somehow the new interplay of that with strict-exception groups resulting
in the `TooSlowError` never raising but instead an eg with the embedded
`AssertionError`?? I still don't really get it honestly..

I've written up lengthy notes around the different `delay` settings that
can be used to see the diff outcomes, the failing case being the one
i still don't really grok and think is justification for `trio` to
bubble inner `Cancelled`s differently possibly?

For now I've included the original failing case as an `xfail`
parametrization, which will hopefully drive a follow-up low-level
`trio` test in `test_trioisms`!
2025-08-16 17:20:36 -04:00
Tyler Goodlet 40a40f613a Fix cluster suite, chng to new `gather_contexts()`
Namely `test_empty_mngrs_input_raises()` was failing due to
lazy-iterator use as input to `mngrs` which i guess i added support for
a while back (by it doing a `list(mngrs)` internally)? So just change it
to `gather_contexts(mngrs=())` and also tweak the `trio.fail_after(3)`
since it appears that the prior 1sec was causing
too-fast-of-a-cancellation (before the cluster fully spawned) and thus
the expected `ValueError` never to show..

Also, mask the `tractor.trionics.collapse_eg()` usage (again?) in
`open_actor_cluster()` since it seems unnecessary.
2025-08-16 17:20:36 -04:00
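One plausible reading of the lazy-iterator problem above, sketched without `tractor`: a generator is always truthy, so an emptiness check of the form `if not mngrs` never fires for it, while an empty tuple is rejected immediately. `check_mngrs()` and `rejects()` are hypothetical stand-ins for illustration; the real validation inside `gather_contexts()` may differ.

```python
from typing import Iterable


def check_mngrs(mngrs: Iterable) -> list:
    # Hypothetical validation helper (NOT tractor's implementation):
    # reject an empty input up front, then materialize it.
    if not mngrs:
        raise ValueError('`mngrs` must be a non-empty sequence')
    return list(mngrs)


def rejects(mngrs: Iterable) -> bool:
    # True when the emptiness check raised, False when it was bypassed.
    try:
        check_mngrs(mngrs)
    except ValueError:
        return True
    return False


# An empty tuple is falsy and gets rejected; an empty generator is
# truthy, slips past the check, and only turns out empty afterwards.
print(rejects(()), rejects(m for m in ()))
```

Passing a concrete empty container such as `()` makes the expected `ValueError` deterministic, which matches the change to `gather_contexts(mngrs=())` described in the commit.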
Tyler Goodlet c692b30e7f WIP tinkering with strict-eg-tns and cluster API
Seems that the way the actor-nursery interacts with the
`.trionics.gather_contexts()` API on cancellation makes our
`.trionics.collapse_eg()` not work as intended?

I need to dig into how `ActorNursery.cancel()` and `.__aexit__()` might
be causing this discrepancy..

Consider this a commit-of-my-index type save for rn.
2025-08-16 17:20:36 -04:00
Tyler Goodlet c91bf6dcc2 Bit of multi-line styling / name tweaks in cancellation suites 2025-08-16 17:20:36 -04:00
Tyler Goodlet 5afa9407c9 Mk temp collapser bp work outside runtime as well.. 2025-08-16 17:20:36 -04:00
Tyler Goodlet 9b1dd7b279 Add temp breakpoint support to `collapse_eg()` 2025-08-16 17:20:36 -04:00
Tyler Goodlet 910257bb46 Suppress beg tbs from `collapse_eg()`
It was originally this way; I forgot to flip it back when discarding the
`except*` handler impl..

Specially handle the `exc.__cause__` case where we raise from any
detected underlying cause and OW `from None` to suppress the eg's tb.
2025-08-16 17:20:36 -04:00
Tyler Goodlet ca37d8ed91 Rework `collapse_eg()` to NOT use `except*`..
Since it turns out the semantics are basically inverse of normal
`except` (particularly for re-raising) which is hard to get right, and
bc it's a lot easier to just delegate to what `trio` already has behind
the `strict_exception_groups=False` setting.

I added a rant here which will get removed shortly likely, but i think
going forward recommending against use of `except*` is prudent for
anything low level enough in the runtime (like trying to filter begs).

Dirty deats,
- copy `trio._core._run.collapse_exception_group()` to here with only
  a slight mod to remove the notes check and tb concatting for the
  collapse case.
- rename `maybe_collapse_eg()` -> `get_collapsed_eg()` and delegate it
  directly to the former `trio` fn; return `None` when it returns the
  same beg without collapse.
- simplify our own `collapse_eg()` to either raise the collapsed `exc`
  or original `beg`.
2025-08-16 17:20:36 -04:00
Tyler Goodlet 072cdd0c66 Couple more `._root` logging tweaks.. 2025-08-16 17:20:36 -04:00
Tyler Goodlet 30302b051d Use collapser around `root_tn` in `async_main()`
Replacing yet another loose-eg-flag. Also toss in a todo to maybe use
the unmasker around the `open_root_actor()` body.
2025-08-16 17:20:36 -04:00
Tyler Goodlet bb04e55d5f Facepalm, fix `raise from` in `collapse_eg()`
I dunno what exactly I was thinking but we definitely don't want to
**ever** raise from the original exc-group, instead always raise from
any original `.__cause__` to be consistent with the embedded src-error's
context.

Also, adjust `maybe_collapse_eg()` to return `False` in the non-single
`.exceptions` case, again don't know what I was trying to do but this
simplifies caller logic and the prior return-semantic had no real
value..

This fixes some final usage in the runtime (namely top level nursery
usage in `._root`/`._runtime`) which was previously causing test suite
failures prior to this fix.
2025-08-16 17:20:36 -04:00
Tyler Goodlet 226d06dbfa Just import `._runtime` ns in `._root`; be a bit more explicit 2025-08-16 17:20:36 -04:00
Tyler Goodlet 1b502736d6 Use collapse in `._root.open_root_actor()` too
Seems to add one more cancellation suite failure as well as now cause
the discovery test to error instead of fail?
2025-08-16 17:20:36 -04:00
Tyler Goodlet 51992ef546 Use collapser around root tn in `.async_main()`
Seems to cause the following test suites to fail however..

- 'test_advanced_faults.py::test_ipc_channel_break_during_stream'
- 'test_advanced_faults.py::test_ipc_channel_break_during_stream'
- 'test_clustering.py::test_empty_mngrs_input_raises'

Also tweak some ctxc request logging content.
2025-08-16 17:20:36 -04:00
Tyler Goodlet 6da470918a Drop msging-err patt from `subactor_breakpoint` ex
Since the `bdb` module was added to the namespace lookup set in
`._exceptions.get_err_type()` we can now relay a RAE-boxed
`bdb.BdbQuit`.
2025-08-16 17:20:36 -04:00
Tyler Goodlet 93d97c3ff9 Switch to strict-eg nurseries almost everywhere
That is just throughout the core library, not the tests yet. Again, we
simply change over to using our (nearly equivalent?)
`.trionics.collapse_eg()` in place of the already deprecated
`strict_exception_groups=False` flag in the following internals,
- the conc-fan-out tn use in `._discovery.find_actor()`.
- `._portal.open_portal()`'s internal tn used to spawn a bg rpc-msg-loop
  task.
- the daemon and "run-in-actor" layered tn pair allocated in
  `._supervise._open_and_supervise_one_cancels_all_nursery()`.

The remaining loose-eg usage in `._root` and `._runtime` seems to be
necessary to keep the test suite green?? For the moment these are left
out.
2025-08-16 17:20:36 -04:00
Tyler Goodlet abaea4de8e Use collapser in rent side of `Context` 2025-08-16 17:20:36 -04:00
Tyler Goodlet 4b3ba35cd6 Add some tooling params to `collapse_eg()` 2025-08-16 17:20:36 -04:00
Tyler Goodlet 70a508e9d7 Add a "real-world" example of cancelled-masking with `.aclose()` 2025-08-16 17:20:36 -04:00
11 changed files with 187 additions and 46 deletions

View File

@@ -16,7 +16,6 @@ from tractor import (
     ContextCancelled,
     MsgStream,
     _testing,
-    trionics,
 )
 import trio
 import pytest
@@ -63,8 +62,9 @@ async def recv_and_spawn_net_killers(
     await ctx.started()
     async with (
         ctx.open_stream() as stream,
-        trionics.collapse_eg(),
-        trio.open_nursery() as tn,
+        trio.open_nursery(
+            strict_exception_groups=False,
+        ) as tn,
     ):
         async for i in stream:
             print(f'child echoing {i}')

View File

@@ -23,8 +23,9 @@ async def main():
             modules=[__name__]
         ) as portal_map,
-        tractor.trionics.collapse_eg(),
-        trio.open_nursery() as tn,
+        trio.open_nursery(
+            strict_exception_groups=False,
+        ) as tn,
     ):
         for (name, portal) in portal_map.items():

View File

@@ -0,0 +1,145 @@
+from contextlib import (
+    contextmanager as cm,
+    # TODO, any diff in async case(s)??
+    # asynccontextmanager as acm,
+)
+from functools import partial
+import tractor
+import trio
+log = tractor.log.get_logger(__name__)
+tractor.log.get_console_log('info')
+@cm
+def teardown_on_exc(
+    raise_from_handler: bool = False,
+):
+    '''
+    You could also have a teardown handler which catches any exc and
+    does some required teardown. In this case the problem is
+    compounded UNLESS you ensure the handler's scope is OUTSIDE the
+    `ux.aclose()`.. that is in the caller's enclosing scope.
+    '''
+    try:
+        yield
+    except BaseException as _berr:
+        berr = _berr
+        log.exception(
+            f'Handling termination teardown in child due to,\n'
+            f'{berr!r}\n'
+        )
+        if raise_from_handler:
+            # XXX teardown ops XXX
+            # on termination these steps say need to be run to
+            # ensure wider system consistency (like the state of
+            # remote connections/services).
+            #
+            # HOWEVER, any bug in this teardown code is also
+            # masked by the `tx.aclose()`!
+            # this is also true if `_tn.cancel_scope` is
+            # `.cancel_called` by the parent in a graceful
+            # request case..
+            # simulate a bug in teardown handler.
+            raise RuntimeError(
+                'woopsie teardown bug!'
+            )
+        raise  # no teardown bug.
+async def finite_stream_to_rent(
+    tx: trio.abc.SendChannel,
+    child_errors_mid_stream: bool,
+    task_status: trio.TaskStatus[
+        trio.CancelScope,
+    ] = trio.TASK_STATUS_IGNORED,
+):
+    async with (
+        # XXX without this unmasker the mid-streaming RTE is never
+        # reported since it is masked by the `tx.aclose()`
+        # call which in turn raises `Cancelled`!
+        #
+        # NOTE, this is WITHOUT doing any exception handling
+        # inside the child task!
+        #
+        # TODO, uncomment next LoC to see the suppressed beg[RTE]!
+        # tractor.trionics.maybe_raise_from_masking_exc(),
+        tx as tx,  # .aclose() is the guilty masker chkpt!
+        trio.open_nursery() as _tn,
+    ):
+        # pass our scope back to parent for supervision
+        # control.
+        task_status.started(_tn.cancel_scope)
+        with teardown_on_exc(
+            raise_from_handler=not child_errors_mid_stream,
+        ):
+            for i in range(100):
+                log.info(
+                    f'Child tx {i!r}\n'
+                )
+                if (
+                    child_errors_mid_stream
+                    and
+                    i == 66
+                ):
+                    # oh wait but WOOPS there's a bug
+                    # in that teardown code!?
+                    raise RuntimeError(
+                        'woopsie, a mid-streaming bug!?'
+                    )
+                await tx.send(i)
+async def main(
+    # TODO! toggle this for the 2 cases!
+    # 1. child errors mid-stream while parent is also requesting
+    #    (graceful) cancel of that child streamer.
+    #
+    # 2. child contains a teardown handler which contains a
+    #    bug and raises.
+    #
+    child_errors_mid_stream: bool,
+):
+    tx, rx = trio.open_memory_channel(1)
+    async with (
+        trio.open_nursery() as tn,
+        rx as rx,
+    ):
+        _child_cs = await tn.start(
+            partial(
+                finite_stream_to_rent,
+                child_errors_mid_stream=child_errors_mid_stream,
+                tx=tx,
+            )
+        )
+        async for msg in rx:
+            log.info(
+                f'Rent rx {msg!r}\n'
+            )
+            # simulate some external cancellation
+            # request **JUST BEFORE** the child errors.
+            if msg == 65:
+                log.cancel(
+                    f'Cancelling parent on,\n'
+                    f'msg={msg}\n'
+                    f'\n'
+                    f'Simulates OOB cancel request!\n'
+                )
+                tn.cancel_scope.cancel()
+if __name__ == '__main__':
+    for case in [True, False]:
+        trio.run(main, case)

View File

@@ -313,8 +313,9 @@ async def inf_streamer(
         # `trio.EndOfChannel` doesn't propagate directly to the above
         # .open_stream() parent, resulting in it also raising instead
         # of gracefully absorbing as normal.. so how to handle?
-        tractor.trionics.collapse_eg(),
-        trio.open_nursery() as tn,
+        trio.open_nursery(
+            strict_exception_groups=False,
+        ) as tn,
     ):
         async def close_stream_on_sentinel():
             async for msg in stream:

View File

@@ -532,15 +532,10 @@ def test_cancel_via_SIGINT_other_task(
     async def main():
         # should never timeout since SIGINT should cancel the current program
         with trio.fail_after(timeout):
-            async with (
-                # XXX ?TODO? why no work!?
-                # tractor.trionics.collapse_eg(),
-                trio.open_nursery(
-                    strict_exception_groups=False,
-                ) as tn,
-            ):
-                await tn.start(spawn_and_sleep_forever)
+            async with trio.open_nursery(
+                strict_exception_groups=False,
+            ) as n:
+                await n.start(spawn_and_sleep_forever)
                 if 'mp' in spawn_backend:
                     time.sleep(0.1)
                 os.kill(pid, signal.SIGINT)

View File

@@ -117,10 +117,9 @@ async def open_actor_local_nursery(
     ctx: tractor.Context,
 ):
     global _nursery
-    async with (
-        tractor.trionics.collapse_eg(),
-        trio.open_nursery() as tn
-    ):
+    async with trio.open_nursery(
+        strict_exception_groups=False,
+    ) as tn:
         _nursery = tn
         await ctx.started()
         await trio.sleep(10)

View File

@@ -11,7 +11,6 @@ import psutil
 import pytest
 import subprocess
 import tractor
-from tractor.trionics import collapse_eg
 from tractor._testing import tractor_test
 import trio
@@ -194,10 +193,10 @@ async def spawn_and_check_registry(
     try:
         async with tractor.open_nursery() as an:
-            async with (
-                collapse_eg(),
-                trio.open_nursery() as trion,
-            ):
+            async with trio.open_nursery(
+                strict_exception_groups=False,
+            ) as trion:
                 portals = {}
                 for i in range(3):
                     name = f'a{i}'
@@ -339,12 +338,11 @@ async def close_chans_before_nursery(
             async with portal2.open_stream_from(
                 stream_forever
             ) as agen2:
-                async with (
-                    collapse_eg(),
-                    trio.open_nursery() as tn,
-                ):
-                    tn.start_soon(streamer, agen1)
-                    tn.start_soon(cancel, use_signal, .5)
+                async with trio.open_nursery(
+                    strict_exception_groups=False,
+                ) as n:
+                    n.start_soon(streamer, agen1)
+                    n.start_soon(cancel, use_signal, .5)
                     try:
                         await streamer(agen2)
                     finally:

View File

@@ -234,8 +234,10 @@ async def trio_ctx(
     with trio.fail_after(1 + delay):
         try:
             async with (
-                tractor.trionics.collapse_eg(),
-                trio.open_nursery() as tn,
+                trio.open_nursery(
+                    # TODO, for new `trio` / py3.13
+                    # strict_exception_groups=False,
+                ) as tn,
                 tractor.to_asyncio.open_channel_from(
                     sleep_and_err,
                 ) as (first, chan),

View File

@@ -8,7 +8,6 @@ from contextlib import (
 )
 import pytest
-from tractor.trionics import collapse_eg
 import trio
 from trio import TaskStatus
@@ -65,8 +64,9 @@ def test_stashed_child_nursery(use_start_soon):
     async def main():
         async with (
-            collapse_eg(),
-            trio.open_nursery() as pn,
+            trio.open_nursery(
+                strict_exception_groups=False,
+            ) as pn,
         ):
             cn = await pn.start(mk_child_nursery)
             assert cn
@@ -197,8 +197,10 @@ def test_gatherctxs_with_memchan_breaks_multicancelled(
         async with (
             # XXX should ensure ONLY the KBI
             # is relayed upward
-            collapse_eg(),
-            trio.open_nursery(),  # as tn,
+            trionics.collapse_eg(),
+            trio.open_nursery(
+                # strict_exception_groups=False,
+            ),  # as tn,
             trionics.gather_contexts([
                 open_memchan(),

View File

@@ -1760,7 +1760,9 @@ async def async_main(
             f' {pformat(ipc_server._peers)}'
         )
         log.runtime(teardown_report)
-        await ipc_server.wait_for_no_more_peers()
+        await ipc_server.wait_for_no_more_peers(
+            shield=True,
+        )
         teardown_report += (
             '-]> all peer channels are complete.\n'

View File

@@ -814,14 +814,10 @@ class Server(Struct):
     async def wait_for_no_more_peers(
         self,
-        # XXX, should this even be allowed?
-        # -> i've seen it cause hangs on teardown
-        # in `test_resource_cache.py`
-        # _shield: bool = False,
+        shield: bool = False,
     ) -> None:
-        await self._no_more_peers.wait()
-        # with trio.CancelScope(shield=_shield):
-        #     await self._no_more_peers.wait()
+        with trio.CancelScope(shield=shield):
+            await self._no_more_peers.wait()
     async def wait_for_peer(
         self,