Compare commits

..

10 Commits

Author SHA1 Message Date
Gud Boi 30e15925ba Add `start_or_cancel()` to `trionics._taskc`
Wrapper around `trio.Nursery.start()` that DOESN'T mask
out-of-band cancellation as a lossy startup failure.
Picks the right re-raise: ambient `Cancelled` when
present, the genuine startup-protocol `RuntimeError`
otherwise.

The problem,
- `trio.Nursery.start()` raises a generic
  `RuntimeError("child exited without calling
  task_status.started()")` whenever the started task
  exits BEFORE calling `task_status.started()` —
  INCLUDING the common case where the child was
  cancelled out-of-band by an *ancestor* cancel-scope
  erroring/cancelling.
- In that case the original `trio.Cancelled` is
  swallowed and the caller is left w/ an opaque,
  root-cause-detached `RuntimeError`.

The fix,
- Catch the "...started" RTE.
- `await trio.lowlevel.checkpoint_if_cancelled()` —
  re-raises the in-flight `Cancelled` IFF we're under
  effective cancellation (ancestor-inclusive), carrying
  trio's auto-generated reason which points at the true
  root exc.
- If we're NOT cancelled the `checkpoint_if_cancelled()`
  is a cheap no-op and we fall through to re-raise the
  genuine startup-protocol RTE.

Re-export from `tractor.trionics` so callers don't have
to reach into `_taskc`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-29 19:25:33 -04:00
Gud Boi 5b3c2e3762 Add `logspec` leaf-mod Route B follow-up doc
Follow-up note documenting why the deeper "Route B" fix
for `LogSpec`/`apply_logspec()` true per-leaf-MODULE level
control was NOT taken — in favor of the smaller
sub-PACKAGE fix that shipped in 9c36363b.

Doc covers,
- Status: what 9c36363b already gives (per-sub-pkg
  control at any nesting depth, `devx.debug` ≠ `devx`)
  vs. what remains unaddressed (per-leaf-mod levels,
  top-level lib mods like `tractor.to_asyncio` on the
  root logger).
- "Route B" sketch: make logger *identity* the full
  dotted module path; mv the cosmetic leaf-trim out of
  logger-naming into the *formatter's* `{name}`
  rendering.
- 6 breaking-change costs: every logger name changes,
  formatter rewrite, propagation/double-emit surface
  grows, level-inheritance semantics shift,
  `modden`/`piker` contract churn, `get_logger()`
  refactor risk.
- Migration plan if pursued: extract a pure
  `_mk_logger_name()` helper w/ an exhaustive name-shape
  test matrix, swap `get_logger()` to use it for
  identity, swap formatter to use the display string,
  golden-diff rendered headers, coordinate w/
  downstreams.
- "Route A" alternative: a `logging.Filter` keyed on
  `record.module`/`pathname` for per-leaf control w/o
  name churn — lower risk, narrower power.
- Recommendation: defer Route B; prefer Route A if
  per-leaf is needed soon; the shipped sub-PKG fix
  covers the common ask.

Lives under `ai/tooling-todos/` since it's a deferred-
work decision record, not a triage/conc-anal doc.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-29 19:25:08 -04:00
Gud Boi 9c36363b01 Fix `get_logger()` collapse of nested sub-pkgs
Strip the trailing `pkg_path` token ONLY when it duplicates the
caller's leaf-*module* name (which the console header already
shows via `{filename}`), instead of blindly dropping the last
token. This keeps genuine, possibly-*nested* sub-PACKAGE parts
addressable as their own sub-loggers.

- detect a true leaf-mod by comparing the caller's `__name__`
  vs `__package__` (a pkg `__init__` has them equal -> its
  trailing token is a real sub-pkg, NOT a leaf to strip).
- `name='devx.debug'` now -> `tractor.devx.debug`, DISTINCT
  from a bare `devx` -> `tractor.devx`; the old unconditional
  `pkg_path = subpkg_path` collapsed both to `tractor.devx` and
  silently broke per-sub-pkg level control via the logging-spec.
- `get_logger(__name__)` leaf-strip still works (cosmetic, bc
  the leaf-mod is in the `{filename}` header field).

Also,
- update the `LogSpec` caveat: sub-PACKAGE granularity now
  addressable at ANY depth; leaf *modules* intentionally aren't
  (they're the `{filename}`); top-level mods (eg. `to_asyncio`)
  still emit on the root logger.
- adjust `test_root_pkg_not_duplicated_in_logger_name` to the
  new literal explicit-`name` contract (no leaf-collapse).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-29 19:17:55 -04:00
Gud Boi 19a77708ba Lift `--ll`/`--tl` to plugin + `LogSpec` API
Two coupled changes that let downstream projects (eg. `modden`) inherit
the test-harness loglevel plumbing for free via
`tractor._testing.pytest`:

Plugin lift (`tests/conftest.py` → `_testing/pytest.py`),
- mv `pytest_addoption(--ll)`, the `loglevel` autouse
  fixture, and `test_log` fixture out of the test-suite-
  local conftest into the reusable plugin.
- add `--tl`/`--tractor-loglevel` as a DISTINCT flag from
  `--ll`: `--ll` is the consuming-project's OWN app
  loglevel (scoped to its pkg-hierarchy), `--tl` is the
  `tractor.*` runtime loglevel. `--tl` falls back to
  `--ll` when unset (preserves current `tractor`-suite
  behavior).
- add `testing_pkg_name` session fixture (default
  `'tractor'`) — downstream projects override to e.g.
  `'modden'` so `--ll` scopes to their own hierarchy
  instead of `tractor.*`.
- `loglevel` fixture now yields the resolved
  tractor-runtime level (passed to
  `open_root_actor(loglevel=<.>)` by `@tractor_test`)
  AND separately applies `--ll` to the
  `testing_pkg_name` hierarchy when that isn't
  `tractor`. `test_log` scopes the per-test logger to
  `testing_pkg_name`.

`tractor.log` "logging-spec" mini-DSL,
- `LogSpec = str|bool`. Accepted forms:
  - `True` → enable `pkg_name` root at `default_level`
    (fallback `'cancel'`).
  - `False` → no-op.
  - bare level eg. `'info'` → root-logger at that level.
  - `'sub:info,x:cancel'` → per-sub-logger filter-spec;
    each `<name>` is RELATIVE to `pkg_name` (must NOT
    include the pkg-token).
- `parse_logspec()` → `{sublog|None: level}` mapping.
  `None` key = root-logger. Mixed bare-level + filters
  in one spec is rejected w/ a helpful err msg; so is
  embedding the `pkg_name` token in a sub-name.
- `apply_logspec()` → `(primary_level, {name: log})`:
  parses then enables a `colorlog` stderr handler per
  named (sub)logger. Authoritative sub-logger filters
  get `propagate=False` so they don't double-emit
  through a parallel root-level handler.
- !GRANULARITY CAVEAT! sub-logger names match at
  sub-pkg granularity, not leaf-module — so `devx.debug`
  collapses to the same `tractor.devx` logger as a bare
  `devx`, and top-level lib modules (eg.
  `tractor.to_asyncio`) emit under the *root* logger
  rather than a phantom `to_asyncio` child. Documented
  inline on `LogSpec`.

Other,
- `tests/conftest.py` keeps a NOTE pointing to the
  plugin for future-debugging clarity (don't remove
  silently — the lift is the relevant signal).

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-29 17:43:55 -04:00
Gud Boi acd1cbeec4 Add `maybe_signal_aio_task()` + cause-chain guard
Factor the "deliver an exc to a running aio task" pattern out of
`translate_aio_errors()` + `open_channel_from()` into a shared
`maybe_signal_aio_task()` helper. Add a cause-chain matrix comment
+ relay-echo guard so the final-raise block can't cycle
  `trio_err.__cause__` back onto its own derivative relay.

`maybe_signal_aio_task()`,
- Delivers `exc` via `aio_task._fut_waiter.set_exception()` — NOT
  `aio_task.set_exception()` which on py3.13+ ALWAYS raises
  `RuntimeError("Task does not support set_exception")` (dead code as
  a relay mechanism).
- Returns `(delivered: bool, report: str)`. Caller uses `delivered` to
  flip `wait_on_aio_task` when delivery failed (avoids hanging on
  `_aio_task_complete.wait()`).
- `pre_captured_fut=`: required when the caller crosses a trio
  checkpoint between capturing `_fut_waiter` and invoking the helper.
  `Task._wakeup` clears `_fut_waiter = None` so re-reading
  post-checkpoint loses the ref even though the exc is still in-flight
  on the (now-`done()`) original fut.
- `cause=`: sets `exc.__cause__ = cause` so the relay carries
  a "trio_err -> caused -> relay" chain through `set_exception()`
  → `Task._wakeup` → coro raise → `wait_on_coro_final_result`
  → `signal_trio_when_done` → `task.result()`-raise.
- `allow_cancel_fallback=True`: opt-in `aio_task.cancel()` for the
  narrow case where `_fut_waiter is None` AND task is runnable (sitting
  in asyncio's ready queue, not parked on a poke-able future). NEVER
  cancels when `_fut_waiter` carries an in-flight exc — that would race
  + mask the real terminating exc.

`translate_aio_errors()`,
- Replace the two ad-hoc `_fut_waiter.set_exception()`
  / `aio_task.set_exception()` call sites w/ the helper.
- Capture `pre_cp_fut = aio_task._fut_waiter` BEFORE the post-shutdown
  `trio.lowlevel.checkpoint()` (critical: `_wakeup` clears the ref).
- New "cross-loop cause-chain matrix" comment block on the final-raise
  — tabulates every `(trio_err, aio_err, trio_to_raise)` combo into
  exactly one terminal `raise X [from Y]` or early `return`. Covers the
  sibling `signal_trio_when_done()` resolution + the relay-echo
  INVARIANT.
- New relay-echo guard: if `aio_err` is one of OUR OWN signals
  (`TrioTaskExited`/`TrioCancelled`) AND `aio_err.__cause__ is
  trio_err`, raise the bare `trio_err` instead of `trio_err from
  aio_err` (which would CYCLE the cause chain since the relay was itself
  caused-by `trio_err`).
- Drop the stale "the `task.set_exception(aio_taskc)` call MUST NOT
  EXCEPT or this WILL HANG" warning — the helper handles the failure
  path explicitly via `delivered=False` → `wait_on_aio_task = False`.
- Carry `cause=trio_err` on both the cancel-relay (`TrioCancelled`) and
  the graceful-exit relay (`TrioTaskExited`) so the aio-side traceback
  shows the real root.

`open_channel_from()`,
- Adopt the same helper; drop the dead "SHOULD NEVER GET HERE !?!?"
  + `tractor.pause(shield=True)` panic branch.
- Capture in-flight trio-side exc via `sys.exc_info()[1]` and pass as
  `cause=` — non-`None` only when the `try` body raised (graceful exit
  → None).

Other,
- Top-level import: `sys` (for `sys.exc_info()`).
- `run_as_asyncio_guest()`: add commented-out alt `out: Outcome = await
  trio_done_fute` next to the shielded version — exploratory note for
  the longstanding "why is `.shield()` needed?" TODO.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-29 16:41:37 -04:00
Gud Boi 3d9c75b6ed Drop `debug_mode` gate on stackscope SIGUSR1
SIGUSR1 task-tree dumps via `stackscope` should work in
plain (non-pdb) runs too — esp. in infected-`asyncio`
processes where the kernel-default SIGUSR1 disposition is
`Term` (proc dies on `kill -USR1` w/o an installed
handler). Ungate the install path from `_debug_mode` in
both root and sub-actor init; the `use_stackscope` rt-var
+ `TRACTOR_ENABLE_STACKSCOPE` env-var checks remain as
the actual opt-in (e.g. via `--enable-stackscope`).

Deats,
- `_root.open_root_actor`: drop the `debug_mode and ...`
  conjunction around the `enable_stack_on_sig()` call;
  now gated only on the `enable_stack_on_sig` arg itself.
- `_runtime.Actor` sub-actor init: lift the
  `use_stackscope`/`TRACTOR_ENABLE_STACKSCOPE` branch out
  of the `if rvs['_debug_mode']:` block to peer-level.
  The `use_greenback` branch stays inside `_debug_mode`
  (pdb-specific).
- Refresh inline comments on both sites to call out the
  infected-`asyncio` "default SIGUSR1 = terminate proc"
  rationale.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-28 18:35:40 -04:00
Gud Boi 1cafaecf52 Use trace CM helpers in `test_infected_asyncio`
Adopt the `_testing.trace` CM helpers in two MTF-hang-prone
tests so on-timeout we get a fresh
`ptree`/`wchan`/`py-spy` diag snapshot on disk instead of
opaque pytest timeout-kills. Same shape as bd07a95d for
`test_dynamic_pub_sub`.

Deats,
- `test_echoserver_detailed_mechanics`:
  * inner `trio.fail_after` → `fail_after_w_trace`. Adds
    `fail_after_w_trace: FailAfterWTraceFactory` fixture
    param.
  * mv per-backend `timeout` calc to top of test body (was
    interleaved w/ helper defs).
  * factor deep
    `open_nursery`/`open_context`/`open_stream` body into
    `_body()` so the wrapping `main()` stays a 2-liner —
    keeps the nested-CM block at its natural indent level
    instead of pushing it under yet another `async with`.
  * drop `with_timeout: bool` knob + `fa_main()` helper
    (knob was hard-coded `True`).
- `test_sigint_closes_lifetime_stack`:
  * outer `signal.alarm`/`try`/`finally` → single
    `afk_alarm_w_trace(10)` CM. Adds
    `afk_alarm_w_trace: AfkAlarmWTraceFactory` fixture
    param.
  * drop `_AFK_CAP_S` + `armed_alarm` vars (CM owns both).
  * explanatory comment refreshed to mention
    `AFKAlarmTimeout` + the disk-snapshot side effect.

Other,
- Drop debug `return 1e3` short-circuit from `delay()`
  fixture — snuck in as a scratch line, was clobbering the
  proper `debug_mode`-branched return.
- Top-level import: `FailAfterWTraceFactory`,
  `AfkAlarmWTraceFactory` from `tractor._testing.trace`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-18 15:22:26 -04:00
Gud Boi bb239e847f Add `acli.watch` flicker-free alias-loop
Per-terminal optimized `watch`-like xonsh alias that
runs an arbitrary callable alias in a loop inside the
alt-screen buffer with flicker-free repaint. Supersedes
the inline `acli.ptree` polling .xsh snippet (removed
from `_ptree` docstr in favor of
`acli.watch acli.ptree pytest`).

Deats,
- alt-screen entry/exit (`\033[?1049h/l`) + cursor-hide
  (`\033[?25l/h`) wrapped in try/finally so Ctrl-C always
  returns to a pristine shell.
- per-frame draw uses cursor-home (`\033[H`) + per-line
  EL (`\033[K` before each `\n`) + post-draw erase-down
  (`\033[J`) → stale tail chars from a longer prior
  frame are obvi cleared; no full-screen flash.
- SIGWINCH-aware: terminal resize sets a flag, next
  frame does a full clear (`\033[H\033[2J`) instead of
  the cheap cursor-home path.
- Ctrl-C handling: install `signal.default_int_handler`
  so `KeyboardInterrupt` lands cleanly; prior handler
  restored on exit.
- Output capture: redirect the alias's stdout to
  `StringIO` per frame so we can post-process the EL
  fix. Aliases writing directly to `sys.stdout.buffer`
  / `os.write(1)` bypass capture — EL-fix won't apply
  but loop still works.
- Alias unwrap: xonsh stores callables as either a bare
  callable OR `[fn, *preset_args]`. Both forms handled;
  subprocess-style aliases rejected w/ a friendly err
  msg.
- `argparse` w/ `-n`/`--interval` (default 0.3s); rest
  of argv forwarded as alias args.
- Reg `'acli.watch': watch` in `_TCLI_ALIASES`.

Other,
- Tn `_ptree` `args: list[str]` param.
- Mod-header `Provides:` block updated w/ `acli.watch`
  entry.
- Top-level imports: `os`, `sys`, `signal`, `time`,
  `typing.Callable`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-14 19:42:00 -04:00
Gud Boi f617c8cb73 Add `acli.ptree` poll .xsh snippet to docstr 2026-05-14 13:02:22 -04:00
Gud Boi a6d4ac3aac Filter `_find_tractor_strays` by ppid disposition
Only flag `tractor._child` procs as cross-test ghosts of
THIS run if `ppid==1` (init-adopted real leak) or `ppid`
is in the walk's `seen` set (descendant we missed via
race).

Previously, procs whose `ppid` points to some OTHER live non-`pytest`
(in the use of `acli.ptree pytest`) process belong to a different
tractor app (`piker`, another `pytest` shell, a long-running tractor
daemon) and were being falsely flagged as cross-test ghosts.

Deats,
- post-cmdline-match check via `_ppid_from_proc(pid)`,
  short-circuit on `None` (proc died in-flight).
- expand module docstring to spell out the ownership
  filter rule + its rationale.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
2026-05-14 12:39:33 -04:00
13 changed files with 1177 additions and 258 deletions

View File

@ -0,0 +1,159 @@
# Logging-spec leaf-module granularity — "Route B" (decouple
# logger-*identity* from console-*display*)
Follow-up notes recording the breaking-changes / costs of the
deeper fix that would give the `tractor.log` logging-spec (see
`LogSpec`/`apply_logspec()`) true **per-leaf-MODULE** level
control — deliberately *not* taken (for now) in favour of the
smaller sub-PACKAGE fix already landed.
## Status / what already shipped
The cheap, contained fix is **done**: `get_logger()`'s "strip
#2" (`log.py`, the `pkg_path = subpkg_path` collapse) no longer
eats a real sub-package component. It now strips the trailing
token *only* when it duplicates the caller's leaf-*module*
filename (which the header already shows via `{filename}`).
Result:
- `devx.debug` resolves to `tractor.devx.debug`, **distinct**
from a bare `devx` -> `tractor.devx` (its parent). So the
logging-spec can dial sub-package levels at any nesting depth
(`devx.debug:runtime` ≠ `devx:cancel`).
- The `get_logger(__name__)` cosmetic ("don't repeat the leaf
module in `{name}` since `{filename}` shows it") is preserved.
What is **still NOT addressable** after that fix:
- **Per-leaf-MODULE** levels. Every module in a (sub-)pkg shares
that pkg's logger, because `get_logger()` drops the leaf
module-name from the logger key by design.
- **Top-level lib modules** (eg. `tractor.to_asyncio`,
`__package__ == 'tractor'`) emit on the *root* `tractor`
logger, so a `to_asyncio:<lvl>` spec entry hits a phantom
child -> no-op.
## What "Route B" is
Make the logger's *identity* the **full dotted module path**
(incl. the leaf module + top-level modules), eg.
`tractor.devx.debug._tty_lock` and `tractor.to_asyncio`, and
move the cosmetic leaf-trim out of logger-naming and into the
**formatter's `{name}` rendering**.
Net effect:
- Real per-module `Logger` nodes exist in the hierarchy ->
the spec can target ANY module; stdlib level-inheritance and
propagation "just work" top-down.
- Console headers stay clean because the formatter computes a
trimmed display string (drop the trailing token that equals
`{filename}`'s stem) instead of the logger doing it.
## Why it's "broad" — breaking changes / costs
The logger *name* is currently load-bearing well beyond
display; changing it ripples:
1. **Every logger name changes.**
Today (post sub-pkg fix) names collapse to the sub-package;
Route B = full module path. This touches:
- handler attachment points + the `getChild()` hierarchy,
- any `logging.getLogger('tractor.X')` string lookups,
- any name-based filtering,
- the dedup / `_strict_debug` warning logic *inside*
`get_logger()` itself — the `pkg_name in name`,
`leaf_mod in pkg_path`, "duplicate pkg-name" branches all
key off the *name shape* and would need re-derivation.
2. **Formatter rewrite.**
`LOG_FORMAT` uses `{name}` == `record.name` (the full logger
name). To keep headers clean we must compute a *display*
name and inject it as a record attr (eg. `record.pkg_ns`)
via a `logging.Filter` or a `colorlog.ColoredFormatter`
subclass overriding `.format()`, then point `LOG_FORMAT` at
that field. The `{filename}` vs `{name}` de-dup intent has
to be re-implemented per-record rather than per-logger.
3. **Propagation / double-emit surface grows.**
Full-depth loggers mean more intermediate nodes
(`...debug._tty_lock` -> `.debug` -> `.devx` -> `tractor`).
If more than one level carries a handler (spec sub-handlers
+ a root console), records double-emit. The
`propagate=False` trick we already use for filter-targeted
sub-loggers (`apply_logspec()`) must be applied carefully
across a deeper tree — more levels == more places to leak a
dup.
4. **Level-inheritance semantics shift.**
Today setting a level on `tractor.devx` gates *all* devx
emits (they share that logger). Post-Route-B,
`tractor.devx.debug._tty_lock` is its own `NOTSET` logger
that *inherits* the effective level from ancestors —
functionally similar via inheritance, BUT any code that does
`log.setLevel(...)` / reads `log.level` on a (previously
collapsed) logger now only affects that exact node. All
`setLevel`/`.level =` call sites need an audit (eg.
`get_logger()`'s own `log.level = rlog.level` line).
5. **Downstream contract churn.**
`modden` / `piker` call `get_logger()` / `get_console_log()`
and may depend on current names — including
`modden.runtime.daemon.setup_tractor_logging()` which
asserts `'tractor' not in name` on spec parts. The header
`{name}` field is user-visible in everyone's logs + CI
output. Changing the canonical names is a public-ish
behavior change -> needs a version note + downstream
coordination (or a formatter trim that keeps the *displayed*
string byte-identical to today).
6. **`get_logger()` refactor risk.**
The fn tangles two concerns: compute logger *identity* and
compute the *display* string. Route B forces splitting them
inside a ~300-line fn with multiple `_strict_debug`
branches, dup-warnings, and the `name=__name__` convenience.
High chance of subtle regressions without an exhaustive
name-derivation test matrix.
## Migration / test plan (if pursued)
- Extract a pure helper
`_mk_logger_name(pkg_name, mod_name, mod_pkg) -> (logger_name,
display_name)` and cover it with an exhaustive unit matrix:
auto vs explicit vs `__name__`; package-`__init__` vs leaf
module; nested vs flat; `pkg_name in name` vs not; top-level
module (`__package__ == pkg_name`).
- Switch `get_logger()` to use it for *identity*; switch the
formatter to use `display_name` (via a record attr).
- Re-run the full suite + golden-diff a sample of rendered log
headers to confirm zero cosmetic churn.
- Coordinate the name change with `modden`/`piker`; bump +
CHANGES note.
## Cheaper alternative — "Route A" (record-filter)
If per-leaf control is wanted *before* committing to Route B:
keep names collapsed, add a `logging.Filter` on the configured
handler keyed on `record.module` / `record.pathname` that maps
each record's source module -> its spec level. Set the base
logger to the *minimum* level in the spec (so records aren't
pre-dropped by the logger), and let the filter discriminate
up/down within that floor.
- Pros: no name churn, no formatter change, fully contained
next to `apply_logspec()`.
- Cons: a filter can only discriminate *within* what the logger
admits -> base must be permissive, so `at_least_level()`
expensive-work guards over-admit; matching dotted spec names
to a `pathname` is fiddly; doesn't clean up the hierarchy
itself.
## Recommendation
- Defer Route B unless true per-module loggers are wanted as a
first-class feature.
- If per-leaf control is needed soon, prefer **Route A**
(filter) — lower risk.
- The shipped sub-PACKAGE fix already covers the common ask
(`devx.debug` vs `devx`).

View File

@ -120,74 +120,13 @@ def cpu_scaling_factor() -> float:
return 1.
def pytest_addoption(
parser: pytest.Parser,
):
# ?TODO? should this be exposed from our `._testing.pytest`
# plugin or should we make it more explicit with `--tl` for
# tractor logging like we do in other client projects?
parser.addoption(
"--ll",
action="store",
dest='loglevel',
default=None,
help="logging level to set when testing",
)
@pytest.fixture(scope='session', autouse=True)
def loglevel(
request: pytest.FixtureRequest,
) -> str|None:
import tractor
orig = tractor.log._default_loglevel
flag_level: str|None = request.config.option.loglevel
if flag_level is not None:
tractor.log._default_loglevel = flag_level
log = tractor.log.get_console_log(
level=flag_level,
name='tractor', # <- enable root logger
)
log.info(
f'Test-harness set runtime loglevel: {flag_level!r}\n'
)
yield flag_level
tractor.log._default_loglevel = orig
@pytest.fixture(scope='function')
def test_log(
request: pytest.FixtureRequest,
loglevel: str,
) -> tractor.log.StackLevelAdapter:
'''
Deliver a per test-module-fn logger instance for reporting from
within actual test bodies/fixtures.
For example this can be handy to report certain error cases from
exception handlers using `test_log.exception()`.
'''
modname: str = request.function.__module__
log = tractor.log.get_logger(
name=modname, # <- enable root logger
# pkg_name='tests',
)
_log = tractor.log.get_console_log(
level=loglevel,
logger=log,
name=modname,
# pkg_name='tests',
)
_log.debug(
f'In-test-logging requested\n'
f'test_log.name: {log.name!r}\n'
f'level: {loglevel!r}\n'
)
yield _log
# NOTE, the `--ll`/`--tl` CLI flags + the `loglevel`, `test_log`
# and `testing_pkg_name` fixtures have been factored into the
# `tractor._testing.pytest` plugin (loaded via the `-p` entry in
# `pyproject.toml`'s `[tool.pytest.ini_options]`) so downstream
# consuming projects (eg. `modden`) inherit them for free. The
# plugin's `testing_pkg_name` fixture defaults to `'tractor'`, so
# this suite keeps treating `--ll` as the runtime loglevel.
@pytest.fixture(scope='session')

View File

@ -30,6 +30,10 @@ from tractor import (
from tractor.runtime import _state
from tractor.trionics import BroadcastReceiver
from tractor._testing import expect_ctxc
from tractor._testing.trace import (
AfkAlarmWTraceFactory,
FailAfterWTraceFactory,
)
# Per-test zombie-subactor reaper. Opt-in (NOT autouse) —
@ -58,7 +62,6 @@ pytestmark = pytest.mark.usefixtures(
scope='module',
)
def delay(debug_mode: bool) -> int:
return 1e3
if debug_mode:
return 999
else:
@ -846,8 +849,25 @@ def test_echoserver_detailed_mechanics(
raise_error_mid_stream,
is_forking_spawner: bool,
fail_after_w_trace: FailAfterWTraceFactory,
):
async def main():
# NOTE: under fork-based backends the cancel-cascade
# path is structurally slower than `trio`'s subproc-exec
# (per-spawn forkserver-handshake compounds during
# teardown). Bump the cap so cross-test contamination
# doesn't flake this — see
# `ai/conc-anal/cancel_cascade_too_slow_under_main_thread_forkserver_issue.md`.
timeout: float = (
999 if tractor.debug_mode()
else 4 if is_forking_spawner
else 1
)
# body factored out so the `fail_after_w_trace`-wrapping
# `main()` stays a 2-liner — keeps the deep `open_nursery`
# /`open_context`/`open_stream` block at its natural indent
# level instead of pushing it under yet another `async with`.
async def _body():
async with tractor.open_nursery(
registry_addrs=[reg_addr],
debug_mode=debug_mode,
@ -893,34 +913,21 @@ def test_echoserver_detailed_mechanics(
# is cancelled by kbi or out of task cancellation
await p.cancel_actor()
# NOTE: under fork-based backends the cancel-cascade
# path is structurally slower than `trio`'s subproc-exec
# (per-spawn forkserver-handshake compounds during
# teardown). Bump the cap so cross-test contamination
# doesn't flake this — see
# `ai/conc-anal/cancel_cascade_too_slow_under_main_thread_forkserver_issue.md`.
timeout: float = (
999 if tractor.debug_mode()
else 4 if is_forking_spawner
else 1
)
with_timeout: bool = (
True
# False
)
async def fa_main():
if with_timeout:
with trio.fail_after(timeout):
await main()
else:
await main()
async def main():
# on-timeout diag snapshot via `fail_after_w_trace`
# — when the cancel cascade hangs under MTF we get a
# fresh `ptree`/`wchan`/`py-spy` dump on disk INSTEAD
# of an opaque pytest timeout-kill. See
# `tractor/_testing/trace.py`.
async with fail_after_w_trace(timeout):
await _body()
if raise_error_mid_stream:
with pytest.raises(raise_error_mid_stream):
trio.run(fa_main)
trio.run(main)
else:
trio.run(fa_main)
trio.run(main)
@tractor.context
@ -1074,6 +1081,7 @@ def test_sigint_closes_lifetime_stack(
trio_side_is_shielded: bool,
send_sigint_to: str,
is_forking_spawner: bool,
afk_alarm_w_trace: AfkAlarmWTraceFactory,
):
'''
Ensure that an infected child can use the `Actor.lifetime_stack`
@ -1221,32 +1229,26 @@ def test_sigint_closes_lifetime_stack(
assert not tmp_file.exists()
assert ctx.maybe_error
# outer signal-based AFK-safety guard. mirrors the pattern in
# `tests/test_advanced_streaming.py::test_dynamic_pub_sub`: when
# the in-band trio cancel path doesn't fire (e.g. parent is
# parked in a shielded `await` inside actor-nursery teardown, or
# `open_context.__aenter__` hangs waiting for a child's
# `StartAck` that never comes), `signal.alarm` raises KBI in the
# main thread regardless of trio's scope state. This caps the
# absolute wall-clock so an AFK run can't sit for an hour on a
# forkserver-launchpad-contamination hang. Only armed under fork-
# based backends since the bug class is MTF-specific.
_AFK_CAP_S: int = (
999 if debug_mode
else 10
)
armed_alarm: bool = (
# outer hard wall-clock backstop via `afk_alarm_w_trace`:
# when the in-band trio cancel path doesn't fire (e.g.
# parent is parked in a shielded `await` inside actor-
# nursery teardown, or `open_context.__aenter__` hangs
# waiting for a child's `StartAck` that never comes), the
# `signal.alarm` inside the CM raises `AFKAlarmTimeout`
# in the main thread regardless of trio's scope state —
# AND captures a full diag snapshot to
# `$XDG_CACHE_HOME/tractor/hung-dumps/` before re-raising.
# Only armed under fork-based backends since this hang-
# class is MTF-specific.
if (
not debug_mode
and
is_forking_spawner
)
if armed_alarm:
signal.alarm(_AFK_CAP_S)
try:
):
with afk_alarm_w_trace(10):
trio.run(main)
else:
trio.run(main)
finally:
if armed_alarm:
signal.alarm(0)

View File

@ -20,6 +20,16 @@ def test_root_pkg_not_duplicated_in_logger_name():
a common `<root_name>.< >` prefix, ensure that it is not
duplicated in the child's `StackLevelAdapter.name: str`.
Also pins the explicit-`name` contract: an explicitly passed
dotted `name` is treated as a *literal* sub-logger path and is
NOT leaf-collapsed. The leaf-module is only dropped when the
trailing token duplicates the *caller's own* `__name__` leaf (the
`{filename}` field) see `test_implicit_mod_name_applied_for_child`
for that (auto-naming) path. This is what keeps a real (possibly
nested) sub-PACKAGE like `subpkg.mod` -> `devx.debug` addressable
by the `tractor.log` logging-spec, instead of collapsing to its
parent.
'''
project_name: str = 'pylib'
pkg_path: str = 'pylib.subpkg.mod'
@ -38,8 +48,13 @@ def test_root_pkg_not_duplicated_in_logger_name():
)
assert proj_log is not sublog
# the root pkg-name appears exactly once (no `pylib.pylib...`)
assert sublog.name.count(proj_log.name) == 1
assert 'mod' not in sublog.name
# explicit dotted `name` is preserved literally (NOT collapsed);
# the trailing token survives since it's not the *caller's* own
# leaf-module (`test_log_sys`), so this is treated as a literal
# sub-pkg path.
assert sublog.name == f'{project_name}.subpkg.mod'
def test_implicit_mod_name_applied_for_child(

View File

@ -435,11 +435,12 @@ async def open_root_actor(
)
# TODO: factor this into `.devx._stackscope`!!
if (
debug_mode
and
enable_stack_on_sig
):
#
# NOTE, intentionally NOT gated on `debug_mode` so SIGUSR1
# task-tree dumps work in plain (non-pdb) runs too — esp.
# in infected-`asyncio` root processes where the default
# SIGUSR1 action would otherwise terminate the proc.
if enable_stack_on_sig:
from .devx._stackscope import enable_stack_on_sig
enable_stack_on_sig()

View File

@ -405,6 +405,46 @@ def pytest_addoption(
help="Transport protocol to use under the `tractor.ipc.Channel`",
)
# console loglevel for the test-session, scoped to the
# consuming-project's OWN pkg-hierarchy (see the
# `testing_pkg_name` fixture). For `tractor` itself this IS the
# runtime loglevel; downstream projects use `--ll` for their own
# ("internal") app-logging and `--tl` for tractor-as-runtime.
parser.addoption(
"--ll",
"--loglevel",
action="store",
dest='loglevel',
default=None,
help=(
"console loglevel to set for the test session, scoped to "
"the consuming-project pkg (see `testing_pkg_name`). "
"Falls through as the `--tl` default."
),
)
# tractor-as-runtime loglevel, DISTINCT from `--ll` so downstream
# projects can split their app-logs from the `tractor.*` runtime
# hierarchy. Accepts a `tractor.log` "logging-spec" (see
# `tractor.log.apply_logspec()`).
parser.addoption(
"--tl",
"--tractor-loglevel",
action="store",
dest='tractor_loglevel',
default=None,
help=(
"loglevel (or logging-spec) for `tractor`-as-runtime, "
"distinct from `--ll`. Accepts a bare level (eg. "
"'info', 'cancel') or a sub-logger filter-spec, "
"'<sublog>:<level>,...' (eg. "
"'devx:runtime,trionics:cancel'). Falls back to `--ll` "
"when unset. Mirrors the logging-spec grammar consumed "
"by `tractor.log.apply_logspec()` (see its sub-pkg "
"granularity caveat)."
),
)
def pytest_configure(
config: pytest.Config,
@ -547,6 +587,135 @@ def debug_mode(
return debug_mode
@pytest.fixture(scope='session')
def testing_pkg_name() -> str:
'''
Root pkg-name of the project consuming this plugin, used to
scope `--ll` "internal"/app-level console logging into that
project's OWN `tractor.log.get_logger(pkg_name=<.>)` hierarchy
distinct from the `tractor.*` runtime hierarchy configured
via `--tl`.
Defaults to `'tractor'` (so tractor's own suite treats `--ll`
as the runtime level). Downstream projects override this from
their `conftest.py`, eg.
.. code:: python
@pytest.fixture(scope='session')
def testing_pkg_name() -> str:
return 'modden'
'''
return 'tractor'
@pytest.fixture(
scope='session',
autouse=True,
)
def loglevel(
request: pytest.FixtureRequest,
testing_pkg_name: str,
) -> str|None:
'''
Resolve + apply the test-session console loglevels and yield
the `tractor`-runtime level (also passed to
`open_root_actor(loglevel=<.>)` by `@tractor_test`).
- `--tl <logspec>`: tractor-runtime level (falls back to the
generic `--ll`); applied to the `tractor.*` logger hierarchy
and `tractor.log._default_loglevel` via
`tractor.log.apply_logspec()`.
- `--ll <level>`: the consuming-project's OWN console loglevel,
applied to its `testing_pkg_name` hierarchy when that isn't
`tractor` itself.
'''
import tractor
orig: str = tractor.log._default_loglevel
ll: str|None = request.config.option.loglevel
tl: str|None = request.config.option.tractor_loglevel
# tractor-runtime loglevel: explicit `--tl` wins, else fall
# back to the generic `--ll`, else leave the lib default.
logspec: str|None = tl if tl is not None else ll
tractor_level: str|None = None
if logspec is not None:
tractor_level, _ = tractor.log.apply_logspec(
logspec,
default_level=ll,
pkg_name='tractor',
)
if tractor_level is not None:
tractor.log._default_loglevel = tractor_level
# consuming-project ("internal") console logging at the generic
# `--ll` level, scoped to ITS OWN pkg-hierarchy (NOT `tractor.*`)
# so downstream projects can split app-logs from runtime-logs.
if (
ll is not None
and
testing_pkg_name
and
testing_pkg_name != 'tractor'
):
tractor.log.get_console_log(
level=ll,
pkg_name=testing_pkg_name,
name=testing_pkg_name,
)
log = tractor.log.get_console_log(
level=tractor_level,
name='tractor', # <- enable root logger
)
log.info(
f'Test-harness set session loglevels:\n'
f'tractor-runtime (`--tl`/`--ll`): {tractor_level!r}\n'
f'{testing_pkg_name!r} (`--ll`): {ll!r}\n'
)
yield tractor_level
tractor.log._default_loglevel = orig
@pytest.fixture(scope='function')
def test_log(
request: pytest.FixtureRequest,
loglevel: str,
testing_pkg_name: str,
) -> tractor.log.StackLevelAdapter:
'''
Deliver a per test-module-fn logger instance for reporting from
within actual test bodies/fixtures.
For example this can be handy to report certain error cases from
exception handlers using `test_log.exception()`.
The logger is scoped to the consuming-project's
`testing_pkg_name` hierarchy so downstream suites' in-test logs
land under their own pkg, not `tractor.*`.
'''
modname: str = request.function.__module__
log = tractor.log.get_logger(
name=modname,
pkg_name=testing_pkg_name,
)
_log = tractor.log.get_console_log(
level=loglevel,
logger=log,
name=modname,
)
_log.debug(
f'In-test-logging requested\n'
f'test_log.name: {log.name!r}\n'
f'level: {loglevel!r}\n'
)
yield _log
@pytest.fixture(scope='session')
def spawn_backend(
request: pytest.FixtureRequest,

View File

@ -224,7 +224,19 @@ def _find_tractor_strays(seen: set[int]) -> list[int]:
'''
Scan `/proc/*/cmdline` (+ `/comm` as zombie-safe fallback) for
`tractor._child` / `tractor[<aid>]` proctitle matches whose
`pid` is NOT in the `seen` set.
`pid` is NOT in the `seen` set AND whose `ppid` disposition
indicates the proc belongs to THIS test run's process tree:
- `ppid == 1` init-adopted (parent died) a real leaked
subactor from this (or a prior killed) test run.
- `ppid in seen` subtree-descendant the recursive walk
missed due to a race (proc appeared between iterations).
Procs whose `ppid` points to some OTHER live, non-pytest
process are skipped they belong to a different tractor app
(e.g. `piker`, another `pytest` invocation in another shell,
a long-running tractor daemon) and falsely flagging them as
"cross-test ghosts" of THIS run is misleading.
Used by `dump_proc_tree(include_strays=True)` to surface ghost
subactor trees from PRIOR test runs that aren't descendants of
@ -256,7 +268,17 @@ def _find_tractor_strays(seen: set[int]) -> list[int]:
pid: int = int(entry.name)
if pid in seen:
continue
if _is_tractor_subactor(pid):
if not _is_tractor_subactor(pid):
continue
# ownership filter: only flag procs whose `ppid` ties them
# back to THIS test run (init-adopted orphan, or a
# descendant the walk missed).
ppid: int | None = _ppid_from_proc(pid)
if ppid is None:
# proc disappeared between `iterdir()` and `stat` —
# treat as gone, don't flag.
continue
if ppid == 1 or ppid in seen:
strays.append(pid)
return sorted(strays)

View File

@ -543,21 +543,45 @@ def get_logger(
# only includes the first 2 sub-pkg name-tokens in the
# child-logger's name; the colored "pkg-namespace" header
# will then correctly show the same value as `name`.
#
# XXX, strip the trailing `pkg_path` token ONLY when it
# duplicates the caller's leaf-*module* name — which the
# console header already renders via its `{filename}` field.
# We compare against the caller module's `__name__`/
# `__package__` (rather than blindly dropping the last token)
# so genuine, possibly-*nested* sub-PACKAGE components stay
# addressable as their own sub-loggers:
#
# - `name='trionics._broadcast'` (a leaf-module, from a
# `get_logger(__name__)`-style call) -> `tractor.trionics`
# (leaf dropped; `_broadcast.py` is in the header).
# - `name='devx.debug'` (a real sub-PACKAGE, whether
# auto-derived from a module's `__package__` or passed
# explicitly by a logging-spec) -> `tractor.devx.debug`,
# DISTINCT from a bare `devx` -> `tractor.devx`.
#
# The previous unconditional `pkg_path = subpkg_path` also ate
# the deepest sub-pkg, collapsing `devx.debug` -> `tractor.devx`
# and silently breaking per-sub-pkg level control via the
# logging-spec; see `tractor.log.LogSpec`/`apply_logspec()`.
caller_leaf_mod: str|None = None
if (caller_mod := get_caller_mod()):
cmod_name: str = getattr(caller_mod, '__name__', '') or ''
cmod_pkg: str = getattr(caller_mod, '__package__', '') or ''
# a leaf-*module* has `__name__ != __package__`; a package
# `__init__` has them equal (so its trailing token is a
# real sub-pkg, NOT a leaf-module-filename to strip).
if cmod_name and cmod_name != cmod_pkg:
caller_leaf_mod = cmod_name.rpartition('.')[2]
if (
# XXX, TRY to remove duplication cases
# which get warn-logged on below!
(
# when, subpkg_path == pkg_path
subpkg_path
and
rname == pkg_name
)
# ) or (
# # when, pkg_path == leaf_mod
# pkg_path
# and
# leaf_mod == pkg_path
# )
and
# only collapse when the trailing token IS the caller's
# leaf-module (i.e. the `{filename}` already shows it).
leaf_mod == caller_leaf_mod
):
pkg_path = subpkg_path
@ -711,6 +735,167 @@ def get_console_log(
return log
# A `tractor` "logging-spec": a compact, code-free way for a
# consuming project's test-iface (or runtime) to dial-in console
# loglevels across the lib's logger hierarchy. Mirrors the grammar
# consumed by `modden.runtime.daemon.setup_tractor_logging()`.
#
# Accepted forms (`str|bool`),
# - `True` -> enable the `pkg_name` root-logger at
# `default_level` (or 'cancel').
# - `False` -> disable (no-op, configure nothing).
# - 'info' -> a bare level for the root-logger.
# - 'sub:info,x:cancel' -> per-sub-logger levels; each `<name>` is
# RELATIVE to `pkg_name` (must NOT include
# the `pkg_name` token itself), eg.
# 'devx.debug:runtime,trionics:cancel'.
#
# !GRANULARITY! sub-logger names match at the `pkg_name.<name>`
# *logger* level — which (per `get_logger()`'s name-derivation) is
# *sub-PACKAGE* granularity, addressable at ANY nesting depth:
# - 'devx.debug' -> the `tractor.devx.debug` logger, DISTINCT from a
# bare 'devx' -> `tractor.devx` (its parent). Setting `devx` also
# gates `devx.debug` via normal stdlib level-inheritance unless the
# child sets its own level.
# - leaf *modules* are intentionally NOT individually addressable:
# `get_logger()` drops the leaf module-name from the logger key
# since the console header already renders it via `{filename}`, so
# every module in a (sub-)pkg shares that pkg's logger. Per-leaf
# level control would need a record-filter (see follow-up notes:
# `ai/tooling-todos/logspec_leaf_module_granularity_route_b.md`).
# - top-level lib modules (eg. `tractor.to_asyncio`) emit under the
# *root* `pkg_name` logger (their `__package__` IS `pkg_name`), so
# a 'to_asyncio:<level>' entry targets a phantom child that nothing
# emits to -> no-op. Use the bare-level/root form for those.
LogSpec = str|bool
def parse_logspec(
logspec: LogSpec,
default_level: str|None = None,
pkg_name: str = _proj_name,
) -> dict[str|None, str]:
'''
Parse a `tractor` "logging-spec" (see `LogSpec`) into a
`{sublog_name|None: level}` mapping where a `None` key denotes
the `pkg_name` root-logger itself.
'''
match logspec:
# explicit disable -> configure nothing.
case False:
return {}
# enable the root-logger at the fallback level.
case True:
return {None: (default_level or 'cancel')}
case str(spec):
filters: list[str] = [
part.strip()
for part in spec.split(',')
if part.strip()
]
# i. a bare level (no sub-logger filtering),
# eg. 'info' | 'cancel'
if (
len(filters) == 1
and
':' not in filters[0]
):
return {None: filters[0]}
# ii. a per-sub-logger filter-spec of the form,
# '<sublog_0>:<level>,<.. N-other-parts>'
# eg. 'to_asyncio:cancel,devx._debug:runtime'
out: dict[str|None, str] = {}
for log_filter in filters:
name, sep, level = log_filter.partition(':')
if not sep:
raise ValueError(
f'Invalid `tractor` logging-spec part!\n'
f'{log_filter!r}\n'
f'\n'
f'Mixed bare-level + sub-logger filters are '
f'not supported; every comma-part must be '
f'`<sublog>:<level>`.\n'
)
# the sub-logger name is RELATIVE to `pkg_name`;
# duplicating the pkg-token is a user error since
# the root-logger already IS `pkg_name`.
if pkg_name in name.split('.'):
raise ValueError(
f'logging-spec sub-name should NOT include '
f'the `pkg_name={pkg_name!r}` token!\n'
f'got name={name!r}\n'
)
out[name] = level
return out
case _:
raise ValueError(
f'Invalid `tractor` logging-spec!\n'
f'{logspec!r}\n'
)
def apply_logspec(
logspec: LogSpec,
default_level: str|None = None,
pkg_name: str = _proj_name,
) -> tuple[
str|None,
dict[str, StackLevelAdapter],
]:
'''
Parse + apply a `tractor` "logging-spec" (see `parse_logspec()`):
enable a `colorlog` stderr console handler for each
(sub-)logger named in the spec at its requested level.
Returns a 2-tuple,
- the resolved "primary" runtime-level: the root-logger level if
the spec set one, else `default_level`; suitable for passing
to `open_root_actor(loglevel=<.>)`,
- a `{logger_name: StackLevelAdapter}` map of every logger the
spec touched.
'''
specs: dict[str|None, str] = parse_logspec(
logspec,
default_level=default_level,
pkg_name=pkg_name,
)
logs: dict[str, StackLevelAdapter] = {}
for sub_name, level in specs.items():
# NOTE, pass the RELATIVE sub-name (no `pkg_name.` prefix)
# to avoid `get_logger()`'s duplicate-pkg-token warning;
# it re-adds the pkg-name via `.getChild()` internally.
log: StackLevelAdapter = get_console_log(
level=level,
pkg_name=pkg_name,
name=(sub_name or pkg_name),
)
# XXX, a sub-logger filter is "authoritative" for its
# subtree: it gets its OWN stderr handler (added by
# `get_console_log()` above), so DON'T also let its records
# propagate up to a root `pkg_name`-logger handler — that
# would double-emit every line when a root-level console
# (eg. via `--ll`) is also active. The root-level form
# (`sub_name is None`) keeps default propagation.
if sub_name is not None:
log.logger.propagate = False
logs[log.name] = log
primary_level: str|None = specs.get(None, default_level)
return (
primary_level,
logs,
)
def get_loglevel() -> str:
return _default_loglevel

View File

@ -934,13 +934,18 @@ class Actor:
rvs: dict[str, Any] = spawnspec._runtime_vars
# `stackscope` SIGUSR1 handler: install when EITHER
# `_debug_mode=True` (full multi-actor pdb support
# path) OR the `TRACTOR_ENABLE_STACKSCOPE` env var
# is set (lighter test-time hang-debug path; see
# `use_stackscope` is set in rt-vars OR the
# `TRACTOR_ENABLE_STACKSCOPE` env var is set (lighter
# test-time hang-debug path; see
# `tractor._testing.pytest`'s `--enable-stackscope`
# CLI flag — env var propagates via fork-inherited
# environ).
if rvs['_debug_mode']:
#
# NOTE, intentionally NOT gated on `_debug_mode` so
# SIGUSR1 task-tree dumps work in plain (non-pdb)
# runs too — esp. in infected-`asyncio` sub-actors
# where the default SIGUSR1 action would otherwise
# terminate the proc.
if (
rvs.get('use_stackscope')
or
@ -962,6 +967,7 @@ class Actor:
'debug mode / `--enable-stackscope`!'
)
if rvs['_debug_mode']:
if rvs.get('use_greenback', False):
from ..devx import maybe_init_greenback
maybe_mod: ModuleType|None = await maybe_init_greenback()

View File

@ -27,6 +27,7 @@ from contextlib import asynccontextmanager as acm
from dataclasses import dataclass
import inspect
import platform
import sys
import traceback
from typing import (
Any,
@ -810,6 +811,151 @@ def _run_asyncio_task(
return chan
def maybe_signal_aio_task(
aio_task: asyncio.Task,
exc: BaseException,
*,
cause: BaseException|None = None,
pre_captured_fut: asyncio.Future|None = None,
allow_cancel_fallback: bool = False,
) -> tuple[bool, str]:
'''
Best-effort delivery of `exc` to a still-running `aio_task`
via its `_fut_waiter` (the `asyncio.Future` the task is
currently `await`-ing on).
Returns `(delivered, report)` where `delivered=True` iff
either,
- `fut.set_exception(exc)` was successfully called on an
un-`done()` `_fut_waiter`, OR
- the cancel-fallback path fired (only when the caller
opted-in via `allow_cancel_fallback=True`).
Why `_fut_waiter.set_exception(exc)` and NOT
`aio_task.set_exception(exc)`:
On py3.13+ `asyncio.Task.set_exception()` ALWAYS raises
`RuntimeError("Task does not support set_exception
operation")` — so calling it as a relay mechanism is dead
code. The `_fut_waiter` is a plain `asyncio.Future` and
its `set_exception()` works on all Python versions; the
task's `_wakeup` callback then propagates the exc into
the coro on its next tick.
Why we PREFER NOT to call `aio_task.cancel()`:
`Task.cancel()` injects a `CancelledError` that races
any in-flight exception already queued on `_fut_waiter`
(e.g. via a prior `set_exception()` from a sibling
teardown path). The race can mask BOTH the original
trio-side error and any asyncio-side error the task was
mid-raising. See the
`test_trio_closes_early_and_channel_exits` hang TODO
around the `translate_aio_errors` finally for the
historical artifact.
However a caller may have NO OTHER way to terminate the
task when `_fut_waiter is None` AND the task is busy
looping / runnable, neither `set_exception` nor a chan
close can poke it. In that narrow case `cancel()` is the
only available termination signal; opt-in via
`allow_cancel_fallback=True`. The fallback NEVER runs
when `_fut_waiter` carries an in-flight exc (the
`fut.done()` branch); only when there's truly no
`_fut_waiter` ref to poke.
Pre-checkpoint capture:
`asyncio.Task._wakeup` clears `_fut_waiter = None` as
part of the wakeup sequence. If the caller crosses a
trio checkpoint between fut-capture and this call,
re-reading `aio_task._fut_waiter` will see `None` even
though the exc is still in flight on the (now-`done()`)
original fut. Pass `pre_captured_fut` to use the
already-captured reference.
Causal chaining via `cause`:
Pass the underlying trio-side exc (the *reason* we're
poking the aio side) via `cause` and the helper sets
`exc.__cause__ = cause`. The chain travels with `exc`
through `_fut_waiter.set_exception()` `Task._wakeup`
coro raise `wait_on_coro_final_result`'s except →
`signal_trio_when_done`'s `task.result()`-`raise
aio_err`. The final traceback then renders as
"<trio-side exc> -> (direct cause of) -> <relay exc>"
instead of an opaque, root-cause-detached relay.
See the "cross-loop cause-chain matrix" comment in
`translate_aio_errors()`'s final-raise block for how this
`cause` interacts with every `raise X [from Y]` exit path
(esp. the relay-echo guard which prevents a cause CYCLE).
'''
if cause is not None and exc.__cause__ is None:
exc.__cause__ = cause
if aio_task.done():
return False, (
f'aio-task already done; nothing to signal\n'
f' |_{aio_task!r}\n'
)
fut: asyncio.Future|None = (
pre_captured_fut
if pre_captured_fut is not None
else aio_task._fut_waiter
)
if fut and not fut.done():
fut.set_exception(exc)
return True, (
f'signalled aio-task via `_fut_waiter.set_exception()`\n'
f'exc: {exc!r}\n'
f' |_{aio_task!r}\n'
)
if fut and fut.done():
# NEVER cancel here even when `allow_cancel_fallback=True`
# — the in-flight exc on `fut` will terminate the task
# on its next tick; injecting `CancelledError` on top
# would race and mask the real exc.
return False, (
f'`_fut_waiter` already signalled with,\n'
f' |_{fut.exception()!r}\n'
f'aio-task will exit on next tick via the in-flight exc;\n'
f'SKIPPING re-signal (would race in-flight delivery).\n'
f' |_{aio_task!r}\n'
)
# fut is None — task is runnable (sitting in asyncio's
# ready queue), not parked on a future we can poke.
if allow_cancel_fallback:
cancel_msg: str = (
f'\n'
f'MANUALLY Cancelling `asyncio`-task: '
f'{aio_task.get_name()}!\n\n'
f'**THIS CAN SILENTLY SUPPRESS ERRORS FYI\n\n'
)
aio_task.cancel(msg=cancel_msg)
return True, (
f'aio-task has no `_fut_waiter`; FALLBACK cancel issued\n'
f'(caller opted-in via `allow_cancel_fallback=True`).\n'
f'{cancel_msg}'
f' |_{aio_task!r}\n'
)
return False, (
f'aio-task has no `_fut_waiter`; cannot signal without\n'
f'`aio_task.cancel()` which can mask errors.\n'
f'LEAVING AS-IS (caller did NOT opt-in to cancel fallback);\n'
f'task should exit via chan close / aio-loop teardown\n'
f'already in flight.\n'
f' |_{aio_task!r}\n'
)
@acm
async def translate_aio_errors(
chan: LinkedTaskChannel,
@ -985,38 +1131,25 @@ async def translate_aio_errors(
# if isinstance(chan._aio_err, AsyncioTaskExited):
# await tractor.pause(shield=True)
# if aio side is still active cancel it due to the trio-side
# error!
# if aio side is still active relay the trio-side error
# to it via `_fut_waiter.set_exception()`.
# ?TODO, mk `AsyncioCancelled[typeof(trio_err)]` embed the
# current exc?
if (
# not aio_task.cancelled()
# and
not aio_task.done() # TODO? only need this one?
# XXX LOL, so if it's not set it's an error !?
# yet another good jerb by `ascyncio`..
# and
# not aio_task.exception()
):
aio_taskc = TrioCancelled(
f'The `trio`-side task crashed!\n'
f'{trio_err}'
)
# ??TODO? move this into the func that tries to use
# `Task._fut_waiter: Future` instead??
#
# aio_task.set_exception(aio_taskc)
# wait_on_aio_task = False
try:
aio_task.set_exception(aio_taskc)
except (
asyncio.InvalidStateError,
RuntimeError,
# ^XXX, uhh bc apparently we can't use `.set_exception()`
# any more XD .. ??
):
delivered, report = maybe_signal_aio_task(
aio_task,
aio_taskc,
# so the relay carries a "<trio_err> -> caused ->
# TrioCancelled" chain when it eventually re-raises
# on the aio side.
cause=trio_err,
)
if not delivered:
wait_on_aio_task = False
log.cancel(report)
finally:
# record wtv `trio`-side error transpired
@ -1099,27 +1232,22 @@ async def translate_aio_errors(
if _py_313:
chan._to_aio.shutdown()
# XXX CRITICAL ordering: capture `_fut_waiter`
# BEFORE the checkpoint. `asyncio.Task._wakeup`
# clears `_fut_waiter = None` as part of wakeup,
# so re-reading after the checkpoint loses the
# ref even though the exc is still in-flight on
# the (now-`done()`) original fut. The helper
# uses `pre_captured_fut` to recover that.
pre_cp_fut: asyncio.Future|None = aio_task._fut_waiter
# pump this event-loop (well `Runner` but ya)
#
# TODO? is this actually needed?
# -[ ] theory is this let's the aio side error on
# next tick and then we sync task states from
# here onward?
# so the aio side can error on next tick and we
# sync task states from here onward.
await trio.lowlevel.checkpoint()
# TODO? factor the next 2 branches into a func like
# `try_terminate_aio_task()` and use it for the taskc
# case above as well?
fut: asyncio.Future|None = aio_task._fut_waiter
if (
fut
and
not fut.done()
):
# await tractor.pause()
if graceful_trio_exit:
fut.set_exception(
TrioTaskExited(
relay_exc = TrioTaskExited(
f'the `trio.Task` gracefully exited but '
f'its `asyncio` peer is not done?\n'
f')>\n'
@ -1128,31 +1256,30 @@ async def translate_aio_errors(
f'>>\n'
f' |_{aio_task!r}\n'
)
)
# TODO? should this need to exist given the equiv
# `TrioCancelled` equivalent in the be handler
# above??
else:
fut.set_exception(
TrioTaskExited(
relay_exc = TrioTaskExited(
f'The `trio`-side task crashed!\n'
f'{trio_err}'
)
delivered, signal_report = maybe_signal_aio_task(
aio_task,
relay_exc,
pre_captured_fut=pre_cp_fut,
# XXX historically this branch called
# `aio_task.cancel()` when `_fut_waiter`
# was None — required to actually terminate
# aio tasks that aren't parked on a poke-able
# future (e.g. the `aio_echo_server` loop in
# `test_echoserver_detailed_mechanics`). Opt
# into the fallback so we don't regress.
allow_cancel_fallback=True,
# carry the trio-side exc (if any) as the
# cause so the aio-side relay shows the
# real root-cause chain when re-raised.
cause=trio_err,
)
else:
aio_taskc_warn: str = (
f'\n'
f'MANUALLY Cancelling `asyncio`-task: {aio_task.get_name()}!\n\n'
f'**THIS CAN SILENTLY SUPPRESS ERRORS FYI\n\n'
)
# await tractor.pause()
report += aio_taskc_warn
# TODO XXX, figure out the case where calling this makes the
# `test_infected_asyncio.py::test_trio_closes_early_and_channel_exits`
# hang and then don't call it in that case!
#
aio_task.cancel(msg=aio_taskc_warn)
report += signal_report
log.warning(report)
@ -1161,10 +1288,11 @@ async def translate_aio_errors(
# `channel._aio_err/._trio_to_raise`) BEFORE calling
# `maybe_raise_aio_side_err()` below!
#
# XXX WARNING NOTE
# the `task.set_exception(aio_taskc)` call above MUST NOT
# EXCEPT or this WILL HANG!! SO, if you get a hang maybe step
# through and figure out why it erroed out up there!
# NOTE, `wait_on_aio_task` may have been flipped to `False`
# by `maybe_signal_aio_task()` above when delivery
# failed (e.g. `_fut_waiter is None`) — in that case we
# skip the wait since the aio task won't process our
# relay exc and `_aio_task_complete` may never set.
#
if wait_on_aio_task:
await chan._aio_task_complete.wait()
@ -1181,6 +1309,47 @@ async def translate_aio_errors(
- `run_task()`
'''
# ===== cross-loop cause-chain matrix =====
# How `(trio_err, aio_err, trio_to_raise)` resolve into ONE
# terminal `raise X [from Y]` (or an early `return`).
#
# legend (the possible `X` / `Y` operands):
# - trio_err : `chan._trio_err`, the trio-side exc.
# - aio_err : `chan._aio_err`, the aio-side exc.
# - trio_to_raise : `chan._trio_to_raise`, a tractor-chosen
# relay exc (`AsyncioCancelled`/`AsyncioTaskExited`).
# - raise_from : `trio_err if (aio_err is trio_to_raise)
# else aio_err` (the chosen `__cause__`).
# - relay-echo : an `aio_err` that is one of OUR OWN
# `TrioTaskExited|TrioCancelled` signals,
# synth'd + delivered to the aio-side by
# `maybe_signal_aio_task()`; its `__cause__`
# is ALREADY `trio_err`.
# - "(bare)" : raised with NO explicit `from` clause.
#
# this block (final-raise in `translate_aio_errors`):
# condition => raises from
# ----------------------------------- ------------- -----------
# not suppress_graceful_exits => trio_to_raise raise_from
# AsyncioTaskExited + trio Cancelled/None => return (aio-exit ignored)
# AsyncioTaskExited + trio EoC => trio_err (bare)
# AsyncioCancelled + trio Cancelled => return (co-cancel ignored)
# trio_to_raise match catch-all => trio_to_raise raise_from
# aio_err is relay-echo ◄── the GUARD => trio_err (bare)
# aio_err independent (real aio fail) => trio_err aio_err
# aio_err independent, no trio_err => aio_err (bare)
# only trio_err => trio_err (bare)
#
# sibling block (`signal_trio_when_done()`, the aio done-cb):
# AsyncioTaskExited relay-out => trio_to_raise aio_err
# plain aio_err re-raise => aio_err (__cause__ preset)
#
# INVARIANT: a relay-echo must NEVER become `trio_err.__cause__`
# (it's ALREADY caused-BY `trio_err`) → doing so would CYCLE
# (`trio_err ◄─► relay`). So the guard raises the root
# `trio_err` bare; the relay still keeps its own correct
# "relay ◄ trio_err" chain for any aio-side inspection.
# ===== / cross-loop cause-chain matrix =====
aio_err: BaseException|None = chan._aio_err
trio_to_raise: (
AsyncioCancelled|
@ -1237,6 +1406,32 @@ async def translate_aio_errors(
and
type(aio_err) is not AsyncioCancelled
):
# XXX, if `aio_err` is one of OUR OWN relay-signals
# (`TrioTaskExited`/`TrioCancelled`) that we delivered
# to the aio-side via `maybe_signal_aio_task()`, AND
# its `__cause__` already points back at `trio_err`,
# then it's just a derivative ECHO of the trio-side
# error, NOT an independent asyncio failure.
#
# Raising `trio_err from aio_err` here would invert
# (and cyclically tangle) the cause chain since the
# relay was itself caused-by `trio_err`:
#
# trio_err.__cause__ = aio_err (from `raise .. from`)
# aio_err.__cause__ = trio_err (set in `maybe_signal_aio_task`)
#
# So raise the REAL root `trio_err` alone; the relay's
# own `__cause__` chain still correctly reads
# "TrioTaskExited <- trio_err" for aio-side inspection.
if (
trio_err is not None
and
isinstance(aio_err, (TrioTaskExited, TrioCancelled))
and
aio_err.__cause__ is trio_err
):
raise trio_err
# always raise from any captured asyncio error
if trio_err:
raise trio_err from aio_err
@ -1353,19 +1548,22 @@ async def open_channel_from(
# a `Return`-msg for IPC ctxs)
aio_task: asyncio.Task = chan._aio_task
if not aio_task.done():
fut: asyncio.Future|None = aio_task._fut_waiter
if fut:
fut.set_exception(
# capture the in-flight trio-side exc (if any)
# so the relay's `__cause__` chain shows the
# real root cause when the aio task re-raises.
# `sys.exc_info()[1]` is non-`None` only when
# the `try` body raised (graceful exit -> None).
trio_exc: BaseException|None = sys.exc_info()[1]
_, report = maybe_signal_aio_task(
aio_task,
TrioTaskExited(
f'but the child `asyncio` task is still running?\n'
f'>>\n'
f' |_{aio_task!r}\n'
),
cause=trio_exc,
)
)
else:
# XXX SHOULD NEVER HAPPEN!
log.error("SHOULD NEVER GET HERE !?!?")
await tractor.pause(shield=True)
log.cancel(report)
else:
chan._to_trio.close()
@ -1602,6 +1800,7 @@ def run_as_asyncio_guest(
fute_err: BaseException|None = None
try:
out: Outcome = await asyncio.shield(trio_done_fute)
# out: Outcome = await trio_done_fute
# ^TODO still don't really understand why the `.shield()`
# is required ... ??
# https://docs.python.org/3/library/asyncio-task.html#asyncio.shield

View File

@ -36,4 +36,5 @@ from ._beg import (
)
from ._taskc import (
maybe_raise_from_masking_exc as maybe_raise_from_masking_exc,
start_or_cancel as start_or_cancel,
)

View File

@ -27,6 +27,9 @@ from types import (
TracebackType,
)
from typing import (
Any,
Awaitable,
Callable,
Type,
TYPE_CHECKING,
)
@ -293,5 +296,55 @@ async def maybe_raise_from_masking_exc(
if raise_unmasked:
raise exc_ctx from exc_match
async def start_or_cancel(
nursery: trio.Nursery,
async_fn: Callable[..., Awaitable[Any]],
*args,
name: object = None,
) -> Any:
'''
Like `trio.Nursery.start()` but DON'T mask an out-of-band
cancellation as a (lossy) startup failure.
`trio.Nursery.start()` raises a generic
`RuntimeError("child exited without calling
task_status.started()")` whenever the started task exits
BEFORE calling `task_status.started()` INCLUDING the very
common case where the child was cancelled out-of-band by an
*ancestor* cancel-scope erroring/cancelling. In that case the
original `trio.Cancelled` is swallowed and the caller is left
with an opaque, root-cause-detached `RuntimeError`.
This wrapper re-surfaces any ambient (effective, hence
ancestor-inclusive) cancellation via
`trio.lowlevel.checkpoint_if_cancelled()` so the real
`trio.Cancelled` (carrying trio's auto-generated reason which
points at the true root exc) propagates instead. Only when we
are NOT under cancellation is the "didn't call `.started()`"
`RuntimeError` a genuine startup-protocol bug worth surfacing,
so it's re-raised as-is in that case.
'''
try:
return await nursery.start(
async_fn,
*args,
name=name,
)
except RuntimeError as rte:
if (
rte.args
and
'started' in rte.args[0]
):
# re-raises the in-flight `trio.Cancelled` IFF we're
# under effective cancellation; else a cheap no-op and
# we fall through to re-raise the genuine startup RTE.
await trio.lowlevel.checkpoint_if_cancelled()
raise
else:
raise

View File

@ -25,6 +25,11 @@ Provides:
reaper + optional `/dev/shm/`
+ UDS sock-file sweeps.
alias for `scripts/tractor-reap`.
- `acli.watch [-n SEC] <alias-name> run a callable alias in
[alias-args]` an alt-screen loop with
flicker-free repaint
(cursor-home + per-line
EL + post-draw erase-down).
Loading from repo root:
xontrib load -p ./xontrib tractor_diag
@ -43,7 +48,16 @@ helpers) — these aliases are just thin terminal wrappers.
Requires `psutil` for full functionality (`ptree` and the
`hung_dump` tree-walk). Falls back to `pgrep -P` recursion if
missing.
"""
import os
import sys
import signal
import time
from typing import (
Callable,
)
from pathlib import Path
@ -55,10 +69,156 @@ from tractor._testing.trace import (
scan_bindspace,
)
@aliases.unthreadable
def watch(
args: list[str],
) -> int:
'''
A per-term optimized `watch`-like alias for xonsh
that runs an arbitrary callable alias in a loop
inside the alt-screen buffer. Ctrl-C returns to a
pristine shell, SIGWINCH triggers a full redraw,
and the per-frame draw uses cursor-home + per-line
EL + post-draw erase-down so the loop is flicker-
free even when individual lines shrink or grow
between frames.
usage: acli.watch [-n SEC] <alias-name>
[alias-args]...
Examples:
acli.watch acli.ptree pytest
acli.watch -n 1.0 acli.bindspace_scan piker
acli.watch acli.hung_dump pytest
Only callable aliases (Python functions registered
in `aliases`) are supported. Subprocess-style
aliases raise an error — wrap them in a thin
callable if you need watching.
Output capture: the watched alias's stdout is
redirected into a `StringIO` per frame so we can
post-process it (insert `\033[K` before each `\n`).
Aliases that write directly to `sys.stdout.buffer`
or `os.write(1, ...)` bypass capture; for those the
EL-fix won't apply but the loop still functions.
'''
import argparse, io
from contextlib import redirect_stdout
parser = argparse.ArgumentParser(
prog='acli.watch',
description=watch.__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
'-n', '--interval',
type=float,
default=0.3,
help='poll interval in seconds (default: 0.3)',
)
parser.add_argument(
'alias',
help='name of a registered xonsh callable alias',
)
parser.add_argument(
'alias_args',
nargs=argparse.REMAINDER,
help='args forwarded to the watched alias',
)
try:
ns = parser.parse_args(args)
except SystemExit as se:
return int(se.code) if se.code is not None else 0
raw = aliases.get(ns.alias)
if raw is None:
print(
f'[acli.watch] no such alias: {ns.alias!r}'
)
return 1
# xonsh stores callable aliases as a bare callable
# OR wraps them in `[fn, *preset_args]` (depending
# on registration path / version). Unwrap both.
fn: Callable|None = None
preset_args: list = []
if callable(raw):
fn = raw
elif (
isinstance(raw, list)
and raw
and callable(raw[0])
):
fn = raw[0]
preset_args = list(raw[1:])
if fn is None:
kind: str = type(raw).__name__
print(
f'[acli.watch] alias {ns.alias!r} is not a '
f'callable alias (got {kind}); '
f'subprocess-style aliases not supported'
)
return 1
_FD: int = sys.stdout.fileno()
need_full_clear: bool = False
def _on_winch(signum, frame):
nonlocal need_full_clear
need_full_clear = True
prev_winch = signal.signal(
signal.SIGWINCH,
_on_winch,
)
prev_sigint = signal.signal(
signal.SIGINT,
signal.default_int_handler,
)
os.write(_FD, b'\033[?1049h\033[?25l')
try:
while True:
buf = io.StringIO()
with redirect_stdout(buf):
fn(preset_args + ns.alias_args)
if need_full_clear:
os.write(_FD, b'\033[H\033[2J')
need_full_clear = False
else:
os.write(_FD, b'\033[H')
# `\033[K` (EL) before each newline erases
# any stale tail chars left by a longer
# prior-frame version of the same line.
text: str = buf.getvalue()
painted: bytes = (
text.replace('\n', '\033[K\n').encode()
)
os.write(_FD, painted)
os.write(_FD, b'\033[J')
time.sleep(ns.interval)
except KeyboardInterrupt:
pass
finally:
os.write(_FD, b'\033[?25h\033[?1049l')
signal.signal(signal.SIGWINCH, prev_winch)
signal.signal(signal.SIGINT, prev_sigint)
return 0
# --- ptree ----------------------------------------------------
def _ptree(args):
def _ptree(
args: list[str],
):
'''
psutil-backed proc tree; per-proc classification into
severity-ordered buckets so leaked / defunct procs
@ -69,6 +229,13 @@ def _ptree(args):
See `tractor._testing.trace.dump_proc_tree()` for the
bucket semantics + classification details.
To watch this live with flicker-free repaint
(alt-screen, per-line EL, SIGWINCH-aware):
.. code-block:: xonsh
acli.watch acli.ptree pytest
'''
flag_tree: bool = False
pos_args: list = []
@ -400,6 +567,7 @@ _TCLI_ALIASES: dict = {
'acli.bindspace_scan': _bindspace_scan,
'acli.dump_all': _dump_all_alias,
'acli.reap': _tractor_reap,
'acli.watch': watch,
}
for _name, _fn in _TCLI_ALIASES.items():