Shorten SHM buffer key names to fit Darwin's 31
char filename limit by hashing the `fqme`/content
portion with `md5` and truncating to 8 hex chars.
Deats,
- `.fsp._api`: replace `piker.{actor}[{uuid}].{sym}`
format with `{uuid[:8]}_{hash}.fsp`
- `.tsp._history`: add `platform.system()` check to
conditionally shorten `.hist`/`.rt` keys on Darwin
while keeping the full key format on Linux
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
NOTE that this patch was orig by @dnks and broken out from a larger
commit which added unnecessary/out-of-scope changes we didn't end up
requiring.
Add `loglevel` param propagation across the data feed and sampling
subsystems to enable proper console log setup in downstream (distibuted)
subactor tasks. This ensures sampler and history-mgmt tasks receive the
same loglevel as their parent `.data.feed` tasks.
Deats,
- add `loglevel: str|None` param to `register_with_sampler()`,
`maybe_open_samplerd()`, and `open_sample_stream()`.
- pass `loglevel` through to `get_console_log()` in
`register_with_sampler()` with fallback to actor `loglevel`.
- use `partial()` in `allocate_persistent_feed()` to pass
`loglevel` to `manage_history()` at task-start.
- add `loglevel` param to `manage_history()` with default
`'warning'` and pass through to `open_sample_stream()` from there.
- capture `loglevel` var in `brokers.cli.search()` and pass to
`symbol_search()` call.
Also,
- drop blank lines in fn sigs for consistency with piker style.
- add debug bp in `open_feed()` when `loglevel != 'info'`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Change most sub-modules to use `get_logger(name=__name__)` for
per-leaf-module `log` instances vs previous subpkg-level/shared refs.
Primary changes,
- import `get_[console_]logger()` from top-level `piker.log` across leaf
mods.
- change any `<subsys>._util.log` logger-instances as well (though this
approach should no longer be used since it masks the endpoint module's
emissions.
Also,
- add a defaulted `loglevel: str` param to all `open_trade_dialog()`
endpoints, anticipating it being passed in by `.clearing`-engine.
- call `get_console_log(level=loglevel, name=__name__)` in each trade
dialog ep to enable per-`brokerd`-backend console writing.
- drop `get_logger` from `.brokers.__all__` exports
- fix type annotations: `str|None` vs `str | None`
- add TODOs for,
* comments in `._util` about multi-subsys logging
* `.accounting.__init__` about console log setup
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add frame-gap detection when `frame_last_dt < end_dt_param` to
warn about potential venue closures or missing data during the
backfill loop in `start_backfill()`.
Deats,
- add `frame_last_dt < end_dt_param` check after frame recv
- log warnings with EST-converted timestamps for clarity
- add `await tractor.pause()` for REPL-investigation on gaps
- add TODO comment about venue closure hour checking
- capture `_until_was_none` walrus var for null-check clarity
- add `last_time` assertion for `time[-1] == next_end_dt`
- rename `_daterr` to `nodata` with `_nodata` capture
Also,
- import `pendulum.timezone` and create `est` tz instance
- change `get_logger()` import from `.data._util` to `.log`
- add parens around `(next_prepend_index - ln) < 0` check
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Add timeout-gated wait for `feed_is_live: trio.Event` before passing shm
tokens to `open_sample_stream()`; skip registering shm-buffers with the
sampler if the feed doesn't "go live" within a new timeout.
The main motivation here is to avoid the sampler incrementing shm-array
bufs when the mkt-venue is closed so that a trailing "same price"
line/bars isn't updated/rendered in the chart's view when unnecessary.
Deats,
- add `wait_for_live_timeout: float = 0.5` param to `manage_history()`
- warn-log the fqme when timeout triggers
- add error log for invalid `frame_start_dt` comparisons to
`maybe_fill_null_segments()`.
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Enable the previously commented-out dedupe-and-write logic in
`start_backfill()` to ensure tsdb stays clean of duplicate
entries.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Switch `.tsp._annotate.markup_gaps()` to use new
`.ui._style.get_fonts()` API for font size calc on client side and add
optional `show_txt: bool` flag to toggle gap duration labels (with
default `False`).
Also,
- replace `sgn` checks with named bools: `up_gap`, `down_gap`
- use `small_font.px_size - 1` for gap label font sizing
- wrap text creation in `if show_txt:` block
- update IPC handler to use `get_fonts()` vs direct `_font` import
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Change annot-ctl APIs to return `None` on failure instead of invalid
`aid`s. Server now sends `{'error': msg}` dict on failures, client
match-blocks handle gracefully.
Also,
- update return types: `.add_rect()`, `.add_arrow()`, `.add_text()`
now return `int|None`
- match on `{'error': str(msg)}` in client IPC receive blocks
- send error dicts from server on timestamp lookup failures
- add failure handling in `markup_gaps()` to skip bad rects
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Fix annotation misalignment during backfill by switching from
client-computed indices to server-side timestamp lookups against
current shm state. Store absolute coords on annotations and
reposition on viz redraws.
Lowlevel impl deats,
- add `time` param to `.add_arrow()`, `.add_text()`, `.add_rect()`
- lookup indices from shm via timestamp matching in IPC handlers
- force chart redraw before `markup_gaps()` annotation creation
- wrap IPC send/receive in `trio.fail_after(3)` for timeout when
server fails to respond, likely hangs on no-case-match/error.
- cache `_meth`/`_kwargs` on rects, `_abs_x`/`_abs_y` on arrows
- auto-reposition all annotations after viz reset in redraw cmd
Also,
- handle `KeyError` for missing timeframes in chart lookup
- return `-1` aid on annotation creation failures (lol oh `claude`..)
- reconstruct rect positions from timestamps + BGM offset logic
- log repositioned annotation counts on viz redraw
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Prevent `ValueError` from negative prepend index in
`start_backfill()` by checking buffer space before push
attempts. Truncate incoming frame if needed and stop gracefully
when buffer full.
Also,
- add pre-push capacity check with frame truncation logic
- stop backfill when `next_prepend_index <= 0`
- log warnings for capacity exceeded and buffer-full conditions
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Adjust `humanize_duration()` to show "3h" instead of "3.0h" when the
duration value is a whole number, making labels cleaner.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Expose font sizing control for `pg.TextItem` annotations thru the
annot-ctl API. Default to `_font.font.pixelSize() - 3` when no
size provided.
Also,
- thread `font_size` param thru IPC handler in `serve_rc_annots()`
- apply font via `QFont.setPixelSize()` on text item creation
- add `?TODO` note in `markup_gaps()` re using `conf.toml` value
- update `add_text()` docstring with font_size param desc
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Introduce `humanize_duration()` helper in `.tsp._annotate` to
convert seconds to short human-readable format (d/h/m/s). Extend
annot-ctl API with `add_text()` method for placing `pg.TextItem`
labels on charts.
Also,
- add duration labels on RHS of gap arrows in `markup_gaps()`
- handle text item removal in `rm_annot()` match block
- expose `TextItem` cmd in `serve_rc_annots()` IPC handler
- use `hcolor()` for named-to-hex color conversion
- set anchor positioning for up vs down gaps
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
For now by REPLing them and raising an RTE inside `.ib.feed` as well as
tracing any such cases that make it (from other providers) up to the
`.tsp._history` layer during null-segment backfilling.
Such that they're easier to spot when zoomed out, a similar color to the
`RectItem`s and also remote-controlled via the `AnnotCtl` api.
Deats,
- request an arrow per gap from `markup_gaps()` using a new
`.add_arrow()` meth, set the color, direction and alpha with
position always as the `iend`/close of the last valid bar.
- extend the `.ui._remote_ctl` subys to support the above,
* add a new `AnnotCtl.add_arrow()`.
* add the service-side IPC endpoint for a 'cmd': 'ArrowEditor'.
- add a new `rm_annot()` helper to ensure the right graphics removal
API is used by annotation type:
* `pg.ArrowItem` looks up the `ArrowEditor` and calls `.remove(annot).
* `pg.SelectRect` keeps with calling `.delete()`.
- global-ize an `_editors` table to enable the prior.
- add an explicit RTE for races on the chart-actor's `_dss` init.
Namely insertion writes which over-fill the shm buffer past the latest
tsdb sample via `.tsp._history.shm_push_in_between()`.
Deats,
- check earliest `to_push` timestamp and enter pause point if it's
earlier then the tsdb's `backfill_until_dt` stamp.
- requires actually passing the `backfill_until_dt: datetime` thru,
* `get_null_segs()`
* `maybe_fill_null_segments()`
* `shm_push_in_between()` (obvi XD)
Using `claude`, add a `.tsp._dedupe_smart` module that attemps "smarter"
duplicate bars by attempting to distinguish between erroneous bars
partially written during concurrent backfill race conditions vs.
**actual** data quality issues from historical providers.
Problem:
--------
Concurrent writes (live updates vs. backfilling) can result in create
duplicate timestamped ohlcv vars with different values. Some
potential scenarios include,
- a market live feed is cancelled during live update resulting in the
"last" datum being partially updated with all the ticks for the
time step.
- when the feed is rebooted during charting, the backfiller will not
finalize this bar since rn it presumes it should only fill data for
time steps not already in the tsdb storage.
Our current naive `.unique()` approach obvi keeps the incomplete bar
and a "smarter" approach is to compare the provider's final vlm
amount vs. the maybe-cancelled tsdb's bar; a higher vlm value from
the provider likely indicates the cancelled-during-live-write and
**not** a datum discrepancy from said data provider.
Analysis (with `claude`) of `zecusdt` data revealed:
- 1000 duplicate timestamps
- 999 identical bars (pure duplicates from 2022 backfill overlap)
- 1 volume-monotonic conflict (live partial vs backfill complete)
A soln from `claude` -> `tsp._dedupe_smart.dedupe_ohlcv_smart()`
which:
- sorts by vlm **before** deduplication and keep the most complete
bar based on vlm monotonicity as well as the following OHLCV
validation assumptions:
* volume should always increase
* high should be non-decreasing,
* low should be non-increasing
* open should be identical
- Separates valid race conditions from provider data quality issues
and reports and returns both dfs.
Change summary by `claude`:
- `.tsp._dedupe_smart`: new module with validation logic
- `.tsp.__init__`: expose `dedupe_ohlcv_smart()`
- `.storage.cli`: integrate smart dedupe, add logging for:
* duplicate counts (identical vs monotonic races)
* data quality violations (non-monotonic, invalid OHLC ranges)
* warnings for provider data issues
- Remove `assert not diff` (duplicates are valid now)
Verified on `zecusdt`: correctly keeps index 3143645
(volume=287.777) over 3143644 (volume=140.299) for
conflicting 2026-01-16 18:54 UTC bar.
`claude`'s Summary of reasoning
-------------------------------
- volume monotonicity is critical: a bar's volume only increases
during its time window.
- a backfilled bar should always have volume >= live updated.
- violations indicate any of:
* Provider data corruption
* Non-OHLCV aggregation semantics
* Timestamp misalignment
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Polars tightened type safety for `.dt` accessor methods requiring
`total_*` methods for duration types vs datetime component accessors
like `day()` which now only work on datetime dtypes.
`detect_time_gaps()` in `.tsp._anal` was calling `.dt.day()`
on `dt_diff` column (a duration from `.diff()`) which throws
`InvalidOperationError` on modern polars.
Changes:
- use f-string to add pluralization to map time unit strings to
`total_<unit>s` form for the new duration API.
- Handle singular/plural forms: 'day' -> 'days' -> 'total_days'
- Ensure trailing 's' before applying 'total_' prefix
Also updates inline comments explaining the polars type distinction
between datetime components vs duration totals.
Fixes `piker store ldshm` crashes on datasets with time gaps.
(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
The `await feed_is_live.wait()` is more or less pointless and would only
cause slower startup afaig (as-far-as-i-grok) so i'm masking it here.
This also removes the final `strict_exception_groups=False` use from the
non-tests code base, flipping to the `tractor.trionics` collapser once
and for all!
Topically, throughout various (seemingly) console-UX-affecting or benign
spots in the code base; nothing that required more intervention beyond
things superficial. A few spots also include `trio.Nursery` ref renames
(always to something with a `tn` in it) and log-level reductions to
quiet (benign) console noise oriented around issues meant to be solved
long..
Note there's still a couple spots i left with the loose-ify flag because
i haven't fully tested them without using the latest version of
`tractor.trionics.collapse_eg()`, but more then likely they should flip
over fine.
Since I was trying out the neat lookin `polars-fuzzy-match` (also added
for now as a core dep here) which requires the new plugin sys, plus it's
about time we synced with upstream!
Adjust some column syntax to the new `.name` sub-field-space and the
`uv` lock-file to match.
Other,
- add back `trio-typing` bc i guess something else needs it (debug
tooling stuff in new `tractor`?)
- flip back to the `tractor` pre-main pin since the new `main`-branch
requires new `trio` stuff we haven't ported yet..
Namely handling backends which do not provide a default "frame
size-duration" in their init-config by making the backfiller guess the
value based on the first frame received.
Deats,
- adjust `start_backfill()` to take a more explicit
`def_frame_duration: Duration` expected to be unpacked from any
backend hist init-config by the `tsdb_backfill()` caller which now
also computes a value from the first received frame when the config
section isn't provided.
- in `start_backfill()` we now always expect the `def_frame_duration`
input and always decrement the query range by this value whenever
a `NoData` is raised by the provider-backend paired with an explicit
`log.warning()` about the handling.
- also relay any `DataUnavailable.args[0]` message from the provider
in the handler.
- repair "gap reporting" which checks for expected frame duration vs.
that received with much better humanized logging on the missing
segment using `pendulum.Interval/Duration.in_words()` output.
The `open_history_client()` provider endpoint can *optionally*
deliver a `frame_types: dict[int, pendulum.Duration]` subsection in its
`config: dict[str, dict]` (as was implemented with the `ib` backend).
This allows the `tsp` backfilling machinery to use this "recommended
frame duration" to subtract from the `last_start_dt` any time a `NoData`
gap is signalled by the `get_hist()` call allowing gaps to be ignored
safely without missing history by knowing the next earliest dt we can
query from using the `end_dt`. However, currently all crypto$ providers
haven't implemented this feat yet..
As such only try to use the `frame_types` feature if provided when
handling `NoData` conditions inside `tsp.start_backfill()` and otherwise
raise as normal.
Was orig for debugging an issue with `kucoin` i think but definitely
shouldn't be left in XD
Also add `'perpetual_future'` to the `.start_backfill()` input literal
set since we don't expect the 'btc/usd.perp.binance' for now.
Presuming the data provider gives us a config with a `frame_types: dict`
(indicating frame sizes per query/request) we try to be clever and
decrement our submitted `end_dt: DateTime` based on it.. hoping for the
best on the next attempt.
Thanks to oremanj in the `trio` room for this hot style tip which i much
prefer to have less LOC and places to change sub-pkg name exports!
Also drop expecting a `gaps` frame output from `dedupe()`.
Turns out we were always filtering to time gaps longer then a day smh..
Instead tweak `detect_time_gaps()` to only return venue-gaps when
a `gap_dt_unit: str` is passed and pass `'days'` (like it was by default
before) from `dedupe()` though we should really pass in an actual venue
gap duration in the future.
Got borked by the logic re-factoring to get more conc going around
tsdb vs. latest frame loads with nested nurseries. So, repair all that
such that we can still backfill symbols previously not loaded as well as
drop all the `_FeedBus` instance passing to subtasks where it's
definitely not needed.
Toss in a pause point around sampler stream `'backfilling'` msgs as well
since there's seems to be a weird ctx-cancelled propagation going on
when a feed client disconnects during backfill and this might be where
the src `tractor.ContextCancelled` is getting bubbled from?
Can't ref `dt_eps` and `tsdb_entry` if they don't exist.. like for 1s
sampling from `binance` (which dne). So make sure to add better logic
guard and only open the finaly backload nursery if we actually need to
fill the gap between latest history and where tsdb history ends.
TO CHERRY #486
Move `.data.history` -> `.tsp.__init__.py` for now as main pkg-mod
and `.data.tsp` -> `.tsp._anal` (for analysis).
Obviously follow commits will change surrounding codebase (imports) to
match..