Piker Profiling Subsystem Skill

Skill for using piker.toolz.profile.Profiler to measure performance across distributed actor systems.

Core Profiler API

Basic Usage

from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,
    ms_slower_then,
)

profiler = Profiler(
    msg='<description of profiled section>',
    disabled=False,  # IMPORTANT: enable explicitly!
    ms_threshold=0.0,  # show all timings, not just slow
)

# do work
some_operation()
profiler('step 1 complete')

# more work
another_operation()
profiler('step 2 complete')

# prints on exit:
# > Entering <description of profiled section>
#   step 1 complete: 12.34, tot:12.34
#   step 2 complete: 56.78, tot:69.12
# < Exiting <description of profiled section>, total: 69.12 ms

Default Behavior Gotcha

CRITICAL: Profiler is disabled by default in many contexts!

# BAD: might not print anything!
profiler = Profiler(msg='my operation')

# GOOD: explicit enable
profiler = Profiler(
    msg='my operation',
    disabled=False,  # force enable!
    ms_threshold=0.0,  # show all steps
)

Profiler Output Format

> Entering <msg>
  <label 1>: <delta_ms>, tot:<cumulative_ms>
  <label 2>: <delta_ms>, tot:<cumulative_ms>
  ...
< Exiting <msg>, total time: <total_ms> ms

Reading the output:

  • delta_ms = time since the previous checkpoint
  • cumulative_ms = time since profiler creation (the running sum of deltas)
  • Final total = end-to-end time for the entire profiled section

In the basic-usage example above, 12.34 + 56.78 = 69.12, so the cumulative column is just the running sum of the per-step deltas.

Profiling Distributed Systems

Piker runs across multiple processes (actors). Each actor has its own log output. To profile distributed operations:

1. Identify Actor Boundaries

Common piker actors:

  • pikerd - main daemon process
  • brokerd - broker connection actor
  • chart - UI/graphics actor
  • client scripts - analysis/annotation clients

Since each actor prints to its own console, it can help to tag profiler messages with the emitting actor's name; a sketch of one way to do that follows.
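A minimal sketch, assuming the profiler is created inside a running tractor actor (so tractor.current_actor() is available); the actor_profiler() helper name is hypothetical, not part of piker:

import tractor
from piker.toolz.profile import Profiler

def actor_profiler(msg: str) -> Profiler:
    # hypothetical helper: prefix the profiled-section description with
    # the current actor's name so output from different consoles can be
    # matched up by eye (or by grep).
    actor_name: str = tractor.current_actor().name
    return Profiler(
        msg=f'[{actor_name}] {msg}',
        disabled=False,
        ms_threshold=0.0,
    )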

2. Add Profilers on Both Sides

Server-side (chart actor):

# piker/ui/_remote_ctl.py
@tractor.context
async def remote_annotate(ctx):
    async with ctx.open_stream() as stream:
        async for msg in stream:
            profiler = Profiler(
                # `n` = number of gaps carried by the incoming msg
                msg=f'Batch annotate {n} gaps',
                disabled=False,
                ms_threshold=0.0,
            )

            # handle request
            result = await handle_request(msg)
            profiler('request handled')

            await stream.send(result)
            profiler('result sent')

Client-side (analysis script):

# piker/tsp/_annotate.py
async def markup_gaps(...):
    profiler = Profiler(
        msg=f'markup_gaps() for {n} gaps',
        disabled=False,
        ms_threshold=0.0,
    )

    await actl.redraw()
    profiler('initial redraw')

    # build specs
    specs = build_specs(gaps)
    profiler('built annotation specs')

    # IPC round-trip!
    result = await actl.add_batch(specs)
    profiler('batch IPC call complete')

    await actl.redraw()
    profiler('final redraw')

3. Correlate Timing Across Actors

Example output correlation:

Client console:

> Entering markup_gaps() for 1285 gaps
  initial redraw: 0.20ms, tot:0.20
  built annotation specs: 256.48ms, tot:256.68
  batch IPC call complete: 119.26ms, tot:375.94
  final redraw: 0.07ms, tot:376.02
< Exiting markup_gaps(), total: 376.04ms

Server console (chart actor):

> Entering Batch annotate 1285 gaps
  `np.searchsorted()` complete!: 0.81ms, tot:0.81
  `time_to_row` creation complete!: 98.45ms, tot:99.28
  created GapAnnotations item: 2.98ms, tot:102.26
< Exiting Batch annotate, total: 104.15ms

Analysis:

  • Total client time: 376ms
  • Server processing: 104ms
  • IPC overhead + client spec building: 272ms
  • Bottleneck: client-side spec building (256ms)
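The arithmetic behind that breakdown, using the totals copied from the two console outputs above:

# totals from the two console outputs above
client_total_ms = 376.04   # markup_gaps() end-to-end
server_total_ms = 104.15   # chart-actor batch handler

# everything the server did not account for: IPC transport,
# serialization, and client-side work (dominated here by the
# ~256ms spec-building step)
unaccounted_ms = client_total_ms - server_total_ms  # ~272ms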

Profiling Patterns

Pattern: Function Entry/Exit

async def my_function():
    profiler = Profiler(
        msg='my_function()',
        disabled=False,
        ms_threshold=0.0,
    )

    step1()
    profiler('step1')

    step2()
    profiler('step2')

    # auto-prints on exit

Pattern: Loop Iterations

# DON'T profile inside tight loops (overhead!)
for i in range(1000):
    profiler(f'iteration {i}')  # NO!

# DO profile around loops
profiler = Profiler(
    msg='processing 1000 items',
    disabled=False,  # see the default-behavior gotcha above
)
for i in range(1000):
    process(items[i])
profiler('processed all items')

Pattern: Conditional Profiling

# only profile when investigating specific issue
DEBUG_REPOSITION = True

def reposition(self, array):
    if DEBUG_REPOSITION:
        profiler = Profiler(
            msg='GapAnnotations.reposition()',
            disabled=False,
        )

    # ... do work

    if DEBUG_REPOSITION:
        profiler('completed reposition')

Pattern: Teardown/Cleanup Profiling

try:
    # ... main work
    pass
finally:
    profiler = Profiler(
        msg='Annotation teardown',
        disabled=False,
        ms_threshold=0.0,
    )

    cleanup_resources()
    profiler('resources cleaned')

    close_connections()
    profiler('connections closed')

Integration with PyQtGraph

Some piker modules integrate with pyqtgraph's profiling config:

from piker.toolz.profile import (
    Profiler,
    pg_profile_enabled,  # checks pyqtgraph config
    ms_slower_then,      # threshold from config
)

profiler = Profiler(
    msg='Curve.paint()',
    disabled=not pg_profile_enabled(),
    ms_threshold=ms_slower_then,
)

Common Use Cases

1. IPC Request/Response Timing

# Client side
profiler = Profiler(msg='Remote request', disabled=False)
result = await remote_call()
profiler('got response')

# Server side (in handler)
profiler = Profiler(msg='Handle request', disabled=False)
process_request()
profiler('request processed')

2. Batch Operation Optimization

profiler = Profiler(msg='Batch processing', disabled=False)

# collect items
items = collect_all()
profiler(f'collected {len(items)} items')

# vectorized operation
results = numpy_batch_op(items)
profiler('numpy op complete')

# build result dict
output = {k: v for k, v in zip(keys, results)}
profiler('dict built')

3. Startup/Initialization Timing

async def __aenter__(self):
    profiler = Profiler(msg='Service startup', disabled=False)

    await connect_to_broker()
    profiler('broker connected')

    await load_config()
    profiler('config loaded')

    await start_feeds()
    profiler('feeds started')

    return self

Debugging Performance Regressions

When profiler shows unexpected slowness:

  1. Add finer-grained checkpoints

    # was:
    result = big_function()
    profiler('big_function done')
    
    # now:
    profiler = Profiler(msg='big_function internals', disabled=False)
    step1 = part_a()
    profiler('part_a')
    step2 = part_b()
    profiler('part_b')
    step3 = part_c()
    profiler('part_c')
  2. Check for hidden iterations

    # looks simple but might be slow!
    result = array[array['time'] == timestamp]
    profiler('array lookup')
    
    # reveals O(n) scan per call
    for ts in timestamps:  # outer loop
        row = array[array['time'] == ts]  # O(n) scan!
  3. Isolate IPC from computation

    # was: can't tell where time is spent
    result = await remote_call(data)
    profiler('remote call done')
    
    # now: separate phases
    payload = prepare_payload(data)
    profiler('payload prepared')
    
    result = await remote_call(payload)
    profiler('IPC complete')
    
    parsed = parse_result(result)
    profiler('result parsed')

Performance Expectations

Typical timings to expect:

  • IPC round-trip (local actors): 1-10ms
  • NumPy binary search (10k array): <1ms
  • Dict building (1k items, simple): 1-5ms
  • Qt redraw trigger: 0.1-1ms
  • Scene item removal (100s items): 10-50ms

Red flags:

  • Linear array scan per item: 50-100ms+ for 1k items (see the sketch below)
  • Dict comprehension with struct array: 50-100ms for 1k items
  • Individual Qt item creation: 5ms per item
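A minimal, self-contained sketch of the first red flag and its fix: a per-item boolean scan versus one vectorized np.searchsorted() call over the sorted time column (the same technique checkpointed in the chart-actor example above). Array sizes are made up for illustration.

import numpy as np

times = np.arange(10_000, dtype=np.int64)   # sorted 'time' column
targets = times[::10]                        # ~1k timestamps to resolve

# red flag: one O(n) boolean scan per target timestamp
rows_slow = [int(np.nonzero(times == ts)[0][0]) for ts in targets]

# preferred: a single batched binary search over the sorted column
rows_fast = np.searchsorted(times, targets)

assert rows_slow == rows_fast.tolist()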

References

  • piker/toolz/profile.py - Profiler implementation
  • piker/ui/_curve.py - FlowGraphic paint profiling
  • piker/ui/_remote_ctl.py - IPC handler profiling
  • piker/tsp/_annotate.py - Client-side profiling

Skill Maintenance

Update when:

  • New profiling patterns emerge
  • Performance expectations change
  • New distributed profiling techniques are discovered
  • The Profiler API changes


Last updated: 2026-01-31
Session: Batch gap annotation optimization