piker/.claude/skills/piker_profiling.md

385 lines
8.2 KiB
Markdown

# Piker Profiling Subsystem Skill
Skill for using `piker.toolz.profile.Profiler` to measure
performance across distributed actor systems.
## Core Profiler API
### Basic Usage
```python
from piker.toolz.profile import (
Profiler,
pg_profile_enabled,
ms_slower_then,
)
profiler = Profiler(
msg='<description of profiled section>',
disabled=False, # IMPORTANT: enable explicitly!
ms_threshold=0.0, # show all timings, not just slow
)
# do work
some_operation()
profiler('step 1 complete')
# more work
another_operation()
profiler('step 2 complete')
# prints on exit:
# > Entering <description of profiled section>
# step 1 complete: 12.34, tot:12.34
# step 2 complete: 56.78, tot:69.12
# < Exiting <description of profiled section>, total: 69.12 ms
```
### Default Behavior Gotcha
**CRITICAL:** Profiler is disabled by default in many contexts!
```python
# BAD: might not print anything!
profiler = Profiler(msg='my operation')
# GOOD: explicit enable
profiler = Profiler(
msg='my operation',
disabled=False, # force enable!
ms_threshold=0.0, # show all steps
)
```
### Profiler Output Format
```
> Entering <msg>
<label 1>: <delta_ms>, tot:<cumulative_ms>
<label 2>: <delta_ms>, tot:<cumulative_ms>
...
< Exiting <msg>, total time: <total_ms> ms
```
**Reading the output:**
- `delta_ms` = time since previous checkpoint
- `cumulative_ms` = time since profiler creation
- Final total = end-to-end time for entire profiled section
## Profiling Distributed Systems
Piker runs across multiple processes (actors). Each actor has
its own log output. To profile distributed operations:
### 1. Identify Actor Boundaries
**Common piker actors:**
- `pikerd` - main daemon process
- `brokerd` - broker connection actor
- `chart` - UI/graphics actor
- Client scripts - analysis/annotation clients
### 2. Add Profilers on Both Sides
**Server-side (chart actor):**
```python
# piker/ui/_remote_ctl.py
@tractor.context
async def remote_annotate(ctx):
async with ctx.open_stream() as stream:
async for msg in stream:
profiler = Profiler(
msg=f'Batch annotate {n} gaps',
disabled=False,
ms_threshold=0.0,
)
# handle request
result = await handle_request(msg)
profiler('request handled')
await stream.send(result)
profiler('result sent')
```
**Client-side (analysis script):**
```python
# piker/tsp/_annotate.py
async def markup_gaps(...):
profiler = Profiler(
msg=f'markup_gaps() for {n} gaps',
disabled=False,
ms_threshold=0.0,
)
await actl.redraw()
profiler('initial redraw')
# build specs
specs = build_specs(gaps)
profiler('built annotation specs')
# IPC round-trip!
result = await actl.add_batch(specs)
profiler('batch IPC call complete')
await actl.redraw()
profiler('final redraw')
```
### 3. Correlate Timing Across Actors
**Example output correlation:**
**Client console:**
```
> Entering markup_gaps() for 1285 gaps
initial redraw: 0.20ms, tot:0.20
built annotation specs: 256.48ms, tot:256.68
batch IPC call complete: 119.26ms, tot:375.94
final redraw: 0.07ms, tot:376.02
< Exiting markup_gaps(), total: 376.04ms
```
**Server console (chart actor):**
```
> Entering Batch annotate 1285 gaps
`np.searchsorted()` complete!: 0.81ms, tot:0.81
`time_to_row` creation complete!: 98.45ms, tot:99.28
created GapAnnotations item: 2.98ms, tot:102.26
< Exiting Batch annotate, total: 104.15ms
```
**Analysis:**
- Total client time: 376ms
- Server processing: 104ms
- IPC overhead + client spec building: 272ms
- Bottleneck: client-side spec building (256ms)
## Profiling Patterns
### Pattern: Function Entry/Exit
```python
async def my_function():
profiler = Profiler(
msg='my_function()',
disabled=False,
ms_threshold=0.0,
)
step1()
profiler('step1')
step2()
profiler('step2')
# auto-prints on exit
```
### Pattern: Loop Iterations
```python
# DON'T profile inside tight loops (overhead!)
for i in range(1000):
profiler(f'iteration {i}') # NO!
# DO profile around loops
profiler = Profiler(msg='processing 1000 items')
for i in range(1000):
process(item[i])
profiler('processed all items')
```
### Pattern: Conditional Profiling
```python
# only profile when investigating specific issue
DEBUG_REPOSITION = True
def reposition(self, array):
if DEBUG_REPOSITION:
profiler = Profiler(
msg='GapAnnotations.reposition()',
disabled=False,
)
# ... do work
if DEBUG_REPOSITION:
profiler('completed reposition')
```
### Pattern: Teardown/Cleanup Profiling
```python
try:
# ... main work
pass
finally:
profiler = Profiler(
msg='Annotation teardown',
disabled=False,
ms_threshold=0.0,
)
cleanup_resources()
profiler('resources cleaned')
close_connections()
profiler('connections closed')
```
## Integration with PyQtGraph
Some piker modules integrate with `pyqtgraph`'s profiling:
```python
from piker.toolz.profile import (
Profiler,
pg_profile_enabled, # checks pyqtgraph config
ms_slower_then, # threshold from config
)
profiler = Profiler(
msg='Curve.paint()',
disabled=not pg_profile_enabled(),
ms_threshold=ms_slower_then,
)
```
## Common Use Cases
### 1. IPC Request/Response Timing
```python
# Client side
profiler = Profiler(msg='Remote request')
result = await remote_call()
profiler('got response')
# Server side (in handler)
profiler = Profiler(msg='Handle request')
process_request()
profiler('request processed')
```
### 2. Batch Operation Optimization
```python
profiler = Profiler(msg='Batch processing')
# collect items
items = collect_all()
profiler(f'collected {len(items)} items')
# vectorized operation
results = numpy_batch_op(items)
profiler('numpy op complete')
# build result dict
output = {k: v for k, v in zip(keys, results)}
profiler('dict built')
```
### 3. Startup/Initialization Timing
```python
async def __aenter__(self):
profiler = Profiler(msg='Service startup')
await connect_to_broker()
profiler('broker connected')
await load_config()
profiler('config loaded')
await start_feeds()
profiler('feeds started')
return self
```
## Debugging Performance Regressions
When profiler shows unexpected slowness:
1. **Add finer-grained checkpoints**
```python
# was:
result = big_function()
profiler('big_function done')
# now:
profiler = Profiler(msg='big_function internals')
step1 = part_a()
profiler('part_a')
step2 = part_b()
profiler('part_b')
step3 = part_c()
profiler('part_c')
```
2. **Check for hidden iterations**
```python
# looks simple but might be slow!
result = array[array['time'] == timestamp]
profiler('array lookup')
# reveals O(n) scan per call
for ts in timestamps: # outer loop
row = array[array['time'] == ts] # O(n) scan!
```
3. **Isolate IPC from computation**
```python
# was: can't tell where time is spent
result = await remote_call(data)
profiler('remote call done')
# now: separate phases
payload = prepare_payload(data)
profiler('payload prepared')
result = await remote_call(payload)
profiler('IPC complete')
parsed = parse_result(result)
profiler('result parsed')
```
## Performance Expectations
**Typical timings to expect:**
- IPC round-trip (local actors): 1-10ms
- NumPy binary search (10k array): <1ms
- Dict building (1k items, simple): 1-5ms
- Qt redraw trigger: 0.1-1ms
- Scene item removal (100s items): 10-50ms
**Red flags:**
- Linear array scan per item: 50-100ms+ for 1k items
- Dict comprehension with struct array: 50-100ms for 1k
- Individual Qt item creation: 5ms per item
## References
- `piker/toolz/profile.py` - Profiler implementation
- `piker/ui/_curve.py` - FlowGraphic paint profiling
- `piker/ui/_remote_ctl.py` - IPC handler profiling
- `piker/tsp/_annotate.py` - Client-side profiling
## Skill Maintenance
Update when:
- New profiling patterns emerge
- Performance expectations change
- New distributed profiling techniques discovered
- Profiler API changes
---
*Last updated: 2026-01-31*
*Session: Batch gap annotation optimization*