Benchmark Haystack Pipeline #10894

@srini047

Description

Currently there is no native way to benchmark a pipeline and its components. The options are either manual (intuition) or relying on logging and deriving results from traces/spans. That works, but tracing adds latency of its own and can miss the few milliseconds lost to network or function-call overhead. A native way to benchmark pipelines would make it easier to compare runs and derive statistically meaningful metrics from a pipeline.

Describe the solution you'd like

  • Benchmark as part of Pipeline() itself
  • No external dependencies (Python standard library only)
  • Results should be reported at both the pipeline level and the per-component level
  • Report percentiles rather than only an average, since percentiles are less skewed by outliers and give a more user-centric view of how the pipeline actually performs in the real world. p50, p90, and p99 are a must, with avg and total displayed as well.
  • Display the benchmark results in a user-friendly way.
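
To make the request concrete, here is a rough sketch of the kind of measurement described in the list above, using only the standard library. Everything in it is illustrative: `summarize`, `benchmark`, and the dict of plain callables standing in for components are hypothetical names, not part of Haystack's API or of the proposed PR.

```python
import statistics
import time
from typing import Any, Callable, Dict, List


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Return p50/p90/p99, average and total for durations given in ms."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    # It needs at least two samples, so run the benchmark more than once.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
        "avg": statistics.fmean(samples_ms),
        "total": sum(samples_ms),
    }


def benchmark(
    steps: Dict[str, Callable[[Any], Any]], data: Any, runs: int = 100
) -> Dict[str, Dict[str, float]]:
    """Run the steps in order `runs` times, timing each step and the whole chain."""
    # Assumption: no step is itself named "pipeline".
    timings: Dict[str, List[float]] = {name: [] for name in steps}
    timings["pipeline"] = []
    for _ in range(runs):
        value = data
        run_start = time.perf_counter()
        for name, step in steps.items():
            start = time.perf_counter()
            value = step(value)
            timings[name].append((time.perf_counter() - start) * 1000)
        timings["pipeline"].append((time.perf_counter() - run_start) * 1000)
    return {name: summarize(samples) for name, samples in timings.items()}


if __name__ == "__main__":
    # Two trivial stand-in "components" chained together.
    report = benchmark({"upper": str.upper, "split": str.split}, "hello haystack", runs=50)
    columns = ("p50", "p90", "p99", "avg", "total")
    print(f"{'name':<10}" + "".join(f"{c:>12}" for c in columns))
    for name, row in report.items():
        print(f"{name:<10}" + "".join(f"{row[c]:>12.4f}" for c in columns))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations. A real integration would time actual component runs inside the pipeline rather than a linear chain of callables.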

Describe alternatives you've considered

  • Tracing the pipeline
  • Retrieving timestamps from the tracer spans.
  • Computing per-component and pipeline-level metrics from those timestamps and deducing the results from them.

This isn't good DX, and the resulting metrics won't match real-world pipeline runs either.
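
For comparison, the tracing-based workaround boils down to something like the sketch below. The span shape (component name, start, end, in seconds) and the sample values are made up for illustration. Real tracer spans would first have to be exported and mapped into this form, which is part of why the approach is awkward.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical span record: (component_name, start_seconds, end_seconds).
Span = Tuple[str, float, float]


def durations_by_component(spans: Iterable[Span]) -> Dict[str, List[float]]:
    """Group span durations (converted to ms) by component name."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for name, start, end in spans:
        grouped[name].append((end - start) * 1000)
    return dict(grouped)


if __name__ == "__main__":
    # Made-up spans from two pipeline runs.
    spans = [
        ("retriever", 0.000, 0.120),
        ("generator", 0.125, 0.925),
        ("retriever", 1.000, 1.100),
    ]
    for name, samples in durations_by_component(spans).items():
        print(name, [round(ms, 1) for ms in samples])
```

Percentiles would then still have to be computed separately on top of these lists.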

Additional context
I'm raising a PR for review, since I feel this would be a valuable addition to Haystack pipelines.

Metadata

Labels

P2: Medium priority, add to the next sprint if no P1 available
