Update Storm Simulation with NRTPerfTest #438
nipunbatra8 wants to merge 8 commits into mikemccand:main from
mikemccand
left a comment
This looks great! Do you have any example segment traces you could attach for some eye-candy? I'm curious what War & Peace times look like w/ the default ConcurrentMergeScheduler...
Also, this tool (NRTPerfTest) is used by the nightly benchmark, so I'd rather make these new options opt-in so we don't change the behavior for nightlies, at least on first cut. It produces this chart (phew, 14+ years of data now!).
ytgu
left a comment
Thanks for the changes!
+1 for making these features opt-in so that people can use them to simulate update storms when needed.
Here is the aggregate graph and the segment graph with the default
Yes, made it opt-in so it doesn't disturb the nightlies for now.
Hi @nipunbatra8 -- is this still a draft PR? Or is it ready for merging after review?
mikemccand
left a comment
Thank you @nipunbatra8! I'm still wondering how we could fold this into nightly benchy. Have you opened a spinoff issue for that?
Oh yes, I see it! #459 Thanks.
Hi @mikemccand. Yes, let me write a comment that explains the specific config options instead of spreading them through the tool, and then I will mark it for review.
Ready for merging!
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
Motivation
Add a reproducible “update storm” workload to study merge behavior and evaluate a potential bandwidth-capped merge scheduler. This is also intended for future contributors to benchmark and tune their own merge schedulers under realistic update-storm conditions.
Summary of changes
setDeletesPctAllowed(2.0)) for storm scenarios.
InfoStream to file for offline analysis and graphing.
What we observe
Parameters varied (NRTPerfTest.java)
docsPerSec (ramp ×2/20s), numIndexThreads (≤ docsPerSec), runTimeSec
numSearchThreads, reopenPerSec
setDeletesPctAllowed(2.0)
InfoStream to file
On-graph metrics tracked
Reproduce
python3 util/src/python/nrtPerf.py -source wikimediumall -dps 200 -rps 0.06 -nst 8 -nit 8 -rts 3000
cd util/src/python/
python3 -u infostream_to_segments.py ../../lucene-infostream.log test-output.pk
python3 -u segments_to_html.py test-output.pk out.html
out.html or segmetrics.html).
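For repeated runs, the three commands above could be chained from a small driver. This is only a sketch: the helper names `build_steps` and `run_steps` are hypothetical and not part of this PR, and the keyword names (`dps`, `rps`, `nst`, `nit`, `rts`) simply mirror the command-line flags shown above. The script paths, flags, and file names are copied from the commands as written. It defaults to a dry run that only prints each command.

```python
import shlex
import subprocess
from pathlib import Path


def build_steps(repo_root, source="wikimediumall", dps=200, rps=0.06,
                nst=8, nit=8, rts=3000):
    """Return (working_dir, argv) pairs mirroring the Reproduce commands.

    Hypothetical helper: only the command lines themselves come from the PR.
    """
    root = Path(repo_root)
    scripts = root / "util" / "src" / "python"
    return [
        # Step 1: run the NRT perf test from the repository root.
        (root, ["python3", "util/src/python/nrtPerf.py",
                "-source", source, "-dps", str(dps), "-rps", str(rps),
                "-nst", str(nst), "-nit", str(nit), "-rts", str(rts)]),
        # Step 2: parse the InfoStream log into a segment trace file.
        (scripts, ["python3", "-u", "infostream_to_segments.py",
                   "../../lucene-infostream.log", "test-output.pk"]),
        # Step 3: render the segment trace as HTML.
        (scripts, ["python3", "-u", "segments_to_html.py",
                   "test-output.pk", "out.html"]),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in steps:
        print(f"(cd {cwd} && {shlex.join(argv)})")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(build_steps("."), dry_run=True)
```

Passing `dry_run=False` would actually execute the steps, so a storm with a different rate is just `build_steps(".", dps=400)` followed by `run_steps(..., dry_run=False)`.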