Precompute all last commit timestamps in on_files#116
kunickiaj wants to merge 1 commit into timvink:master from
Conversation
In parallel, precompute all last commit timestamps in on_files so that pages can be processed more quickly. We need to do this when we have all the files, so the work can be done in parallel rather than per page in on_page_markdown. This does not precompute the first commit timestamp.
Sorry for the very late reply, this project has not been a priority. Very cool PR, a 5.5x improvement is considerable! One problem I see, however, is using the … They basically create a new … And then they update the … So this bit from the PR will need some more edge case handling:
Another promising avenue might be to tweak git a bit; there are a couple of settings for large repos that might help: https://www.git-tower.com/blog/git-performance/ Have you tried something like that? It might be worth documenting in this plugin.
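The linked article covers git's large-repo features. A hedged sketch of the kind of configuration likely meant (exact availability depends on git version; these are standard git config keys, not settings this plugin applies):

```shell
# Speed up history queries (git log) by precomputing a commit-graph file:
git config core.commitGraph true
git config gc.writeCommitGraph true
git commit-graph write --reachable

# Reduce per-command cost in repos with very many files:
git config feature.manyFiles true
git config core.untrackedCache true
```

Note these mostly help commands that walk the working tree or history; a plugin that runs one git log per page still pays per-invocation overhead, which is what the parallel precompute in this PR addresses.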
Yeah, we're well aware of all those git features that make monorepos less of a pain, but it is still incredibly slow. To be fair, when updating docs for a single project or two the time hit is probably still acceptable, as the application CI is going to take longer in most cases -- but a bulk update across many docs in the repo is going to time out CI. (Not to mention the dollar cost of longer-running CI in general.)
I added a fallback; merged in #166.
Can significantly improve wall time. Ref: #115
Looking for some feedback on this approach. If it looks reasonable, we can figure out support for the first commit timestamp, as well as a way to configure parallelism. It currently uses the minimum of 10 and however many CPUs are reported.
On an M1 Max MacBook Pro (8 performance cores, 2 efficiency cores) this resulted in a speed-up of ~5.5x when processing a large monorepo: from 378 seconds down to 69 seconds. Tested on 78 markdown files rendered in a repo of approximately 700k commits and 500k files.
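The approach described above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the function names, the git invocation, and the worker-pool shape are assumptions; only the min(10, cpu_count) cap mirrors what the PR states.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def last_commit_timestamp(path):
    """Return the Unix timestamp of the last commit touching `path` (0 if none found)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%at", "--", path],
        capture_output=True, text=True,
    ).stdout.strip()
    return int(out) if out else 0


def precompute_timestamps(paths, fetch=last_commit_timestamp, max_workers=None):
    """Map each path to its last-commit timestamp, fetching in parallel.

    Intended to be called once from on_files, when the full file list is known,
    instead of shelling out per page in on_page_markdown.
    """
    if max_workers is None:
        # Mirror the PR's choice: cap the pool at min(10, reported CPUs).
        max_workers = min(10, os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(fetch, paths)))
```

Threads (rather than processes) are a reasonable fit here because each worker spends its time blocked on a git subprocess, not holding the GIL; the `fetch` parameter is only there to make the parallel mapping testable without a git repo.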