-
Notifications
You must be signed in to change notification settings - Fork 10.3k
fix(gatsby): persist pages between runs #28590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f6c1a86 to
aa421fe
Compare
| const nodeShardsScenarios = [ | ||
| { | ||
| numberOfNodes: 50, | ||
| simulatedNodeObjectSize: 5 * 1024 * 1024, | ||
| expectedNumberOfNodeShards: 1, | ||
| }, | ||
| { | ||
| numberOfNodes: 5, | ||
| simulatedNodeObjectSize: 0.6 * 1024 * 1024 * 1024, | ||
| expectedNumberOfNodeShards: 3, | ||
| }, | ||
| ] | ||
| const pageShardsScenarios = [ | ||
| { | ||
| numberOfPages: 50, | ||
| simulatedPageObjectSize: 10 * 1024 * 1024, | ||
| expectedNumberOfPageShards: 1, | ||
| }, | ||
| { | ||
| numberOfPages: 5, | ||
| simulatedPageObjectSize: 0.9 * 1024 * 1024 * 1024, | ||
| expectedNumberOfPageShards: 5, | ||
| }, | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to construct scenario matrix - and we check each combination. I didn't go over the board - primary purpose of this is that sharding for pages and nodes is separate and that one doesn't affect the other.
Something to think about is wether we want to add some edge cases here - like when number of nodes or pages is equal 0 - for now implementation for those is that we discard persisted state when either is 0 - it does make a whole lot of sense for nodes as core creates some nodes of their own even if user doesn't use any source plugins. It's a bit different for pages - gatsby doesn't create pages (well except for gatsby develop creating dev-404-page ...). I am not sure if it's worth handling this case more gracefully? Do we want to optimize for scenarios where there are 0 pages?
packages/gatsby/src/redux/persist.ts
Outdated
|
|
||
| const pages: Array<[string, IGatsbyPage]> = [].concat(...pagesChunks) | ||
|
|
||
| if (!pagesChunks.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially problematic - this will discard state if there are 0 pages - do we want to optimize for those cases? (I don't even know if gatsby actually "works" if there are 0 pages TBH)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When way to go about this is to add pagesCount to redux.rest part and compare it I guess? Cache will get invalidated anyway after upgrade due to pagesChunk.length being 0 so I think cache shape "version" could be handled gracefully
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there were no pages then it's probably no big deal to bust the cache anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per off-thread discussion. I think you can merge this PR in a patch without this check, if you wanted to and then add the check in a minor bump later.
packages/gatsby/src/redux/persist.ts
Outdated
|
|
||
| const pages: Array<[string, IGatsbyPage]> = [].concat(...pagesChunks) | ||
|
|
||
| if (!pagesChunks.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per off-thread discussion. I think you can merge this PR in a patch without this check, if you wanted to and then add the check in a minor bump later.
|
Additional note on this - this needs additional handling done for TODO item: garbage collect stateful pages in bootstrap |
65f3f72 to
eb6689b
Compare
|
If merged, we should revert #29431 |
…for updatedAt and snapshot testing)
…s per test scenario
eb6689b to
619041c
Compare
LekoArts
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| if (maxSize > 500000) { | ||
| if (showMaxSizeWarning && maxSize > 500000) { | ||
| report.warn( | ||
| `The size of at least one page context chunk exceeded 500kb, which could lead to degraded performance. Consider putting less data in the page context.` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked about adding information about what page exceeds this limit, but that's for a follow-up (not in this PR)
LekoArts
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🌮
| currentPages.forEach(page => { | ||
| if ( | ||
| !page.isCreatedByStatefulCreatePages && | ||
| (shouldRunCreatePagesStatefully || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for commenting this issue but I'm trying to understand how the createPagesStatefully works as no detailed docs are provided.
What I understood is that GC is always done on pages and we need to call createPage in every incremental build.
As the name suggests, createPagesStatefully should prevent that, giving the page the "statefully" meaning that you have to manage the state of the page.
I've tried using this api on my project but GC is always done. Pages built by calling createPage in createPagesStatefully get deleted in incrememental builds if I don't regulary recreate them in createPagesStatefully.
After looking at the code i figured out that shouldRunCreatePagesStatefully is always true in "gatsby build" so even if !page.isCreatedByStatefulCreatePages returns false the OR condition allows this code to always delete pages if not updated.
Is this correct? What is the real purpose of createPagesStatefully then?
Thank you for your time.
Description
Approach in #28351 turned out to be unfeasble - we do need entire page object actually to preserve backward compatiblity for
DELETE_PAGEaction.To handle potential OOM issues when persisting state, I expanded state sharding during persistance from just
nodesto bothnodesand now persistedpages. Right now it's copied & pasted sharding done for nodes - I could have DRY it (I still can if reviewers will request it :) ), but didn't think DRYing it is worth adding mental overhead for added abstractions - thoughts?This PR build on work already done by @redabacha (in #28316) borrowing test change (unskip) already done (in #28351).
TODO:
Related Issues
fixes #28281
fixes #26520
[ch19791]