-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Reduce number of jump-stubs on ARM64 via smaller preserved space #63842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much of this range gets consumed in a real-world ASP.NET app?
Note that we map all sorts of stuff into this range, including R2R images. I would expect that 128MB gets exhausted fairly quickly given how things work today.
I agree that this fix works well for micro-benchmarks that are very unlikely to exhaust the 128MB range.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will try to investigate, I guess the idea that the hot code (tier1) should better be closer to VM's FCalls by default?
For large apps I guess we still want to try to address your suggestion with emitting direct addresses without jump-stubs in tier1 #62302 (I have a draft)
For R2R-only it can be improved by pgo + method sorting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think R2R code today benefits from being close to coreclr as all 'external' calls/branches have to go through indirection cells anyway. This may have been different in the days of fragile ngen?
Please correct me if I'm wrong @jkotas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think it is just for large apps. With TC enabled, all managed->managed method calls go through a precode that has the exact same instructions as jump stub and so it will introduce similar bottleneck as what you have identified.
R2R images are generally smaller than 128MB. You can only sort within the image, so the sorting won't help with jump stubs. (Sorting within image is still good for locality.)
Also, once we get this all fixed, we may want to look at retuning the inliner. My feeling is that the inliner expands the code too much these days. Some of it may be just compensating for the extra method call overhead that we are paying today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calls from runtime generated stubs and JITed code to R2R code still benefit from the two being close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these not go through an indirection when tiering is enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - when tiering is enabled. No - when tiering is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll start from your suggestion to emit direct calls for T1 Caller calls T1 Callee (not as part of this PR)