
Optimization and cleanup in parse stream #18

Merged
c42f merged 11 commits into main from cjf/optimize-token-lookahead
Mar 4, 2022


c42f commented Mar 4, 2022

This is a series of cleanup/optimization commits which:

  • Reduce the size of data structures in ParseStream
  • Fix various JET warnings, particularly by removing boxed closure captures and fixing EMPTY_TOKEN in Tokenize
  • Pool and reuse some data structures during parsing
  • Better kind()/flags() API + compactify SyntaxToken flags
  • Remove need for storing newline flag on SyntaxToken
  • Optimize token lookahead and lookahead buffering

c42f added 11 commits March 1, 2022 15:35
This allows whitespace to be inspected in some special cases.

Define the combination of head/kind/flags functions as a more formal API:
many syntax nodes and token types need these.

On top of this we can define various predicates such as `is_trivia` in
one place rather than having multiple definitions of these functions.
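
As a rough illustration of the idea in this commit message, the sketch below has each type implement only `head()`, with `kind()`, `flags()` and predicates such as `is_trivia` defined once on top of it. The type layouts, the `Kind` values, the flag constants and the `has_flags` helper are assumptions made up for this example; they are not the package's actual definitions.

```julia
# Hypothetical, simplified types for illustration only.
@enum Kind::UInt16 K_Identifier K_Whitespace K_NewlineWs K_Comment

const EMPTY_FLAGS = 0x0000   # UInt16 literals
const TRIVIA_FLAG = 0x0001

struct SyntaxHead
    kind::Kind
    flags::UInt16
end

struct Token
    head::SyntaxHead
    last_byte::UInt32
end

struct Node
    head::SyntaxHead
    children::Vector{Node}
end

# Each type only has to say how to reach its head.
head(h::SyntaxHead) = h
head(t::Token) = t.head
head(n::Node) = n.head

# Generic accessors, written once for anything that implements head().
kind(x) = head(x).kind
flags(x) = head(x).flags

# Predicates are also written once, in terms of kind() and flags().
has_flags(x, f) = (flags(x) & f) == f
is_trivia(x) = has_flags(x, TRIVIA_FLAG)
is_whitespace(x) = kind(x) in (K_Whitespace, K_NewlineWs)

# Usage: the same predicates work on tokens and on tree nodes.
tok = Token(SyntaxHead(K_Whitespace, TRIVIA_FLAG), 3)
node = Node(SyntaxHead(K_Identifier, EMPTY_FLAGS), Node[])
```

With this shape, adding a new token or node type means adding one `head` method rather than re-implementing every predicate.
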

The token_type() function no longer exists.

* Use Ref to avoid triggering boxing of captures via closure variable
  assignment.
* Use let blocks for temporary ParseState transitions to avoid some
  closures.
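
The two techniques in this list can be sketched roughly as follows. In Julia, a local variable that is reassigned inside a closure is stored in a `Core.Box`, which hurts type inference; holding the value in a `Ref` and mutating its contents avoids the reassignment. The functions and the one-field `ParseState` below are hypothetical stand-ins written for this example, not code from the PR.

```julia
# Reassigning a captured variable inside a closure forces Julia to box it.
function count_evens_boxed(xs)
    n = 0
    foreach(xs) do x
        if iseven(x)
            n += 1        # rebinds the captured `n`, so `n` becomes a Core.Box
        end
    end
    return n
end

# Using a Ref: the binding `n` is never reassigned, only its contents change.
function count_evens_ref(xs)
    n = Ref(0)
    foreach(xs) do x
        if iseven(x)
            n[] += 1      # mutates the Ref; no boxing of the capture
        end
    end
    return n[]
end

# Hypothetical, much simplified parser state (the field is an assumption).
struct ParseState
    space_sensitive::Bool
end

with_space_sensitive(ps::ParseState) = ParseState(true)

# A let block gives a temporary, shadowed `ps` without creating a closure.
function parse_demo(ps::ParseState)
    let ps = with_space_sensitive(ps)
        ps.space_sensitive    # code in here sees the modified state
    end
end
```

Both counting functions return the same result; the difference shows up in `@code_warntype`, where the first reports `n` as a `Core.Box`.
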

This change implements a fast path for token lookahead in peek() and
increases the size of the lookahead buffer to make this more efficient.

Manually track an index into the lookahead buffer to avoid buffer
resizing. (Julia's builtin array actually uses the same strategy to
avoid shuffling elements in popfirst!(). But an extra layer here can
help as we know more about the data access.)
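
A minimal sketch of the buffering strategy described above, assuming a toy token stream: consuming a token only advances an index, and consumed tokens are dropped in one batch when the buffer is refilled. `TokenStream`, `lex_next!`, `refill!`, `peek_token`, `bump!` and the `BATCH` size are hypothetical names and values chosen for this example; the PR's actual `peek()` implementation may differ.

```julia
const BATCH = 8   # assumed refill size, for illustration only

mutable struct TokenStream
    source::Vector{Symbol}      # stand-in for a real lexer
    source_pos::Int
    lookahead::Vector{Symbol}   # buffered tokens
    lookahead_index::Int        # index of the next unconsumed token
end

TokenStream(source::Vector{Symbol}) = TokenStream(source, 1, Symbol[], 1)

# Stand-in lexer: yields :EOF forever once the source is exhausted.
function lex_next!(s::TokenStream)
    s.source_pos > length(s.source) && return :EOF
    tok = s.source[s.source_pos]
    s.source_pos += 1
    return tok
end

# Slow path: drop consumed tokens in one go, then lex ahead in bulk.
function refill!(s::TokenStream, n::Int)
    if s.lookahead_index > 1
        deleteat!(s.lookahead, 1:s.lookahead_index-1)
        s.lookahead_index = 1
    end
    while length(s.lookahead) < max(n, BATCH)
        push!(s.lookahead, lex_next!(s))
    end
    return nothing
end

# Fast path: a single index computation when the token is already buffered.
function peek_token(s::TokenStream, n::Int=1)
    i = s.lookahead_index + n - 1
    if i > length(s.lookahead)
        refill!(s, n)
        i = s.lookahead_index + n - 1
    end
    return s.lookahead[i]
end

# Consuming a token only advances the index; the buffer is not resized.
function bump!(s::TokenStream)
    tok = peek_token(s)
    s.lookahead_index += 1
    return tok
end
```

The design choice is that the common case (peeking at an already-buffered token) touches no array bookkeeping at all, while the cost of trimming and lexing is paid once per batch.
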

Some `@inbounds` annotations in the `peek()` hot code paths seem to provide
a few percent improvement (maybe 5%).
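
For readers unfamiliar with the annotation, here is a small sketch of how `@inbounds` is typically applied in a hot path: the index is validated once explicitly, so the array access itself can safely skip its bounds check. `peek_fast` and its signature are hypothetical and written for this example; they are not taken from the PR.

```julia
# Returns the buffered token `n` places ahead, or `nothing` if it is not
# buffered yet (in which case a caller would fall back to a slower path).
function peek_fast(buf::Vector{Symbol}, index::Int, n::Int=1)
    i = index + n - 1
    if 1 <= i <= length(buf)
        # The explicit range check above is what makes @inbounds safe here.
        return @inbounds buf[i]
    end
    return nothing
end
```

Any gain from this depends on the surrounding code and should be measured rather than assumed, in line with the tentative figure quoted above.
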

c42f commented Mar 4, 2022

Overall this seems to boost the parser's speed by around 40% :-)

c42f merged commit a25e259 into main Mar 4, 2022
c42f deleted the cjf/optimize-token-lookahead branch March 4, 2022 07:10
c42f added a commit to JuliaLang/julia that referenced this pull request Oct 17, 2025
…-token-lookahead

Optimization and cleanup in parse stream
topolarity pushed a commit to JuliaLang/julia that referenced this pull request Nov 14, 2025
…-token-lookahead

Optimization and cleanup in parse stream