-
-
Notifications
You must be signed in to change notification settings - Fork 452
Fixes for PR #1506, which adds the option to cache the grammar definition #1540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1540 +/- ##
==========================================
+ Coverage 89.90% 89.93% +0.02%
==========================================
Files 52 52
Lines 7942 7985 +43
==========================================
+ Hits 7140 7181 +41
- Misses 802 804 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds type annotations, enhances caching behavior, and enables grammar serialization.
- Added type hints to
TreeMatchermethods and fields, and improved error handling for postlex always-accept scenarios. - Introduced
cache_grammaroption inLarkOptions, enforced its coupling withcache, and updated cache file naming. - Made
Grammarserializable and extendedLark’s serialization to include the grammar whencache_grammar=True.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lark/tree_matcher.py | Added List/Dict type hints, annotated methods, and refined error paths. |
| lark/load_grammar.py | Inherited Grammar from Serialize and defined its __serialize_fields__. |
| lark/lark.py | Added cache_grammar option, validation, cache filename logic, and serialization of grammar. |
Comments suppressed due to low confidence (1)
lark/lark.py:549
- Add tests covering the new
grammarfield in serialized data to ensure deserialization viaGrammar.deserializeworks correctly whencache_grammar=True.
if 'grammar' in data:
| self._parser_cache: Dict[str, earley.Parser] = {} | ||
|
|
||
| def _build_recons_rules(self, rules): | ||
| def _build_recons_rules(self, rules: List[Rule]): |
Copilot
AI
Jul 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Add an explicit return type annotation for this generator method (e.g., -> Generator[Rule, None, None] or -> Iterator[Rule]) to improve readability and static analysis.
| self.rule_defs = rule_defs | ||
| self.ignore = ignore | ||
|
|
||
| __serialize_fields__ = 'term_defs', 'rule_defs', 'ignore' |
Copilot
AI
Jul 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Use a consistent container type for __serialize_fields__ (e.g., a list) to match the pattern in other classes and avoid confusion.
| __serialize_fields__ = 'term_defs', 'rule_defs', 'ignore' | |
| __serialize_fields__ = ['term_defs', 'rule_defs', 'ignore'] |
MegaIng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see tests before approving this - the logic isn't quite trivial enough to not have edge cases I can't think of or to not be broken in a future "unrelated" code change. Otherwise looks good.
|
@MegaIng Added tests |
Fixes for PR #1506