erezsh (Member) commented Jul 6, 2025

Fixes for PR #1506

codecov bot commented Jul 6, 2025

Codecov Report

Attention: Patch coverage is 92.98246% with 4 lines in your changes missing coverage. Please review.

Project coverage is 89.93%. Comparing base (87bb8ef) to head (2f286cc).
Report is 5 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lark/tree_matcher.py | 84.21% | 3 Missing ⚠️ |
| lark/lark.py | 90.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1540      +/-   ##
==========================================
+ Coverage   89.90%   89.93%   +0.02%     
==========================================
  Files          52       52              
  Lines        7942     7985      +43     
==========================================
+ Hits         7140     7181      +41     
- Misses        802      804       +2     

| Flag | Coverage Δ |
|---|---|
| unittests | 89.93% <92.98%> (+0.02%) ⬆️ |

Copilot AI left a comment

Pull Request Overview

This PR adds type annotations, enhances caching behavior, and enables grammar serialization.

  • Added type hints to TreeMatcher methods and fields, and improved error handling for postlex always-accept scenarios.
  • Introduced cache_grammar option in LarkOptions, enforced its coupling with cache, and updated cache file naming.
  • Made Grammar serializable and extended Lark’s serialization to include the grammar when cache_grammar=True.
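
The coupling between `cache_grammar` and `cache` described in the second bullet can be illustrated with a small validation sketch. This is not the code from the PR: the `Options` dataclass and the `ConfigurationError` class below are hypothetical stand-ins, and the assumed rule (that `cache_grammar=True` is only valid when `cache` is enabled) is inferred from the summary above.

```python
from dataclasses import dataclass
from typing import Union


class ConfigurationError(ValueError):
    """Hypothetical stand-in for the library's configuration error type."""


@dataclass
class Options:
    """Hypothetical, reduced stand-in for LarkOptions (illustration only)."""

    cache: Union[bool, str] = False   # False, True, or a cache file path
    cache_grammar: bool = False       # also store the grammar in the cache

    def __post_init__(self) -> None:
        # Assumed rule: caching the grammar makes no sense without a cache.
        if self.cache_grammar and not self.cache:
            raise ConfigurationError(
                "cache_grammar=True requires the cache option to be enabled"
            )
```

With this sketch, `Options(cache=True, cache_grammar=True)` is accepted, while `Options(cache_grammar=True)` raises at construction time, so the invalid combination is rejected before any cache file is read or written.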

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| lark/tree_matcher.py | Added List/Dict type hints, annotated methods, and refined error paths. |
| lark/load_grammar.py | Inherited Grammar from Serialize and defined its __serialize_fields__. |
| lark/lark.py | Added cache_grammar option, validation, cache filename logic, and serialization of grammar. |

Comments suppressed due to low confidence (1)

lark/lark.py:549

  • Add tests covering the new grammar field in serialized data to ensure deserialization via Grammar.deserialize works correctly when cache_grammar=True.
        if 'grammar' in data:

  self._parser_cache: Dict[str, earley.Parser] = {}

- def _build_recons_rules(self, rules):
+ def _build_recons_rules(self, rules: List[Rule]):

Copilot AI Jul 6, 2025

[nitpick] Add an explicit return type annotation for this generator method (e.g., -> Generator[Rule, None, None] or -> Iterator[Rule]) to improve readability and static analysis.
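
To make the suggestion concrete, here is a minimal, self-contained sketch of a generator function annotated with `-> Iterator[Rule]`. The `Rule` type and the filtering logic are hypothetical stand-ins and do not reproduce what `_build_recons_rules` does in lark; only the shape of the annotation is the point.

```python
from typing import Iterator, List, NamedTuple, Tuple


class Rule(NamedTuple):
    """Hypothetical stand-in for lark's Rule (illustration only)."""

    origin: str
    expansion: Tuple[str, ...]


def build_recons_rules(rules: List[Rule]) -> Iterator[Rule]:
    """Yield only the rules that have a non-empty expansion.

    Because the body contains `yield`, calling this function returns a
    generator, which satisfies the Iterator[Rule] annotation.
    """
    for rule in rules:
        if rule.expansion:
            yield rule
```

For a generator that only yields values and neither receives sent values nor returns a result, `Iterator[Rule]` and `Generator[Rule, None, None]` describe the same behavior to a type checker; the shorter form is usually enough.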

self.rule_defs = rule_defs
self.ignore = ignore

__serialize_fields__ = 'term_defs', 'rule_defs', 'ignore'

Copilot AI Jul 6, 2025

[nitpick] Use a consistent container type for __serialize_fields__ (e.g., a list) to match the pattern in other classes and avoid confusion.

Suggested change
- __serialize_fields__ = 'term_defs', 'rule_defs', 'ignore'
+ __serialize_fields__ = ['term_defs', 'rule_defs', 'ignore']
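
As background for this nitpick: a bare comma-separated sequence of strings in Python builds a tuple, so both spellings are iterable containers of field names. The sketch below uses a hypothetical, simplified `Serialize` mixin (not lark's actual class) to show that a field-driven serializer that only iterates over the names behaves the same with either container, which suggests the change is about consistency rather than behavior.

```python
from typing import Any, Dict, Sequence


class Serialize:
    """Hypothetical, simplified mixin; not lark's actual Serialize class."""

    __serialize_fields__: Sequence[str] = ()

    def serialize(self) -> Dict[str, Any]:
        # Only iterates over the field names, so tuple and list both work.
        return {name: getattr(self, name) for name in self.__serialize_fields__}

    @classmethod
    def deserialize(cls, data: Dict[str, Any]) -> "Serialize":
        obj = cls.__new__(cls)  # bypass __init__, restore fields directly
        for name in cls.__serialize_fields__:
            setattr(obj, name, data[name])
        return obj


class GrammarTuple(Serialize):
    # Bare commas build a tuple: ('term_defs', 'rule_defs', 'ignore')
    __serialize_fields__ = 'term_defs', 'rule_defs', 'ignore'

    def __init__(self, term_defs, rule_defs, ignore):
        self.term_defs = term_defs
        self.rule_defs = rule_defs
        self.ignore = ignore


class GrammarList(GrammarTuple):
    # Same field names, stored in a list as in the suggested change
    __serialize_fields__ = ['term_defs', 'rule_defs', 'ignore']
```

A round trip such as `GrammarList.deserialize(GrammarList(...).serialize())` restores the same three attributes in both variants of this sketch.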

erezsh changed the title from "Fixes for PR #1506" to "Fixes for PR #1506, which adds the option to cache the grammar definition" on Jul 6, 2025
MegaIng (Member) left a comment

I would like to see tests before approving this. The logic is nontrivial enough that it may have edge cases I can't think of, or be broken by a future "unrelated" code change. Otherwise looks good.

erezsh (Member, Author) commented Jul 13, 2025

@MegaIng Added tests

NasalDaemon and others added 4 commits July 13, 2025 14:01
* Serialize Lark.grammar

* Lark option: cache_grammar = False

* Add documentation and error message to Reconstructor

* Move parser.grammar check deeper into TreeMatcher
@erezsh erezsh merged commit c2e2048 into master Jul 13, 2025
23 checks passed

4 participants