feat: Differentiate single and multi valued tag operations. #104
feat: Differentiate single and multi valued tag operations. #104SharafMohamed merged 9 commits intoy-scope:mainfrom
Conversation
…; Add TaggedTransition.hpp to cmake.
…; Delete TaggedTransition file thats not used anymore.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis pull request updates several components of the NFA construction and testing workflow. In particular, method signatures and constructor calls in the NFA-related classes are modified to include new boolean parameters—such as for handling multi-valued transitions and repetition contexts. Additionally, a previously existing file containing tagged transition definitions has been removed. Test cases are refactored to use a common testing function, and a subproject commit reference is updated. Changes
Sequence Diagram(s)sequenceDiagram
participant LR as LexicalRule
participant RA as RegexAST
participant N as Nfa
participant NS as NfaState
participant TO as TagOperation
LR->>RA: add_to_nfa(nfa, end_state, false)
Note right of RA: Determines repetition context via descendent_of_repetition flag
RA->>N: add_to_nfa_with_negative_captures(nfa, end_state, flag)
N->>NS: Create new state (using multi_valued flag)
NS->>TO: add_spontaneous_transition(..., multi_valued)
Possibly related PRs
Suggested reviewers
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/log_surgeon/finite_automata/TagOperation.hpp (1)
43-52: Improved switch statement implementationThe code now directly assigns values to
type_charinstead of using a default case, which improves code clarity.However, you should consider using
false == m_negative_captures.empty()instead of!m_negative_captures.empty()to match the coding style used elsewhere in the codebase.src/log_surgeon/finite_automata/NfaState.hpp (1)
73-73: Updated TagOperation constructor callThe
TagOperationconstructor call now includes themulti_valuedparameter, correctly propagating the multi-valued state to the tag operations.Per your coding guidelines, consider using
false == stack.empty()instead of!stack.empty()on line 187 to maintain consistency in the codebase.src/log_surgeon/finite_automata/RegexAST.hpp (1)
74-85: Docstring clarifies repetition context
The expanded explanation for how repetition affects node processing is valuable. Consider using “descendant” (with an 'a') for maximum clarity.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
src/log_surgeon/LexicalRule.hpp(1 hunks)src/log_surgeon/finite_automata/Nfa.hpp(4 hunks)src/log_surgeon/finite_automata/NfaState.hpp(2 hunks)src/log_surgeon/finite_automata/RegexAST.hpp(18 hunks)src/log_surgeon/finite_automata/TagOperation.hpp(2 hunks)src/log_surgeon/finite_automata/TaggedTransition.hpp(0 hunks)tests/test-nfa.cpp(3 hunks)tools/yscope-dev-utils(1 hunks)
💤 Files with no reviewable changes (1)
- src/log_surgeon/finite_automata/TaggedTransition.hpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
tests/test-nfa.cppsrc/log_surgeon/LexicalRule.hppsrc/log_surgeon/finite_automata/TagOperation.hppsrc/log_surgeon/finite_automata/Nfa.hppsrc/log_surgeon/finite_automata/NfaState.hppsrc/log_surgeon/finite_automata/RegexAST.hpp
🧬 Code Graph Analysis (2)
src/log_surgeon/LexicalRule.hpp (1)
src/log_surgeon/finite_automata/RegexAST.hpp (10)
nfa(86-90)nfa(133-150)nfa(133-137)nfa(254-258)nfa(305-309)nfa(411-415)nfa(496-500)nfa(557-561)nfa(619-623)nfa(715-719)
src/log_surgeon/finite_automata/Nfa.hpp (1)
src/log_surgeon/finite_automata/NfaState.hpp (2)
dest_state(60-62)dest_state(60-60)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build (ubuntu-22.04, Release)
🔇 Additional comments (36)
tools/yscope-dev-utils (1)
1-1: Subproject Commit Identifier UpdatedThe commit identifier has been updated to
a2a363490690cd3728009859828caa2f742c636f. This change is a straightforward update to the subproject reference and does not affect the functionality of the tool.src/log_surgeon/LexicalRule.hpp (1)
47-47: Added missing parameter to function callThe addition of a third parameter (
false) to theadd_to_nfa_with_negative_capturesfunction call correctly implements the differentiation between single-valued and multi-valued tags. This change indicates that tags created during this operation are not descendants of a repetition and should be treated as single-valued.tests/test-nfa.cpp (5)
24-62: Good refactoring: created a test helper functionGreat job extracting common test logic into a reusable function. This refactoring improves code maintainability by reducing duplication and making it easier to add new test cases in the future.
64-83: Test case refactored appropriatelyThe test case has been successfully refactored to use the new
test_nfahelper function while maintaining the same test logic and expected outputs.
85-114: Complex test case refactored appropriatelyThe complex test case has been successfully refactored to use the new
test_nfahelper function without changing the test's intent or validation logic.
116-131: New test case for simple repetitionExcellent addition of a test case for simple repetition in regex patterns. This test validates the new multi-valued tag functionality when dealing with repetition operators (
*and+). Notice how the serialized output includes operations marked with+to indicate multi-valued tags.
133-149: New test case for complex repetitionGreat addition of a test case for complex repetition (nested repetition) which further validates the multi-valued tag functionality in more complex scenarios. This test provides good coverage for the new feature.
src/log_surgeon/finite_automata/TagOperation.hpp (3)
20-23: Constructor updated for multi-valued supportThe constructor has been appropriately updated to include the new
multi_valuedparameter, which aligns with the PR objectives of differentiating between single-valued and multi-valued tags.
37-37: Added accessor for multi-valued propertyGood addition of a getter method to access the multi-valued state of the tag operation. This follows good encapsulation practices.
58-58: Added new member variable for multi-valued functionalityThe new member variable
m_multi_valuedhas been added to store the multi-valued state of the tag operation, which is required for the new functionality.src/log_surgeon/finite_automata/NfaState.hpp (3)
53-54: Constructor parameter added for multi-valued supportThe constructor now accepts an additional parameter
multi_valuedto support the differentiation between single-valued and multi-valued tags. This aligns with the PR objectives.
57-57: Updated function call with multi-valued parameterThe call to
add_spontaneous_transitionhas been correctly updated to pass themulti_valuedparameter, ensuring consistent behavior throughout the class.
67-68: Method signature updated for multi-valued supportThe
add_spontaneous_transitionmethod has been updated to include themulti_valuedparameter. This change is consistent with the overall implementation of the multi-valued tag feature.src/log_surgeon/finite_automata/Nfa.hpp (7)
56-56: Documentation update looks good
The explanation for the new parameter is concise and aligns well with the existing style.
62-64: Multi-valued parameter introduction
This signature addition is coherent. Please confirm that all call sites referencing this function are updated accordingly.
69-69: Straightforward parameter specification
Themulti_valueddoc comment is clear and consistent with the rest of the documentation.
167-168: Validate the multi_valued usage for negative captures
Ensuring negative captures can also be multi-valued is crucial. Please confirm that tests account for multi-valued negative capture scenarios.Also applies to: 181-182
190-191: Extension of signature for positive captures
Introducingmulti_valuedfor paralleling negative captures is logical.
193-193: Inclusion of multi_valued in spontaneous transition calls
Propagatingmulti_valuedto transitions looks properly integrated. This enhances flexibility for capture tracking.Also applies to: 195-200
205-206: Uniform parameter usage in second state
Applyingmulti_valuedto the end state creation maintains consistency with the start state. Nicely handled.src/log_surgeon/finite_automata/RegexAST.hpp (16)
86-90: Function signature updated for repetition context
The new boolean parameter is properly introduced. Please verify that all derived classes implement it consistently.
120-132: Enhanced documentation for negative captures under repetition
These clarifications help in understanding how negative captures behave when repeated.
133-137: Extended documentation on add_to_nfa_with_negative_captures
Explaining repetition usage with negative captures aids in maintenance and clarity.
141-145: Ensuring correctness in new_state_from_negative_captures usage
Passingdescendent_of_repetitionhere matches the revised design. Looks good.
146-146: Fallback call to add_to_nfa
Continuing to the originaladd_to_nfamethod when there are no negative captures is correct and does not break existing flow.Also applies to: 148-148
212-214: add_to_nfa signature for RegexASTEmpty
No additional repetition-specific logic is needed here, which is consistent with an empty node’s behaviour.
254-258: RegexASTLiteral repetition parameter introduced
Handling a literal with the repetition flag does not pose issues given the current implementation.
305-309: Unsupported integer AST handling
Throwing an exception for integer AST nodes is in line with prior restrictions. No problems detected.
412-415: Extended signature for group node
Even though the code here does not overtly use the repetition flag, adding the parameter retains consistency.
497-500: Handling repetition in OR logic
The updated approach for negative captures under repetition is sound.
557-561: Concatenate node extended for repetition
Saving and restoringrootwhile introducing an intermediate state is a well-structured approach.
619-623: Repeat operator logic with descendent_of_repetition set
The code consistently passes repetition context through the multiplication logic.
715-719: Capture node updated to handle repetition
Forwarding the repetition flag into the capture’s NFA logic aligns well with the rest of the codebase.
930-934: Signature adjustment for capture AST
Introducing the repetition parameter ensures uniformity throughout all AST nodes.
966-970: Invoking new_start_and_end_states_from_positive_capture with repetition
The usage ofdescendent_of_repetitionfor multi_valued capturing is consistent with the code design.
975-976: Ensuring repetition flag is passed to nested capture AST
Chaining the repetition context down to nested calls maintains correctness for deeper captures.
Description
Previously all tags were considered to be multi-valued. This has performance implications as a fresh prefix tree needs to be created for every token during lexing:
Also performed some cleanup to the existing docstrings and consolidated repeated code in relavent unit-tests.
Validation performed
Summary by CodeRabbit