refactor: Create accepting NFA states using RAII. by SharafMohamed · Pull Request #84 · y-scope/log-surgeon

SharafMohamed · 2025-02-03T17:17:44Z

Description

Instead of creating the NFA state and setting the accepting state parameters, everything is set during state creation for accepting states.
Ideally we could apply this to all fields for an NFA, and then we could create const NFA objects, but the transitions aren't known during state creation with the current code flow.

Validation performed

Existing tests pass.

Summary by CodeRabbit

New Features
- Implemented a simplified approach for creating acceptance states, contributing to a more consistent pattern evaluation process.
- Enhanced error handling during serialization to improve reliability.
Refactor
- Consolidated multiple state-creation steps into single streamlined operations.
- Redesigned transition handling using a new spontaneous model, optimising overall internal performance and maintainability.

coderabbitai · 2025-02-03T17:17:57Z

Walkthrough

The changes refactor state creation and transition handling in the NFA system. The LexicalRule class now calls a streamlined new_accepting_state method instead of multiple separate calls. In the NFA, new methods were introduced and existing methods renamed or removed to better manage accepting states and transitions. Additionally, the NFA state's API was restructured to use spontaneous transitions. Serialization now returns an optional string to indicate potential failures.

Changes

File(s)	Change Summary
`src/log_surgeon/LexicalRule.hpp`	Revised `add_to_nfa` to replace multiple state creation calls with a single call to `nfa->new_accepting_state(m_variable_id)`. The negative capture logic remains unchanged.
`src/log_surgeon/finite_automata/Nfa.hpp` and `src/log_surgeon/finite_automata/NfaState.hpp`	Nfa.hpp: Added `new_accepting_state`, renamed and removed methods to streamline state creation, and updated `serialize` to return `std::optional<std::string>`. NfaState.hpp: Updated constructors, introduced spontaneous transitions, and revised state property accessors and transition methods.

Sequence Diagram(s)

sequenceDiagram
  participant LR as LexicalRule
  participant NFA as Nfa
  participant NS as NfaState

  LR->>NFA: add_to_nfa(nfa)
  NFA->>NS: new_accepting_state(m_variable_id)
  NS-->>NFA: Return new state
  NFA-->>LR: NFA updated with accepting state

Possibly related PRs

refactor: Update NFA and DFA headers to align with the latest coding guidelines. #58: The changes in the main PR are related to the introduction of the new_accepting_state method in the Nfa class, which is directly utilized in the modified add_to_nfa method of the LexicalRule class, indicating a strong connection at the code level.
refactor: Move epsilon_closure from Lexer to NfaState; Refactor nfa_to_dfa as Dfa's constructor. #71: The changes in the main PR are related to the introduction of the new_accepting_state method in the Nfa class, which is directly utilized in the modified add_to_nfa method of the LexicalRule class, while the retrieved PR involves moving the epsilon_closure method to the NfaState class, indicating a connection through the NFA's state management.
Constrain NFA states to contain only one negative transition. #46: The changes in the main PR, specifically the introduction of the new_accepting_state method in the Nfa class, are related to the modifications in the retrieved PR that focus on restructuring negative transitions in NFA states, as both involve significant alterations to state management within the NFA.

Suggested reviewers

LinZhihao-723

✨ Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

…hat result from the change.

kirkrodrigues · 2025-02-13T18:03:17Z

@coderabbitai configuration

coderabbitai · 2025-02-13T18:03:31Z

CodeRabbit configuration

language: en-CA
tone_instructions: ''
early_access: false
enable_free_tier: true
reviews:
  profile: chill
  request_changes_workflow: false
  high_level_summary: true
  high_level_summary_placeholder: '@coderabbitai summary'
  high_level_summary_in_walkthrough: false
  auto_title_placeholder: '@coderabbitai'
  auto_title_instructions: ''
  review_status: true
  commit_status: true
  fail_commit_status: false
  collapse_walkthrough: false
  changed_files_summary: true
  sequence_diagrams: true
  assess_linked_issues: true
  related_issues: true
  related_prs: true
  suggested_labels: true
  auto_apply_labels: false
  suggested_reviewers: true
  poem: false
  labeling_instructions: []
  path_filters: []
  path_instructions:
    - path: '**/*.{cpp,hpp,java,js,jsx,ts,tsx}'
      instructions: '- Prefer `false == <expression>` rather than `!<expression>`.'
  abort_on_close: true
  auto_review:
    enabled: true
    auto_incremental_review: true
    ignore_title_keywords: []
    labels: []
    drafts: false
    base_branches: []
  finishing_touches:
    docstrings:
      enabled: true
  tools:
    shellcheck:
      enabled: true
    ruff:
      enabled: true
    markdownlint:
      enabled: true
    github-checks:
      enabled: true
      timeout_ms: 90000
    languagetool:
      enabled: true
      enabled_only: false
      level: default
      enabled_rules: []
      disabled_rules: []
      enabled_categories: []
      disabled_categories: []
    biome:
      enabled: true
    hadolint:
      enabled: true
    swiftlint:
      enabled: true
    phpstan:
      enabled: true
      level: default
    golangci-lint:
      enabled: true
    yamllint:
      enabled: true
    gitleaks:
      enabled: true
    checkov:
      enabled: true
    detekt:
      enabled: true
    eslint:
      enabled: true
    rubocop:
      enabled: true
    buf:
      enabled: true
    regal:
      enabled: true
    actionlint:
      enabled: true
    pmd:
      enabled: true
    cppcheck:
      enabled: true
    semgrep:
      enabled: true
    circleci:
      enabled: true
    ast-grep:
      packages: []
      rule_dirs: []
      util_dirs: []
      essential_rules: true
chat:
  auto_reply: true
  integrations:
    jira:
      usage: auto
    linear:
      usage: auto
knowledge_base:
  opt_out: false
  web_search:
    enabled: true
  learnings:
    scope: auto
  issues:
    scope: auto
  jira:
    usage: auto
    project_keys: []
  linear:
    usage: auto
    team_keys: []
  pull_requests:
    scope: auto

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

CMakeLists.txt (1)
1-1: ⚠️ Potential issue

Update CMake minimum version to ensure future compatibility.

The current minimum version (3.5.1) will be incompatible with future CMake versions.

Apply this diff to update the minimum version:
-cmake_minimum_required(VERSION 3.5.1)
+cmake_minimum_required(VERSION 3.10)
🧰 Tools

🪛 GitHub Actions: Build

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

🧹 Nitpick comments (4)

src/log_surgeon/finite_automata/NfaState.hpp (1)
92-92: Returning a bool by const reference may be unnecessary.
It is typically more common to return a simple bool by value. Returning bool const& can be confusing and rarely provides performance benefits.
- [[nodiscard]] auto is_accepting() const -> bool const& { return m_accepting; }
+ [[nodiscard]] auto is_accepting() const -> bool { return m_accepting; }
src/log_surgeon/finite_automata/TagOperation.hpp (1)
40-49: Add a default case to handle future enum values.

The switch statement should include a default case to handle any future additions to the TagOperationType enum.
     [[nodiscard]] auto serialize() const -> std::string {
         switch (m_type) {
             case TagOperationType::Set:
                 return fmt::format("{}{}", m_tag_id, "p");
             case TagOperationType::Negate:
                 return fmt::format("{}{}", m_tag_id, "n");
             case TagOperationType::None:
                 return "none";
+            default:
+                throw std::runtime_error("Unknown TagOperationType");
         }
     }
src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1)
53-55: Consider a more idiomatic approach while maintaining the guideline.

While following the guideline for false == comparison, the code could be more idiomatic.
-    if (false == state_ids.contains(m_dest_state)) {
+    if (false == state_ids.contains(m_dest_state) || nullptr == m_dest_state) {
         return std::nullopt;
     }
tests/test-nfa.cpp (1)

24-81: Consider adding test cases for failure scenarios.

The test thoroughly covers the success path, but it would be beneficial to add test cases that verify:

Invalid NFA states

Serialization failures

Edge cases in the NFA structure

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6904e51 and 483bff2.

📒 Files selected for processing (11)

CMakeLists.txt (1 hunks)
src/log_surgeon/Lexer.hpp (1 hunks)
src/log_surgeon/LexicalRule.hpp (1 hunks)
src/log_surgeon/finite_automata/Dfa.hpp (1 hunks)
src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)
src/log_surgeon/finite_automata/NfaState.hpp (6 hunks)
src/log_surgeon/finite_automata/RegexAST.hpp (4 hunks)
src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1 hunks)
src/log_surgeon/finite_automata/TagOperation.hpp (1 hunks)
tests/CMakeLists.txt (1 hunks)
tests/test-nfa.cpp (2 hunks)

✅ Files skipped from review due to trivial changes (2)

src/log_surgeon/Lexer.hpp
src/log_surgeon/finite_automata/Dfa.hpp

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/LexicalRule.hpp
src/log_surgeon/finite_automata/TagOperation.hpp
src/log_surgeon/finite_automata/Nfa.hpp
src/log_surgeon/finite_automata/RegexAST.hpp
src/log_surgeon/finite_automata/SpontaneousTransition.hpp
tests/test-nfa.cpp
src/log_surgeon/finite_automata/NfaState.hpp

🧠 Learnings (4)

CMakeLists.txt (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.

tests/CMakeLists.txt (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.

src/log_surgeon/finite_automata/RegexAST.hpp (2)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexAST.hpp:700-700
Timestamp: 2024-11-13T22:38:19.472Z
Learning: In `RegexASTCapture`, `m_tag` must always be non-null.

src/log_surgeon/finite_automata/NfaState.hpp (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

🪛 GitHub Actions: Build

CMakeLists.txt

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

tests/CMakeLists.txt

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

src/log_surgeon/finite_automata/Nfa.hpp

[error] 141-141: Out-of-line definition of 'new_accepting_state' does not match any declaration in 'Nfa'.

🔇 Additional comments (19)

src/log_surgeon/finite_automata/NfaState.hpp (7)

4-4: No issues with the newly added headers.
They appear necessary for the introduced structures and functions, and they conform to typical practice.

Also applies to: 9-9, 13-13, 15-15, 18-18, 21-21, 24-24

40-40: Instantiation of default constructor seems fine.
No issues found.

42-44: Accepting state constructor is clearly defined.
This addition cleanly sets the state to accepting and the matching variable ID as needed.

46-52: Constructor with TagOperation logic is coherent.
It creates a new state and delegates to add_spontaneous_transition(...). Implementation looks correct.

54-65: Spontaneous transition addition is well-structured.
Reserving capacity for tag_ops and then moving into m_spontaneous_transitions is efficient and clear.

178-179: Epsilon closure logic is correct.
Pushing spontaneous transitions onto the stack properly ensures correct closure computation.

201-208: Optional check ensures safe serialization flow.
Returning std::nullopt early prevents partial or invalid output.

src/log_surgeon/finite_automata/Nfa.hpp (3)

6-6: Included headers look appropriate.
No redundant elements detected.

Also applies to: 15-15, 18-18, 20-20

206-208: Spontaneous transitions handled in BFS.
This complements the new transition mechanism nicely. No issues found.

224-229: Serializing states with optional return is robust.
The std::nullopt check is a sound approach for failure signalling.

src/log_surgeon/finite_automata/RegexAST.hpp (3)

24-24: New include for TagOperation is justified.
It corresponds with the usage of TagOperationType references.

114-127: Revised logic for negative captures is coherent.
Switching to spontaneous transitions aligns with the updated NFA model.

910-949: Capture group NFA construction is logically consistent.
The commented diagram clarifies spontaneous transitions for start and end tags. No immediate issues.

src/log_surgeon/LexicalRule.hpp (1)

43-46: Great use of RAII principles!

The refactoring improves state initialization by combining state creation and accepting state setup into a single operation, making the code more robust and less error-prone.

src/log_surgeon/finite_automata/TagOperation.hpp (1)

13-17: Well-structured enum class definition!

The use of an enum class with uint8_t as the underlying type is a good choice for type safety and memory efficiency.

src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1)

23-31: Well-designed class with clear constructors!

The class design is clean with good use of move semantics in the constructor taking tag operations.

tests/test-nfa.cpp (1)

64-68: Good improvement in error handling!

The change to handle optional serialization result makes the test more robust by explicitly checking for successful serialization.
tests/CMakeLists.txt (1)

12-14:
Details

❓ Verification inconclusive

LGTM! Verify test coverage for the new transition model.

The changes reflect the transition from tagged to spontaneous transitions, aligning with the PR objectives.

Run the following script to verify test coverage for the new transition model:

🏁 Script executed:
#!/bin/bash
# Description: Check test coverage for the new transition model.

# Test: Search for test cases using SpontaneousTransition
rg -A 5 'SpontaneousTransition' test-nfa.cpp

# Test: Search for test cases using TagOperation
rg -A 5 'TagOperation' test-nfa.cpp
Length of output: 186
Action: Verify Comprehensive Test Coverage for the New Transition Model

The changes in tests/CMakeLists.txt correctly reflect the PR objectives by updating source file references from the tagged to the spontaneous transition model. However, the initial test coverage script targeted a non-existent file (test-nfa.cpp), as confirmed by the error output. This makes it unclear whether tests using SpontaneousTransition and TagOperation exist elsewhere in the repository.
Next Steps:
Manually verify that test cases covering the new transition model exist.
Alternatively, update the test coverage script to search across the entire tests directory. For instance:
#!/bin/bash
# Re-verify test coverage by searching all test files in the tests directory
fd -e cpp tests | xargs rg -A 5 'SpontaneousTransition'
fd -e cpp tests | xargs rg -A 5 'TagOperation'
Please ensure that the new transition model is fully exercised by existing tests or add appropriate tests if necessary.
CMakeLists.txt (1)

81-83: LGTM! Source files updated to reflect the new transition model.

The changes align with the PR objectives to refactor NFA states using RAII.

src/log_surgeon/finite_automata/Nfa.hpp

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/log_surgeon/finite_automata/Nfa.hpp (1)
187-217: Consider error handling in traversal order.

The get_bfs_traversal_order() method assumes all operations will succeed. Since serialize() now handles failures, consider propagating potential errors from this method as well.
-auto Nfa<TypedNfaState>::get_bfs_traversal_order() const -> std::vector<TypedNfaState const*> {
+auto Nfa<TypedNfaState>::get_bfs_traversal_order() const -> std::optional<std::vector<TypedNfaState const*>> {
     // ... existing implementation ...
-    return visited_order;
+    return std::make_optional(visited_order);

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 483bff2 and 4a00bdc.

📒 Files selected for processing (1)

src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/Nfa.hpp

🔇 Additional comments (5)

src/log_surgeon/finite_automata/Nfa.hpp (5)

6-6: LGTM! Well-organized header inclusions.

The new headers are properly organized following the standard convention: standard library headers first, followed by third-party headers, and then project-specific headers.

Also applies to: 15-15, 18-20

46-50: LGTM! Declaration matches previous review comment.

The declaration of new_accepting_state has been added as suggested in the previous review, resolving the build error.

87-87: LGTM! Improved error handling with std::optional.

The change to return std::optional<std::string> from serialize() allows for better error handling by explicitly indicating failure cases.

147-151: LGTM! Proper use of RAII pattern.

The implementation correctly initializes the accepting state with its matching variable ID during construction, following RAII principles.

219-237: LGTM! Robust error handling in serialization.

The implementation properly handles potential failures during state serialization by checking optional_serialized_state and propagating the failure.

Co-authored-by: Lin Zhihao <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

src/log_surgeon/finite_automata/TagOperation.hpp (1)

39-48: Add [[fallthrough]] attribute to switch cases.

Even though the cases end with return statements, it's good practice to add the [[fallthrough]] attribute to make the intention explicit and future-proof the code.

Apply this diff:

     switch (m_type) {
         case TagOperationType::Set:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "p");
         case TagOperationType::Negate:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "n");
         default:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "?");

src/log_surgeon/finite_automata/NfaState.hpp (1)

54-69: Reserve capacity for tag_ops vector.

Consider reserving capacity for the tag_ops vector to avoid potential reallocations.

Apply this diff:

     auto add_spontaneous_transition(
             TagOperationType const op_type,
             std::vector<tag_id_t> const& tag_ids,
             NfaState const* dest_state
     ) -> void {
         std::vector<TagOperation> tag_ops;
         tag_ops.reserve(tag_ids.size());
+        m_spontaneous_transitions.reserve(m_spontaneous_transitions.size() + 1);
         for (auto const tag_id : tag_ids) {
             tag_ops.emplace_back(tag_id, op_type);
         }
         m_spontaneous_transitions.emplace_back(std::move(tag_ops), dest_state);

src/log_surgeon/finite_automata/Nfa.hpp (2)

147-151: Improve const correctness in state creation methods.

Consider marking parameters as const references where appropriate to improve const correctness.

Apply this diff:

-    auto new_state_from_negative_captures(
+    auto new_state_from_negative_captures(
             std::vector<Capture const*> const& captures,
-            TypedNfaState const* dest_state
+            TypedNfaState const* const dest_state
     ) -> TypedNfaState*;

-    auto new_start_and_end_states_from_positive_capture(
+    auto new_start_and_end_states_from_positive_capture(
             Capture const* capture,
-            TypedNfaState const* dest_state
+            TypedNfaState const* const dest_state
     ) -> std::pair<TypedNfaState*, TypedNfaState*>;

Also applies to: 154-169, 172-184

187-217: Reserve capacity for serialized_states vector.

Consider reserving capacity for the serialized_states vector to avoid potential reallocations.

Apply this diff:

     std::vector<std::string> serialized_states;
+    serialized_states.reserve(traversal_order.size());
     for (auto const* state : traversal_order) {
         auto const optional_serialized_state{state->serialize(state_ids)};
         if (false == optional_serialized_state.has_value()) {

Also applies to: 220-237

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a00bdc and c29f6b9.

📒 Files selected for processing (7)

CMakeLists.txt (1 hunks)
src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)
src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp (1 hunks)
src/log_surgeon/finite_automata/NfaState.hpp (6 hunks)
src/log_surgeon/finite_automata/RegexAST.hpp (4 hunks)
src/log_surgeon/finite_automata/TagOperation.hpp (1 hunks)
tests/CMakeLists.txt (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

tests/CMakeLists.txt
CMakeLists.txt

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/TagOperation.hpp
src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp
src/log_surgeon/finite_automata/Nfa.hpp
src/log_surgeon/finite_automata/NfaState.hpp
src/log_surgeon/finite_automata/RegexAST.hpp

🧠 Learnings (4)

src/log_surgeon/finite_automata/TagOperation.hpp (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#50
File: src/log_surgeon/finite_automata/Tag.hpp:0-0
Timestamp: 2024-11-18T16:45:46.074Z
Learning: The class `TagPositions` was removed from `src/log_surgeon/finite_automata/Tag.hpp` as it is no longer needed.

src/log_surgeon/finite_automata/Nfa.hpp (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

src/log_surgeon/finite_automata/NfaState.hpp (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

src/log_surgeon/finite_automata/RegexAST.hpp (2)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexAST.hpp:700-700
Timestamp: 2024-11-13T22:38:19.472Z
Learning: In `RegexASTCapture`, `m_tag` must always be non-null.

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: build (ubuntu-22.04, Release)
GitHub Check: build (ubuntu-22.04, Debug)
GitHub Check: build (macos-latest, Release)

🔇 Additional comments (8)

src/log_surgeon/finite_automata/RegexAST.hpp (3)

24-24: LGTM!

The inclusion of TagOperation.hpp aligns with the refactoring to use RAII for NFA state creation.

864-866: LGTM! Improved efficiency and aligned with refactoring.

The changes improve efficiency by using pre-increment and align with the transition to spontaneous transitions.

911-943: LGTM! Well-documented transition to RAII-based state creation.

The changes effectively implement RAII for state creation while maintaining clear documentation of the NFA structure.

src/log_surgeon/finite_automata/TagOperation.hpp (2)

13-16: LGTM! Well-defined enum class.

The enum class is well-defined with clear operation types and appropriate underlying type.

18-35: LGTM! Well-structured class with efficient comparison operators.

The class is well-designed with appropriate member variables and efficient comparison operators using std::tie.

src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp (1)

23-62: LGTM! Well-designed template class with efficient operations.

The class is well-documented and efficiently implemented:

Constructor uses std::move for vector transfer

Serialization uses std::ranges::views::transform for efficient transformation

Boolean expressions follow coding guidelines

src/log_surgeon/finite_automata/NfaState.hpp (2)

40-52: LGTM! Well-structured constructors following RAII principles.

The changes effectively implement the transition from tagged to spontaneous transitions while maintaining RAII principles.

Also applies to: 117-117

172-188: LGTM! Well-implemented methods following coding guidelines.

The epsilon closure and serialization methods are well-implemented with proper error handling and boolean expression formatting.

Also applies to: 190-222

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/log_surgeon/finite_automata/Nfa.hpp (1)

114-115: Consider addressing the TODO comment about capture group naming.

The current limitation of enforcing unique naming across capture groups could be improved to support more use cases.

Would you like me to help design a solution that allows for non-unique capture group names while maintaining proper functionality?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c29f6b9 and 3c5de6a.

📒 Files selected for processing (3)

src/log_surgeon/LexicalRule.hpp (1 hunks)
src/log_surgeon/finite_automata/Nfa.hpp (2 hunks)
src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/log_surgeon/LexicalRule.hpp

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/Nfa.hpp
src/log_surgeon/finite_automata/NfaState.hpp

🧠 Learnings (1)

src/log_surgeon/finite_automata/NfaState.hpp (1)

Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: build (ubuntu-22.04, Release)
GitHub Check: build (ubuntu-22.04, Debug)

🔇 Additional comments (5)

src/log_surgeon/finite_automata/NfaState.hpp (3)

40-44: LGTM! Well-designed constructors using RAII.

The constructors properly initialize state properties at construction time, following RAII principles. The explicit keyword and initialization list are used appropriately.

177-178: LGTM! Follows coding guidelines.

The condition false == stack.empty() follows the preferred style guideline, and the DFS traversal implementation is correct.

209-210: LGTM! Good error handling.

The method properly uses std::optional for error handling and follows the coding guidelines for condition style.
src/log_surgeon/finite_automata/Nfa.hpp (2)

46-50: LGTM! Well-implemented accepting state creation.

The method properly leverages RAII through the constructor and is well-documented. The [[nodiscard]] attribute ensures the caller doesn't accidentally ignore the returned state.

Also applies to: 146-151

206-206:
Details

❓ Verification inconclusive

Address UTF-8 handling in traversal.

The TODO indicates that UTF-8 case handling is missing in the traversal. This could affect functionality when processing non-ASCII text.

🏁 Script executed:
#!/bin/bash
# Check if there are any UTF-8 specific test cases
rg -l "utf-?8" test/
Length of output: 68
Attention: Verify and address UTF-8 handling in traversal

Line 206 of src/log_surgeon/finite_automata/Nfa.hpp contains a TODO for handling the UTF-8 case. Our initial search using rg -l "utf-?8" in the test/ directory did not return any results, which suggests that there may not be dedicated tests verifying UTF-8 functionality. Please manually verify that non-ASCII text is processed correctly once added, and consider adding test cases to ensure robust UTF-8 support when this feature is implemented.

src/log_surgeon/finite_automata/Nfa.hpp

Co-authored-by: davidlion <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/log_surgeon/finite_automata/Nfa.hpp (1)
46-50: Enhance method documentation for clarity.

Consider expanding the documentation to clarify:

What an "accepting state" represents in the context of an NFA

The ownership semantics of the returned pointer (e.g., whether the NFA retains ownership)
     /*
+     * Creates a new accepting state in the NFA. An accepting state represents a final
+     * state that indicates successful pattern recognition.
      * @param matching_variable_id The id for the variable matched by this state.
-     * @return A pointer to the newly created accepting NFA state.
+     * @return A pointer to the newly created accepting NFA state. The NFA retains
+     *         ownership of the state through its internal vector of unique_ptrs.
      */

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c5de6a and 6535d53.

📒 Files selected for processing (1)

src/log_surgeon/finite_automata/Nfa.hpp (2 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ...

**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/log_surgeon/finite_automata/Nfa.hpp

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: build (ubuntu-22.04, Release)
GitHub Check: build (ubuntu-22.04, Debug)
GitHub Check: build (macos-latest, Release)

🔇 Additional comments (3)

src/log_surgeon/finite_automata/Nfa.hpp (3)

146-151: LGTM! Implementation follows RAII principles.

The implementation correctly creates a new accepting state using RAII, maintaining consistency with the existing state creation pattern.

58-61: LGTM! Method names are more descriptive.

The renamed methods better convey their purpose and relationship to capture operations.

Also applies to: 72-75

219-237: LGTM! Improved error handling with std::optional.

The implementation now properly handles and propagates serialization failures, making error cases more explicit and safer to handle.

SharafMohamed requested a review from LinZhihao-723 February 3, 2025 17:17

SharafMohamed mentioned this pull request Feb 3, 2025

fix(clang-tidy): Fix violations in files relevant to TDFA determinization. #85

Merged

SharafMohamed added 18 commits February 13, 2025 09:35

Change concept of tagged transitions to spontaenous transitions.

ad30714

Fix typo.

8d8e5ee

Return by reference; Fix docstring.

5c01aa3

Rename method to new_state_from_negative_captures for clarity.

f695c56

Update docstring of new_state_from_negative_captures.

fc1ef67

Rename to new_start_and_end_states_from_positive_capture.

dbd6d2c

Update spontaneous doc strings.

cd7e3a9

Update Nfa::Serialize docstring.

43c1764

Clean up NfaState; Fix double declaration of tag_id_t.

f42e341

Clean up Nfa.hpp.

563a619

Remove overload of add_spontaneous_transitions; Fix compiler errors t…

c55b5b3

…hat result from the change.

Use uniform initialization.

eb10e95

Clang-tidy.

ad23402

Add const.

452b6b4

Fix typo.

04a65c2

Add noexcept; Add None case.

abc1006

Fix docstring.

7967082

Remove noexcept.

9cafc39

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

src/log_surgeon/finite_automata/Nfa.hpp Show resolved Hide resolved

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

SharafMohamed and others added 5 commits February 14, 2025 13:05

Rename to NfaSpontaneousTransition.hpp.

01b4698

Removing unused constructor.

2a0d573

Add ticks for var in return field.

a454838

Co-authored-by: Lin Zhihao <[email protected]>

Add [nodiscard] to comparator methods.

472b674

Co-authored-by: Lin Zhihao <[email protected]>

Refactor serialize.

35fbcbd

SharafMohamed added 4 commits February 14, 2025 13:42

Remove None from TagOperationType; Add default case.

8a1fe9e

Change find to at.

3f8e6fb

Fix at call.

3c611ef

Fix cmake order.

9ccd118

coderabbitai bot reviewed Feb 14, 2025

View reviewed changes

Clang tidy fixes for TDFA related files.

c9464ab

SharafMohamed force-pushed the raii-accepting-state branch from 64327da to c9464ab Compare February 14, 2025 20:19

Merge branch 'main' into raii-accepting-state

3c5de6a

coderabbitai bot reviewed Feb 16, 2025

View reviewed changes

LinZhihao-723 requested a review from davidlion February 16, 2025 21:21

davidlion reviewed Feb 19, 2025

View reviewed changes

src/log_surgeon/finite_automata/Nfa.hpp Outdated Show resolved Hide resolved

Update src/log_surgeon/finite_automata/Nfa.hpp

6535d53

Co-authored-by: davidlion <[email protected]>

coderabbitai bot reviewed Feb 20, 2025

View reviewed changes

davidlion approved these changes Feb 20, 2025

View reviewed changes

davidlion merged commit 2dbbc95 into y-scope:main Feb 20, 2025
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: Create accepting NFA states using RAII.#84

refactor: Create accepting NFA states using RAII.#84
davidlion merged 30 commits intoy-scope:mainfrom
SharafMohamed:raii-accepting-state

SharafMohamed commented Feb 3, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 3, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

kirkrodrigues commented Feb 13, 2025

Uh oh!

coderabbitai bot commented Feb 13, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

SharafMohamed commented Feb 3, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Validation performed

Summary by CodeRabbit

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

kirkrodrigues commented Feb 13, 2025

Uh oh!

coderabbitai bot commented Feb 13, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SharafMohamed commented Feb 3, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 3, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)