Skip to content

refactor: Create accepting NFA states using RAII.#84

Merged
davidlion merged 30 commits intoy-scope:mainfrom
SharafMohamed:raii-accepting-state
Feb 20, 2025
Merged

refactor: Create accepting NFA states using RAII.#84
davidlion merged 30 commits intoy-scope:mainfrom
SharafMohamed:raii-accepting-state

Conversation

@SharafMohamed
Copy link
Contributor

@SharafMohamed SharafMohamed commented Feb 3, 2025

Description

  • Instead of creating the NFA state and setting the accepting state parameters, everything is set during state creation for accepting states.
  • Ideally we could apply this to all fields for an NFA, and then we could create const NFA objects, but the transitions aren't known during state creation with the current code flow.

Validation performed

  • Existing tests pass.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Implemented a simplified approach for creating acceptance states, contributing to a more consistent pattern evaluation process.
    • Enhanced error handling during serialization to improve reliability.
  • Refactor

    • Consolidated multiple state-creation steps into single streamlined operations.
    • Redesigned transition handling using a new spontaneous model, optimising overall internal performance and maintainability.

@coderabbitai
Copy link

coderabbitai bot commented Feb 3, 2025

Walkthrough

The changes refactor state creation and transition handling in the NFA system. The LexicalRule class now calls a streamlined new_accepting_state method instead of multiple separate calls. In the NFA, new methods were introduced and existing methods renamed or removed to better manage accepting states and transitions. Additionally, the NFA state's API was restructured to use spontaneous transitions. Serialization now returns an optional string to indicate potential failures.

Changes

File(s) Change Summary
src/log_surgeon/LexicalRule.hpp Revised add_to_nfa to replace multiple state creation calls with a single call to nfa->new_accepting_state(m_variable_id). The negative capture logic remains unchanged.
src/log_surgeon/finite_automata/Nfa.hpp and src/log_surgeon/finite_automata/NfaState.hpp Nfa.hpp: Added new_accepting_state, renamed and removed methods to streamline state creation, and updated serialize to return std::optional<std::string>.
NfaState.hpp: Updated constructors, introduced spontaneous transitions, and revised state property accessors and transition methods.

Sequence Diagram(s)

sequenceDiagram
  participant LR as LexicalRule
  participant NFA as Nfa
  participant NS as NfaState

  LR->>NFA: add_to_nfa(nfa)
  NFA->>NS: new_accepting_state(m_variable_id)
  NS-->>NFA: Return new state
  NFA-->>LR: NFA updated with accepting state
Loading

Possibly related PRs

Suggested reviewers

  • LinZhihao-723
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@kirkrodrigues
Copy link
Member

@coderabbitai configuration

@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2025

CodeRabbit configuration
language: en-CA
tone_instructions: ''
early_access: false
enable_free_tier: true
reviews:
  profile: chill
  request_changes_workflow: false
  high_level_summary: true
  high_level_summary_placeholder: '@coderabbitai summary'
  high_level_summary_in_walkthrough: false
  auto_title_placeholder: '@coderabbitai'
  auto_title_instructions: ''
  review_status: true
  commit_status: true
  fail_commit_status: false
  collapse_walkthrough: false
  changed_files_summary: true
  sequence_diagrams: true
  assess_linked_issues: true
  related_issues: true
  related_prs: true
  suggested_labels: true
  auto_apply_labels: false
  suggested_reviewers: true
  poem: false
  labeling_instructions: []
  path_filters: []
  path_instructions:
    - path: '**/*.{cpp,hpp,java,js,jsx,ts,tsx}'
      instructions: '- Prefer `false == <expression>` rather than `!<expression>`.'
  abort_on_close: true
  auto_review:
    enabled: true
    auto_incremental_review: true
    ignore_title_keywords: []
    labels: []
    drafts: false
    base_branches: []
  finishing_touches:
    docstrings:
      enabled: true
  tools:
    shellcheck:
      enabled: true
    ruff:
      enabled: true
    markdownlint:
      enabled: true
    github-checks:
      enabled: true
      timeout_ms: 90000
    languagetool:
      enabled: true
      enabled_only: false
      level: default
      enabled_rules: []
      disabled_rules: []
      enabled_categories: []
      disabled_categories: []
    biome:
      enabled: true
    hadolint:
      enabled: true
    swiftlint:
      enabled: true
    phpstan:
      enabled: true
      level: default
    golangci-lint:
      enabled: true
    yamllint:
      enabled: true
    gitleaks:
      enabled: true
    checkov:
      enabled: true
    detekt:
      enabled: true
    eslint:
      enabled: true
    rubocop:
      enabled: true
    buf:
      enabled: true
    regal:
      enabled: true
    actionlint:
      enabled: true
    pmd:
      enabled: true
    cppcheck:
      enabled: true
    semgrep:
      enabled: true
    circleci:
      enabled: true
    ast-grep:
      packages: []
      rule_dirs: []
      util_dirs: []
      essential_rules: true
chat:
  auto_reply: true
  integrations:
    jira:
      usage: auto
    linear:
      usage: auto
knowledge_base:
  opt_out: false
  web_search:
    enabled: true
  learnings:
    scope: auto
  issues:
    scope: auto
  jira:
    usage: auto
    project_keys: []
  linear:
    usage: auto
    team_keys: []
  pull_requests:
    scope: auto

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
CMakeLists.txt (1)

1-1: ⚠️ Potential issue

Update CMake minimum version to ensure future compatibility.

The current minimum version (3.5.1) will be incompatible with future CMake versions.

Apply this diff to update the minimum version:

-cmake_minimum_required(VERSION 3.5.1)
+cmake_minimum_required(VERSION 3.10)
🧰 Tools
🪛 GitHub Actions: Build

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

🧹 Nitpick comments (4)
src/log_surgeon/finite_automata/NfaState.hpp (1)

92-92: Returning a bool by const reference may be unnecessary.
It is typically more common to return a simple bool by value. Returning bool const& can be confusing and rarely provides performance benefits.

- [[nodiscard]] auto is_accepting() const -> bool const& { return m_accepting; }
+ [[nodiscard]] auto is_accepting() const -> bool { return m_accepting; }
src/log_surgeon/finite_automata/TagOperation.hpp (1)

40-49: Add a default case to handle future enum values.

The switch statement should include a default case to handle any future additions to the TagOperationType enum.

     [[nodiscard]] auto serialize() const -> std::string {
         switch (m_type) {
             case TagOperationType::Set:
                 return fmt::format("{}{}", m_tag_id, "p");
             case TagOperationType::Negate:
                 return fmt::format("{}{}", m_tag_id, "n");
             case TagOperationType::None:
                 return "none";
+            default:
+                throw std::runtime_error("Unknown TagOperationType");
         }
     }
src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1)

53-55: Consider a more idiomatic approach while maintaining the guideline.

While following the guideline for false == comparison, the code could be more idiomatic.

-    if (false == state_ids.contains(m_dest_state)) {
+    if (false == state_ids.contains(m_dest_state) || nullptr == m_dest_state) {
         return std::nullopt;
     }
tests/test-nfa.cpp (1)

24-81: Consider adding test cases for failure scenarios.

The test thoroughly covers the success path, but it would be beneficial to add test cases that verify:

  • Invalid NFA states
  • Serialization failures
  • Edge cases in the NFA structure
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6904e51 and 483bff2.

📒 Files selected for processing (11)
  • CMakeLists.txt (1 hunks)
  • src/log_surgeon/Lexer.hpp (1 hunks)
  • src/log_surgeon/LexicalRule.hpp (1 hunks)
  • src/log_surgeon/finite_automata/Dfa.hpp (1 hunks)
  • src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)
  • src/log_surgeon/finite_automata/NfaState.hpp (6 hunks)
  • src/log_surgeon/finite_automata/RegexAST.hpp (4 hunks)
  • src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1 hunks)
  • src/log_surgeon/finite_automata/TagOperation.hpp (1 hunks)
  • tests/CMakeLists.txt (1 hunks)
  • tests/test-nfa.cpp (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • src/log_surgeon/Lexer.hpp
  • src/log_surgeon/finite_automata/Dfa.hpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • src/log_surgeon/LexicalRule.hpp
  • src/log_surgeon/finite_automata/TagOperation.hpp
  • src/log_surgeon/finite_automata/Nfa.hpp
  • src/log_surgeon/finite_automata/RegexAST.hpp
  • src/log_surgeon/finite_automata/SpontaneousTransition.hpp
  • tests/test-nfa.cpp
  • src/log_surgeon/finite_automata/NfaState.hpp
🧠 Learnings (4)
CMakeLists.txt (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.
tests/CMakeLists.txt (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#47
File: src/log_surgeon/finite_automata/TaggedTransition.hpp:16-37
Timestamp: 2024-11-10T16:46:58.543Z
Learning: In `src/log_surgeon/finite_automata/TaggedTransition.hpp`, the classes `PositiveTaggedTransition` and `NegativeTaggedTransition` currently do not share enough functionality to justify refactoring into a common base class.
src/log_surgeon/finite_automata/RegexAST.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexAST.hpp:700-700
Timestamp: 2024-11-13T22:38:19.472Z
Learning: In `RegexASTCapture`, `m_tag` must always be non-null.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
🪛 GitHub Actions: Build
CMakeLists.txt

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

tests/CMakeLists.txt

[warning] 1-1: Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument value.

src/log_surgeon/finite_automata/Nfa.hpp

[error] 141-141: Out-of-line definition of 'new_accepting_state' does not match any declaration in 'Nfa'.

🔇 Additional comments (19)
src/log_surgeon/finite_automata/NfaState.hpp (7)

4-4: No issues with the newly added headers.
They appear necessary for the introduced structures and functions, and they conform to typical practice.

Also applies to: 9-9, 13-13, 15-15, 18-18, 21-21, 24-24


40-40: Instantiation of default constructor seems fine.
No issues found.


42-44: Accepting state constructor is clearly defined.
This addition cleanly sets the state to accepting and the matching variable ID as needed.


46-52: Constructor with TagOperation logic is coherent.
It creates a new state and delegates to add_spontaneous_transition(...). Implementation looks correct.


54-65: Spontaneous transition addition is well-structured.
Reserving capacity for tag_ops and then moving into m_spontaneous_transitions is efficient and clear.


178-179: Epsilon closure logic is correct.
Pushing spontaneous transitions onto the stack properly ensures correct closure computation.


201-208: Optional check ensures safe serialization flow.
Returning std::nullopt early prevents partial or invalid output.

src/log_surgeon/finite_automata/Nfa.hpp (3)

6-6: Included headers look appropriate.
No redundant elements detected.

Also applies to: 15-15, 18-18, 20-20


206-208: Spontaneous transitions handled in BFS.
This complements the new transition mechanism nicely. No issues found.


224-229: Serializing states with optional return is robust.
The std::nullopt check is a sound approach for failure signalling.

src/log_surgeon/finite_automata/RegexAST.hpp (3)

24-24: New include for TagOperation is justified.
It corresponds with the usage of TagOperationType references.


114-127: Revised logic for negative captures is coherent.
Switching to spontaneous transitions aligns with the updated NFA model.


910-949: Capture group NFA construction is logically consistent.
The commented diagram clarifies spontaneous transitions for start and end tags. No immediate issues.

src/log_surgeon/LexicalRule.hpp (1)

43-46: Great use of RAII principles!

The refactoring improves state initialization by combining state creation and accepting state setup into a single operation, making the code more robust and less error-prone.

src/log_surgeon/finite_automata/TagOperation.hpp (1)

13-17: Well-structured enum class definition!

The use of an enum class with uint8_t as the underlying type is a good choice for type safety and memory efficiency.

src/log_surgeon/finite_automata/SpontaneousTransition.hpp (1)

23-31: Well-designed class with clear constructors!

The class design is clean with good use of move semantics in the constructor taking tag operations.

tests/test-nfa.cpp (1)

64-68: Good improvement in error handling!

The change to handle optional serialization result makes the test more robust by explicitly checking for successful serialization.

tests/CMakeLists.txt (1)

12-14:

Details

❓ Verification inconclusive

LGTM! Verify test coverage for the new transition model.

The changes reflect the transition from tagged to spontaneous transitions, aligning with the PR objectives.

Run the following script to verify test coverage for the new transition model:


🏁 Script executed:

#!/bin/bash
# Description: Check test coverage for the new transition model.

# Test: Search for test cases using SpontaneousTransition
rg -A 5 'SpontaneousTransition' test-nfa.cpp

# Test: Search for test cases using TagOperation
rg -A 5 'TagOperation' test-nfa.cpp

Length of output: 186


Action: Verify Comprehensive Test Coverage for the New Transition Model

The changes in tests/CMakeLists.txt correctly reflect the PR objectives by updating source file references from the tagged to the spontaneous transition model. However, the initial test coverage script targeted a non-existent file (test-nfa.cpp), as confirmed by the error output. This makes it unclear whether tests using SpontaneousTransition and TagOperation exist elsewhere in the repository.

  • Next Steps:
    • Manually verify that test cases covering the new transition model exist.

    • Alternatively, update the test coverage script to search across the entire tests directory. For instance:

      #!/bin/bash
      # Re-verify test coverage by searching all test files in the tests directory
      fd -e cpp tests | xargs rg -A 5 'SpontaneousTransition'
      fd -e cpp tests | xargs rg -A 5 'TagOperation'

Please ensure that the new transition model is fully exercised by existing tests or add appropriate tests if necessary.

CMakeLists.txt (1)

81-83: LGTM! Source files updated to reflect the new transition model.

The changes align with the PR objectives to refactor NFA states using RAII.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/log_surgeon/finite_automata/Nfa.hpp (1)

187-217: Consider error handling in traversal order.

The get_bfs_traversal_order() method assumes all operations will succeed. Since serialize() now handles failures, consider propagating potential errors from this method as well.

-auto Nfa<TypedNfaState>::get_bfs_traversal_order() const -> std::vector<TypedNfaState const*> {
+auto Nfa<TypedNfaState>::get_bfs_traversal_order() const -> std::optional<std::vector<TypedNfaState const*>> {
     // ... existing implementation ...
-    return visited_order;
+    return std::make_optional(visited_order);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 483bff2 and 4a00bdc.

📒 Files selected for processing (1)
  • src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • src/log_surgeon/finite_automata/Nfa.hpp
🔇 Additional comments (5)
src/log_surgeon/finite_automata/Nfa.hpp (5)

6-6: LGTM! Well-organized header inclusions.

The new headers are properly organized following the standard convention: standard library headers first, followed by third-party headers, and then project-specific headers.

Also applies to: 15-15, 18-20


46-50: LGTM! Declaration matches previous review comment.

The declaration of new_accepting_state has been added as suggested in the previous review, resolving the build error.


87-87: LGTM! Improved error handling with std::optional.

The change to return std::optional<std::string> from serialize() allows for better error handling by explicitly indicating failure cases.


147-151: LGTM! Proper use of RAII pattern.

The implementation correctly initializes the accepting state with its matching variable ID during construction, following RAII principles.


219-237: LGTM! Robust error handling in serialization.

The implementation properly handles potential failures during state serialization by checking optional_serialized_state and propagating the failure.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
src/log_surgeon/finite_automata/TagOperation.hpp (1)

39-48: Add [[fallthrough]] attribute to switch cases.

Even though the cases end with return statements, it's good practice to add the [[fallthrough]] attribute to make the intention explicit and future-proof the code.

Apply this diff:

     switch (m_type) {
         case TagOperationType::Set:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "p");
         case TagOperationType::Negate:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "n");
         default:
+            [[fallthrough]];
             return fmt::format("{}{}", m_tag_id, "?");
src/log_surgeon/finite_automata/NfaState.hpp (1)

54-69: Reserve capacity for tag_ops vector.

Consider reserving capacity for the tag_ops vector to avoid potential reallocations.

Apply this diff:

     auto add_spontaneous_transition(
             TagOperationType const op_type,
             std::vector<tag_id_t> const& tag_ids,
             NfaState const* dest_state
     ) -> void {
         std::vector<TagOperation> tag_ops;
         tag_ops.reserve(tag_ids.size());
+        m_spontaneous_transitions.reserve(m_spontaneous_transitions.size() + 1);
         for (auto const tag_id : tag_ids) {
             tag_ops.emplace_back(tag_id, op_type);
         }
         m_spontaneous_transitions.emplace_back(std::move(tag_ops), dest_state);
src/log_surgeon/finite_automata/Nfa.hpp (2)

147-151: Improve const correctness in state creation methods.

Consider marking parameters as const references where appropriate to improve const correctness.

Apply this diff:

-    auto new_state_from_negative_captures(
+    auto new_state_from_negative_captures(
             std::vector<Capture const*> const& captures,
-            TypedNfaState const* dest_state
+            TypedNfaState const* const dest_state
     ) -> TypedNfaState*;

-    auto new_start_and_end_states_from_positive_capture(
+    auto new_start_and_end_states_from_positive_capture(
             Capture const* capture,
-            TypedNfaState const* dest_state
+            TypedNfaState const* const dest_state
     ) -> std::pair<TypedNfaState*, TypedNfaState*>;

Also applies to: 154-169, 172-184


187-217: Reserve capacity for serialized_states vector.

Consider reserving capacity for the serialized_states vector to avoid potential reallocations.

Apply this diff:

     std::vector<std::string> serialized_states;
+    serialized_states.reserve(traversal_order.size());
     for (auto const* state : traversal_order) {
         auto const optional_serialized_state{state->serialize(state_ids)};
         if (false == optional_serialized_state.has_value()) {

Also applies to: 220-237

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a00bdc and c29f6b9.

📒 Files selected for processing (7)
  • CMakeLists.txt (1 hunks)
  • src/log_surgeon/finite_automata/Nfa.hpp (7 hunks)
  • src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp (1 hunks)
  • src/log_surgeon/finite_automata/NfaState.hpp (6 hunks)
  • src/log_surgeon/finite_automata/RegexAST.hpp (4 hunks)
  • src/log_surgeon/finite_automata/TagOperation.hpp (1 hunks)
  • tests/CMakeLists.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/CMakeLists.txt
  • CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • src/log_surgeon/finite_automata/TagOperation.hpp
  • src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp
  • src/log_surgeon/finite_automata/Nfa.hpp
  • src/log_surgeon/finite_automata/NfaState.hpp
  • src/log_surgeon/finite_automata/RegexAST.hpp
🧠 Learnings (4)
src/log_surgeon/finite_automata/TagOperation.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#50
File: src/log_surgeon/finite_automata/Tag.hpp:0-0
Timestamp: 2024-11-18T16:45:46.074Z
Learning: The class `TagPositions` was removed from `src/log_surgeon/finite_automata/Tag.hpp` as it is no longer needed.
src/log_surgeon/finite_automata/Nfa.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
src/log_surgeon/finite_automata/RegexAST.hpp (2)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexAST.hpp:700-700
Timestamp: 2024-11-13T22:38:19.472Z
Learning: In `RegexASTCapture`, `m_tag` must always be non-null.
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build (ubuntu-22.04, Release)
  • GitHub Check: build (ubuntu-22.04, Debug)
  • GitHub Check: build (macos-latest, Release)
🔇 Additional comments (8)
src/log_surgeon/finite_automata/RegexAST.hpp (3)

24-24: LGTM!

The inclusion of TagOperation.hpp aligns with the refactoring to use RAII for NFA state creation.


864-866: LGTM! Improved efficiency and aligned with refactoring.

The changes improve efficiency by using pre-increment and align with the transition to spontaneous transitions.


911-943: LGTM! Well-documented transition to RAII-based state creation.

The changes effectively implement RAII for state creation while maintaining clear documentation of the NFA structure.

src/log_surgeon/finite_automata/TagOperation.hpp (2)

13-16: LGTM! Well-defined enum class.

The enum class is well-defined with clear operation types and appropriate underlying type.


18-35: LGTM! Well-structured class with efficient comparison operators.

The class is well-designed with appropriate member variables and efficient comparison operators using std::tie.

src/log_surgeon/finite_automata/NfaSpontaneousTransition.hpp (1)

23-62: LGTM! Well-designed template class with efficient operations.

The class is well-documented and efficiently implemented:

  • Constructor uses std::move for vector transfer
  • Serialization uses std::ranges::views::transform for efficient transformation
  • Boolean expressions follow coding guidelines
src/log_surgeon/finite_automata/NfaState.hpp (2)

40-52: LGTM! Well-structured constructors following RAII principles.

The changes effectively implement the transition from tagged to spontaneous transitions while maintaining RAII principles.

Also applies to: 117-117


172-188: LGTM! Well-implemented methods following coding guidelines.

The epsilon closure and serialization methods are well-implemented with proper error handling and boolean expression formatting.

Also applies to: 190-222

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/log_surgeon/finite_automata/Nfa.hpp (1)

114-115: Consider addressing the TODO comment about capture group naming.

The current limitation of enforcing unique naming across capture groups could be improved to support more use cases.

Would you like me to help design a solution that allows for non-unique capture group names while maintaining proper functionality?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c29f6b9 and 3c5de6a.

📒 Files selected for processing (3)
  • src/log_surgeon/LexicalRule.hpp (1 hunks)
  • src/log_surgeon/finite_automata/Nfa.hpp (2 hunks)
  • src/log_surgeon/finite_automata/NfaState.hpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/log_surgeon/LexicalRule.hpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,ts,tsx}`: - Prefer `false ==

**/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • src/log_surgeon/finite_automata/Nfa.hpp
  • src/log_surgeon/finite_automata/NfaState.hpp
🧠 Learnings (1)
src/log_surgeon/finite_automata/NfaState.hpp (1)
Learnt from: SharafMohamed
PR: y-scope/log-surgeon#48
File: src/log_surgeon/finite_automata/RegexNFAState.hpp:0-0
Timestamp: 2024-11-13T20:02:13.737Z
Learning: In `src/log_surgeon/finite_automata/RegexNFAState.hpp`, the constructor `RegexNFAState(std::set<Tag const*> tags, RegexNFAState const* dest_state)` has been updated to use `std::vector<Tag const*> tags` instead of `std::set`.
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: build (ubuntu-22.04, Release)
  • GitHub Check: build (ubuntu-22.04, Debug)
🔇 Additional comments (5)
src/log_surgeon/finite_automata/NfaState.hpp (3)

40-44: LGTM! Well-designed constructors using RAII.

The constructors properly initialize state properties at construction time, following RAII principles. The explicit keyword and initialization list are used appropriately.


177-178: LGTM! Follows coding guidelines.

The condition false == stack.empty() follows the preferred style guideline, and the DFS traversal implementation is correct.


209-210: LGTM! Good error handling.

The method properly uses std::optional for error handling and follows the coding guidelines for condition style.

src/log_surgeon/finite_automata/Nfa.hpp (2)

46-50: LGTM! Well-implemented accepting state creation.

The method properly leverages RAII through the constructor and is well-documented. The [[nodiscard]] attribute ensures the caller doesn't accidentally ignore the returned state.

Also applies to: 146-151


206-206:

Details

❓ Verification inconclusive

Address UTF-8 handling in traversal.

The TODO indicates that UTF-8 case handling is missing in the traversal. This could affect functionality when processing non-ASCII text.


🏁 Script executed:

#!/bin/bash
# Check if there are any UTF-8 specific test cases
rg -l "utf-?8" test/

Length of output: 68


Attention: Verify and address UTF-8 handling in traversal

Line 206 of src/log_surgeon/finite_automata/Nfa.hpp contains a TODO for handling the UTF-8 case. Our initial search using rg -l "utf-?8" in the test/ directory did not return any results, which suggests that there may not be dedicated tests verifying UTF-8 functionality. Please manually verify that non-ASCII text is processed correctly once added, and consider adding test cases to ensure robust UTF-8 support when this feature is implemented.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/log_surgeon/finite_automata/Nfa.hpp (1)

46-50: Enhance method documentation for clarity.

Consider expanding the documentation to clarify:

  • What an "accepting state" represents in the context of an NFA
  • The ownership semantics of the returned pointer (e.g., whether the NFA retains ownership)
     /*
+     * Creates a new accepting state in the NFA. An accepting state represents a final
+     * state that indicates successful pattern recognition.
      * @param matching_variable_id The id for the variable matched by this state.
-     * @return A pointer to the newly created accepting NFA state.
+     * @return A pointer to the newly created accepting NFA state. The NFA retains
+     *         ownership of the state through its internal vector of unique_ptrs.
      */
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c5de6a and 6535d53.

📒 Files selected for processing (1)
  • src/log_surgeon/finite_automata/Nfa.hpp (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ...

**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

  • src/log_surgeon/finite_automata/Nfa.hpp
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build (ubuntu-22.04, Release)
  • GitHub Check: build (ubuntu-22.04, Debug)
  • GitHub Check: build (macos-latest, Release)
🔇 Additional comments (3)
src/log_surgeon/finite_automata/Nfa.hpp (3)

146-151: LGTM! Implementation follows RAII principles.

The implementation correctly creates a new accepting state using RAII, maintaining consistency with the existing state creation pattern.


58-61: LGTM! Method names are more descriptive.

The renamed methods better convey their purpose and relationship to capture operations.

Also applies to: 72-75


219-237: LGTM! Improved error handling with std::optional.

The implementation now properly handles and propagates serialization failures, making error cases more explicit and safer to handle.

@davidlion davidlion merged commit 2dbbc95 into y-scope:main Feb 20, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants