-
Notifications
You must be signed in to change notification settings - Fork 83
refactor(clp): Prepare Grep code for deduplication with clp-s:
#1143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…to dictionary entries
…plit up changes" This reverts commit 70ef5e3.
…m clp specific implemantions.
WalkthroughThis change refactors and modularizes the query parsing and matching logic for log search functionality. It introduces new components ( Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant MainProgram
participant GrepCore
participant QueryToken
participant LogtypeDict
participant VarDict
User->>MainProgram: Provide search string
MainProgram->>GrepCore: process_raw_query(logtype_dict, var_dict, search_string, ...)
GrepCore->>QueryToken: Tokenize search string
QueryToken-->>GrepCore: Return tokenized representation
GrepCore->>LogtypeDict: Lookup logtype matches (if needed)
GrepCore->>VarDict: Lookup variable matches (if needed)
GrepCore-->>MainProgram: Return Query object with sub-queries
MainProgram->>Query: calculate_ids_of_matching_segments(get_segments_by_logtype_id, get_segments_by_var_id)
Query->>LogtypeDict: Get segments for logtype IDs
Query->>VarDict: Get segments for variable IDs
Query-->>MainProgram: Matching segment IDs
MainProgram-->>User: Present search results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (1)**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}⚙️ CodeRabbit Configuration File
Files:
🧠 Learnings (21)📓 Common learnings📚 Learning: in the clp project, when reviewing cmakelists.txt changes that introduce new compression library dep...Applied to files:
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, validation and exception throwing are unnecessary in ...Applied to files:
📚 Learning: in `components/core/src/clp/bufferedfilereader.cpp`, refactoring the nested error handling condition...Applied to files:
📚 Learning: in the clp project, antlr code generation at build time is being removed by another pr. when reviewi...Applied to files:
📚 Learning: in 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floa...Applied to files:
📚 Learning: in the clp codebase, standard headers like `` and `` in alphabetical order (as seen...Applied to files:
📚 Learning: in the `queryhandlerimpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is...Applied to files:
📚 Learning: in the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible f...Applied to files:
📚 Learning: in the clp project, c++20 is used, allowing the utilization of c++20 features such as class template...Applied to files:
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, within the `get_archive_node_id` method, throwing a s...Applied to files:
📚 Learning: in `components/core/tests/test-error_handling.cpp` (c++), when specializing template methods of `err...Applied to files:
📚 Learning: in `clp-s.cpp`, the `run_serializer` function interleaves serialization and writing of ir files, mak...Applied to files:
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, when handling errors in `parse_from_ir`, prefer to ma...Applied to files:
📚 Learning: for the `queryhandlerimpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over ...Applied to files:
📚 Learning: in the clp search component, the `evaluate_wildcard_filter` function should return `astevaluationres...Applied to files:
📚 Learning: double parentheses in the codebase are intentional and required for clang-tidy to pass. these should...Applied to files:
📚 Learning: double parentheses in require statements (e.g., `require((expression))`) and in conditional expressi...Applied to files:
📚 Learning: in log-surgeon, the `.` character is treated as a literal non-delimiter character rather than a rege...Applied to files:
📚 Learning: the `queryhandlerimpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is as...Applied to files:
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, within the `parse_from_ir` method, encountering error...Applied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
🔇 Additional comments (9)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🔭 Outside diff range comments (3)
components/core/src/clp/clo/clo.cpp (1)
500-533: Well-structured refactoring that aligns with PR objectives.The changes correctly:
- Extract dictionaries from the archive reader
- Add implicit wildcards and clean the search string
- Use the new
GrepCore::process_raw_querywith explicit dictionary references- Calculate matching segments explicitly using lambda functions
Consider capturing the dictionaries by const reference in the lambdas to avoid potential copies:
- auto get_segments_containing_log_dict_id - = [&log_dict](logtype_dictionary_id_t logtype_id) -> std::set<segment_id_t> const& { + auto get_segments_containing_log_dict_id + = [&log_dict = std::as_const(log_dict)](logtype_dictionary_id_t logtype_id) -> std::set<segment_id_t> const& {components/core/src/clp/clg/clg.cpp (1)
215-258: Consistent refactoring that properly uses the new GrepCore interface.The changes correctly implement the new query processing pattern and use the
add_implicit_wildcardslambda for consistent wildcard handling.As with
clo.cpp, consider capturing dictionaries by const reference in the lambdas to ensure no accidental modifications:- auto get_segments_containing_log_dict_id - = [&log_dict]( - logtype_dictionary_id_t logtype_id - ) -> std::set<segment_id_t> const& { + auto get_segments_containing_log_dict_id + = [&log_dict = std::as_const(log_dict)]( + logtype_dictionary_id_t logtype_id + ) -> std::set<segment_id_t> const& {Also applies to: 511-511, 517-517
components/core/src/clp/QueryToken.hpp (1)
1-67: Well-designed QueryToken class that supports the refactoring goals.The class properly encapsulates token properties with a clean interface. The support for ambiguous tokens and type cycling aligns well with the PR objectives.
Consider adding documentation for the constructor parameters to clarify what
is_varrepresents:// Constructors + /** + * @param query_string The full query string containing this token + * @param begin_pos Start position of the token in the query string + * @param end_pos End position of the token in the query string + * @param is_var Whether this token was identified as a potential variable + */ QueryToken(std::string const& query_string, size_t begin_pos, size_t end_pos, bool is_var);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (17)
components/core/CMakeLists.txt(2 hunks)components/core/src/clp/EncodedVariableInterpreter.hpp(4 hunks)components/core/src/clp/Grep.cpp(1 hunks)components/core/src/clp/Grep.hpp(0 hunks)components/core/src/clp/GrepCore.cpp(1 hunks)components/core/src/clp/GrepCore.hpp(1 hunks)components/core/src/clp/Query.cpp(7 hunks)components/core/src/clp/Query.hpp(8 hunks)components/core/src/clp/QueryToken.cpp(1 hunks)components/core/src/clp/QueryToken.hpp(1 hunks)components/core/src/clp/StringReader.cpp(0 hunks)components/core/src/clp/clg/CMakeLists.txt(2 hunks)components/core/src/clp/clg/clg.cpp(5 hunks)components/core/src/clp/clo/CMakeLists.txt(2 hunks)components/core/src/clp/clo/clo.cpp(4 hunks)components/core/tests/test-Grep.cpp(3 hunks)taskfiles/lint.yaml(2 hunks)
💤 Files with no reviewable changes (2)
- components/core/src/clp/StringReader.cpp
- components/core/src/clp/Grep.hpp
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/tests/test-Grep.cppcomponents/core/src/clp/clo/clo.cppcomponents/core/src/clp/clg/clg.cppcomponents/core/src/clp/EncodedVariableInterpreter.hppcomponents/core/src/clp/Grep.cppcomponents/core/src/clp/GrepCore.cppcomponents/core/src/clp/GrepCore.hppcomponents/core/src/clp/QueryToken.cppcomponents/core/src/clp/QueryToken.hppcomponents/core/src/clp/Query.hppcomponents/core/src/clp/Query.cpp
🧠 Learnings (13)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
taskfiles/lint.yaml (6)
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: anlowee
PR: #925
File: taskfiles/deps/main.yaml:97-106
Timestamp: 2025-05-28T18:33:30.155Z
Learning: In the taskfiles dependency system (taskfiles/deps/main.yaml), echo commands are used to generate .cmake settings files that are consumed by the main CMake build process. These files set variables like ANTLR_RUNTIME_HEADER to point to dependency locations for use during compilation.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: anlowee
PR: #925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
components/core/CMakeLists.txt (10)
Learnt from: anlowee
PR: #925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.hpp:15-18
Timestamp: 2025-05-02T22:11:55.711Z
Learning: The team plans to systematically improve include paths by adding ${PROJECT_SOURCE_DIR}/src to CMake targets and using direct include paths instead of deep relative ones in a future PR.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-07T20:10:08.254Z
Learning: In clp-s.cpp, the run_serializer function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In clp-s.cpp, the run_serializer function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The QueryHandlerImpl class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
components/core/src/clp/clo/CMakeLists.txt (8)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: anlowee
PR: #925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.hpp:15-18
Timestamp: 2025-05-02T22:11:55.711Z
Learning: The team plans to systematically improve include paths by adding ${PROJECT_SOURCE_DIR}/src to CMake targets and using direct include paths instead of deep relative ones in a future PR.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: LinZhihao-723
PR: #558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file components/core/tests/test-ffi_KeyValuePairLogEvent.cpp, including <json/single_include/nlohmann/json.hpp> is consistent with the project's coding standards.
components/core/tests/test-Grep.cpp (8)
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: LinZhihao-723
PR: #570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (components/core/tests/test-ir_encoding_methods.cpp), exception handling for msgpack::unpack can be omitted because the Catch2 testing framework captures exceptions if they occur.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: LinZhihao-723
PR: #557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case ffi_ir_stream_serialize_schema_tree_node_id in test-ir_encoding_methods.cpp, suppressing the readability-function-cognitive-complexity warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Learnt from: LinZhihao-723
PR: #558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file components/core/tests/test-ffi_KeyValuePairLogEvent.cpp, including <json/single_include/nlohmann/json.hpp> is consistent with the project's coding standards.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In components/core/src/clp_s/JsonParser.cpp, when handling errors in parse_from_ir, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In components/core/tests/test-clp_s-end_to_end.cpp, the success of constructor.store() is verified through REQUIRE statements and subsequent comparisons.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in REQUIRE statements (e.g., REQUIRE((expression))) and in conditional expressions are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
components/core/src/clp/clo/clo.cpp (18)
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:320-324
Timestamp: 2025-05-05T01:12:18.561Z
Learning: In the CLP codebase, the m_case_sensitive_search flag is used only for actual string value comparisons during query evaluation, not for schema key name matching. Schema keys are always compared case-sensitively regardless of this flag's setting.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:378-402
Timestamp: 2025-05-07T16:56:35.687Z
Learning: In the CLP search component, the evaluate_wildcard_filter function should return AstEvaluationResult::Pruned when node_id_value_pairs is empty, not AstEvaluationResult::False. Empty node sets should be treated as "undetermined" rather than definitive non-matches.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The QueryHandlerImpl class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: LinZhihao-723
PR: #558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file components/core/tests/test-ffi_KeyValuePairLogEvent.cpp, including <json/single_include/nlohmann/json.hpp> is consistent with the project's coding standards.
Learnt from: haiqi96
PR: #619
File: components/core/src/clp/clp/decompression.cpp:313-313
Timestamp: 2024-12-05T16:32:21.507Z
Learning: In the C++ FileDecompressor implementations at components/core/src/clp/clp/FileDecompressor.cpp and components/core/src/glt/glt/FileDecompressor.cpp, the temp_output_path variable and associated logic are used to handle multiple compressed files with the same name, and should be kept. This logic is separate from the temporary output directory code removed in PR #619 and is necessary for proper file handling.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id method, throwing a string literal as an exception is acceptable practice.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id method, throwing a string literal as an exception is acceptable practice.
Learnt from: gibber9809
PR: #630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:08.691Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id function, validation and exception throwing for UTF-8 compliance of curr_node.get_key_name() are unnecessary and should be omitted.
Learnt from: gibber9809
PR: #630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:13.322Z
Learning: In components/core/src/clp_s/JsonParser.cpp, validation and exception throwing are unnecessary in the get_archive_node_id method when processing nodes, and should not be added.
Learnt from: LinZhihao-723
PR: #856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
Learnt from: LinZhihao-723
PR: #486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In components/core/tests/test-error_handling.cpp (C++), when specializing template methods of ErrorCategory via type aliases (e.g., AlwaysSuccessErrorCategory), it's acceptable to define these specializations in the current namespace, even though the primary template is in the clp::error_handling namespace, due to the use of type aliases and template instantiation.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In components/core/src/clp_s/JsonParser.cpp, when handling errors in parse_from_ir, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-clp_s-end_to_end.cpp:141-142
Timestamp: 2024-11-29T22:51:00.355Z
Learning: In 'components/core/tests/test-clp_s-end_to_end.cpp', using WIFEXITED and WEXITSTATUS with the return value of std::system is acceptable for checking the exit status of system commands, as per the C++ documentation.
Learnt from: LinZhihao-723
PR: #486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with std::array.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-07T20:10:08.254Z
Learning: In clp-s.cpp, the run_serializer function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
components/core/src/clp/clg/clg.cpp (14)
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: #630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:13.322Z
Learning: In components/core/src/clp_s/JsonParser.cpp, validation and exception throwing are unnecessary in the get_archive_node_id method when processing nodes, and should not be added.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157
Timestamp: 2025-05-02T22:27:59.347Z
Learning: In the QueryHandlerImpl.cpp file, the unique_projected_columns set (using std::string_view) is intentionally designed to only check for duplications within the local scope of the create_projected_columns_and_projection_map function. The team decided this is an acceptable use of std::string_view in a container since the referenced strings remain valid throughout the function's execution.
Learnt from: LinZhihao-723
PR: #856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
Learnt from: LinZhihao-723
PR: #486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with std::array.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id method, throwing a string literal as an exception is acceptable practice.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id method, throwing a string literal as an exception is acceptable practice.
Learnt from: LinZhihao-723
PR: #486
File: components/core/tests/test-error_handling.cpp:44-91
Timestamp: 2024-11-26T19:16:19.933Z
Learning: In components/core/tests/test-error_handling.cpp (C++), when specializing template methods of ErrorCategory via type aliases (e.g., AlwaysSuccessErrorCategory), it's acceptable to define these specializations in the current namespace, even though the primary template is in the clp::error_handling namespace, due to the use of type aliases and template instantiation.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-07T20:10:08.254Z
Learning: In clp-s.cpp, the run_serializer function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In clp-s.cpp, the run_serializer function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In components/core/src/clp_s/JsonParser.cpp, when handling errors in parse_from_ir, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
components/core/src/clp/clg/CMakeLists.txt (8)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: Bill-hbrhbr
PR: #1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: anlowee
PR: #925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: Bill-hbrhbr
PR: #1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.
Learnt from: anlowee
PR: #925
File: taskfiles/deps/main.yaml:97-106
Timestamp: 2025-05-28T18:33:30.155Z
Learning: In the taskfiles dependency system (taskfiles/deps/main.yaml), echo commands are used to generate .cmake settings files that are consumed by the main CMake build process. These files set variables like ANTLR_RUNTIME_HEADER to point to dependency locations for use during compilation.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.hpp:15-18
Timestamp: 2025-05-02T22:11:55.711Z
Learning: The team plans to systematically improve include paths by adding ${PROJECT_SOURCE_DIR}/src to CMake targets and using direct include paths instead of deep relative ones in a future PR.
components/core/src/clp/Grep.cpp (5)
Learnt from: gibber9809
PR: #955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The QueryHandlerImpl class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the QueryHandlerImpl class, reusing member variables like m_ast_dfs_stack is preferred over making methods const and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like evaluate_kv_pair_log_event.
components/core/src/clp/GrepCore.cpp (3)
Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: haiqi96
PR: #619
File: components/core/src/clp/clp/decompression.cpp:313-313
Timestamp: 2024-12-05T16:32:21.507Z
Learning: In the C++ FileDecompressor implementations at components/core/src/clp/clp/FileDecompressor.cpp and components/core/src/glt/glt/FileDecompressor.cpp, the temp_output_path variable and associated logic are used to handle multiple compressed files with the same name, and should be kept. This logic is separate from the temporary output directory code removed in PR #619 and is necessary for proper file handling.
Learnt from: LinZhihao-723
PR: #856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
components/core/src/clp/GrepCore.hpp (2)
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the QueryHandlerImpl class, reusing member variables like m_ast_dfs_stack is preferred over making methods const and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like evaluate_kv_pair_log_event.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The QueryHandlerImpl class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
components/core/src/clp/Query.hpp (7)
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the QueryHandlerImpl class, reusing member variables like m_ast_dfs_stack is preferred over making methods const and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like evaluate_kv_pair_log_event.
Learnt from: LinZhihao-723
PR: #558
File: components/core/src/clp/ffi/ir_stream/Deserializer.hpp:119-120
Timestamp: 2024-10-14T03:39:35.585Z
Learning: KeyValuePairLogEvent::create requires std::shared_ptr<SchemaTree>, so m_auto_generated_schema_tree and m_user_generated_schema_tree need to be std::shared_ptr.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157
Timestamp: 2025-05-02T22:27:59.347Z
Learning: In the QueryHandlerImpl.cpp file, the unique_projected_columns set (using std::string_view) is intentionally designed to only check for duplications within the local scope of the create_projected_columns_and_projection_map function. The team decided this is an acceptable use of std::string_view in a container since the referenced strings remain valid throughout the function's execution.
Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, the SchemaNode class uses std::unique_ptr<char[]> m_key_buf and std::string_view m_key_name to ensure that references to m_key_name remain valid even after SchemaNode is move-constructed.
Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, it's acceptable to use std::string_view as keys in m_node_map because SchemaNode's m_key_name remains valid even after move operations or reallocations, preventing dangling references.
Learnt from: jackluo923
PR: #1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like <cstdint> and <string> in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The QueryHandlerImpl class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
components/core/src/clp/Query.cpp (4)
Learnt from: LinZhihao-723
PR: #882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the QueryHandlerImpl class, reusing member variables like m_ast_dfs_stack is preferred over making methods const and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like evaluate_kv_pair_log_event.
Learnt from: LinZhihao-723
PR: #873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157
Timestamp: 2025-05-02T22:27:59.347Z
Learning: In the QueryHandlerImpl.cpp file, the unique_projected_columns set (using std::string_view) is intentionally designed to only check for duplications within the local scope of the create_projected_columns_and_projection_map function. The team decided this is an acceptable use of std::string_view in a container since the referenced strings remain valid throughout the function's execution.
Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, it's acceptable to use std::string_view as keys in m_node_map because SchemaNode's m_key_name remains valid even after move operations or reallocations, preventing dangling references.
Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, the SchemaNode class uses std::unique_ptr<char[]> m_key_buf and std::string_view m_key_name to ensure that references to m_key_name remain valid even after SchemaNode is move-constructed.
🧬 Code Graph Analysis (6)
components/core/src/clp/clg/clg.cpp (3)
components/core/src/clp/Utils.hpp (1)
load_lexer_from_file(52-55)components/core/src/clp/Query.hpp (3)
Query(167-171)get_segments_containing_log_dict_id(112-117)get_segments_containing_log_dict_id(219-224)components/core/src/clp/GrepCore.hpp (3)
process_raw_query(45-54)process_raw_query(148-265)process_raw_query(148-157)
components/core/src/clp/GrepCore.cpp (2)
components/core/src/clp/QueryToken.cpp (4)
is_wildcard(145-147)is_wildcard(145-145)is_var(135-143)is_var(135-135)components/core/src/clp/GrepCore.hpp (2)
value(65-70)value(82-88)
components/core/src/clp/GrepCore.hpp (3)
components/core/src/clp/QueryToken.cpp (4)
is_var(135-143)is_var(135-135)is_wildcard(145-147)is_wildcard(145-145)components/core/src/clp/GrepCore.cpp (4)
get_bounds_of_next_potential_var(33-151)get_bounds_of_next_potential_var(33-38)get_bounds_of_next_potential_var(153-272)get_bounds_of_next_potential_var(153-159)components/core/src/clp_s/search/clp_search/Grep.cpp (1)
clp_s(17-630)
components/core/src/clp/QueryToken.cpp (2)
components/core/src/clp/QueryToken.hpp (1)
QueryToken(13-13)components/core/src/clp_s/SchemaTree.hpp (1)
m_type(92-92)
components/core/src/clp/QueryToken.hpp (1)
components/core/src/clp/QueryToken.cpp (2)
is_var(135-143)is_var(135-135)
components/core/src/clp/Query.hpp (2)
components/core/src/clp_s/search/clp_search/Query.hpp (11)
precise_dict_var(82-85)possible_dict_vars(92-95)m_is_precise_var(39-39)m_is_dict_var(41-41)m_var_dict_entry(43-43)m_possible_var_dict_entries(45-48)m_wildcard_match_required(113-113)m_possible_logtype_ids(115-115)vars(141-141)m_vars(121-121)m_vars(123-123)components/core/src/glt/Query.hpp (15)
precise_dict_var(85-88)possible_dict_vars(95-98)segment_ids(40-40)m_is_precise_var(42-42)m_is_dict_var(44-44)m_var_dict_entry(46-46)m_possible_var_dict_entries(48-51)m_wildcard_match_required(116-116)m_wildcard_match_required(246-246)m_possible_logtype_ids(118-118)m_possible_logtype_ids(124-126)vars(148-148)vars(244-244)m_vars(128-128)m_vars(130-130)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (15)
components/core/src/clp/clg/CMakeLists.txt (1)
36-37: Potential duplicate compilation units – verify thatGrepandGrepCorecoexist cleanly
Grep.cpp/Grep.hppremain in the source list while their functional successor (GrepCore.*) is now compiled into the same target. If any non-inline symbols were moved (rather than renamed) fromGrepintoGrepCore, you may get duplicate‐definition link errors or, worse, ODR violations that slip past the linker.Please confirm that
- the old class still provides distinct symbols, or
- the obsolete objects are no longer needed and can be removed from
CLG_SOURCES.Failing to do so can yield hard-to-trace runtime issues.
Also applies to: 64-65
components/core/src/clp/clo/CMakeLists.txt (1)
37-38: Consistency of include paths
QueryToken.*is added only underCLO_SOURCES, whereasGrepCore.*is added above. Keeping the new tokenisation logic co-located with its consumer makes sense, but ensure the public header is reachable through the target’starget_include_directories; otherwise downstream libraries (tests, utils) will need relative includes.Also applies to: 62-63
components/core/tests/test-Grep.cpp (1)
90-91: Mixed overload usage may hide test gapsOne call to
get_bounds_of_next_potential_varomits thelexerargument while all previous ones include it. If this overload behaves differently (e.g., falls back to a default wildcard policy) the test no longer exercises the lexer-aware branch for the final token; confirm this is intentional.components/core/src/clp/clo/clo.cpp (2)
4-4: LGTM! Required includes for the refactoring.The added includes support the new functionality:
<set>for segment ID collectionsstring_utilsfor wildcard string cleaningGrepCore.hppfor the refactored query processing logicAlso applies to: 10-10, 16-16
41-41: LGTM! Using statements support the refactored code.The added using statements properly import the new types and functions needed for the refactoring.
Also applies to: 44-44, 47-47, 51-51, 53-53
components/core/src/clp/clg/clg.cpp (2)
5-5: LGTM! Consistent with the refactoring pattern.The include and using statement additions mirror those in
clo.cppand properly support the refactored query processing.Also applies to: 9-9, 15-15, 31-31, 33-33, 36-36, 41-41, 43-43
504-507: Good encapsulation of wildcard addition logic.The lambda function cleanly encapsulates the wildcard wrapping and cleaning logic, improving code readability.
components/core/src/clp/Grep.cpp (1)
1-246: Clean extraction of query processing logic to GrepCore.The removal of query tokenization and processing logic from this file aligns perfectly with the PR objectives. The remaining search and output functionality is appropriately separated.
components/core/src/clp/EncodedVariableInterpreter.hpp (2)
196-196: Good simplification using dictionary IDs instead of pointers.The removal of the
VariableDictionaryEntryTypetemplate parameter and the switch to using dictionary IDs aligns well with the PR's decoupling objectives.Also applies to: 437-437, 468-469, 473-480
502-509: Consistent refactoring in wildcard search method.The changes properly track dictionary IDs for wildcard matches, maintaining consistency with the
encode_and_search_dictionarymethod.components/core/src/clp/GrepCore.cpp (1)
46-47: Correctly follows coding guidelines for boolean expressions.The code properly uses
false == <expression>pattern instead of!<expression>throughout the method.Also applies to: 58-58, 72-72, 144-144
components/core/src/clp/QueryToken.cpp (1)
91-169: Well-structured implementation of QueryToken methods.The getter and type-checking methods are clean and consistently handle ambiguous tokens. The change_to_next_possible_type method correctly cycles through possible types.
components/core/src/clp/GrepCore.hpp (1)
281-281: Correctly follows coding guidelines for boolean expressions.The code properly uses
false == <expression>pattern for negation checks.Also applies to: 288-288, 308-308
components/core/src/clp/Query.hpp (1)
21-21: Good refactoring from pointers to dictionary IDs.The consistent change from dictionary entry pointers to IDs improves decoupling and modularity. The addition of function parameters for external segment lookups is a clean design choice. Default member initializers enhance safety.
Also applies to: 24-24, 42-44, 50-50, 52-52, 58-59, 63-63, 67-67, 88-88, 97-97, 103-103, 113-117
components/core/src/clp/Query.cpp (1)
191-203: Clean implementation of segment ID calculation.The method correctly delegates to each sub-query's calculate_ids_of_matching_segments method, maintaining good separation of concerns.
| src/clp/GrepCore.cpp | ||
| src/clp/GrepCore.hpp | ||
| src/clp/hash_utils.cpp | ||
| src/clp/hash_utils.hpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Unit-test build lists growing unmanageably
The manual source lists have grown past 700 entries. Every new pair of files must be added in three places (clg, clo, and unit tests). Consider moving to:
target_sources(unitTest PRIVATE
$<TARGET_OBJECTS:clp_core_objects>)
or a file(GLOB ...) guarded behind an opt-in cache flag. This keeps explicit listing for production targets while sparing tests from constant churn.
Also applies to: 632-633
🤖 Prompt for AI Agents
In components/core/CMakeLists.txt around lines 582 to 585, the unit-test source
file lists are manually maintained and have grown excessively large, requiring
updates in multiple places. To fix this, replace the explicit file listings for
the unit tests with a target_sources call that references the object files from
the clp_core_objects target, or alternatively use a file(GLOB ...) command
controlled by an opt-in cache flag. This change will reduce maintenance overhead
by avoiding repeated manual updates while keeping production target listings
explicit. Apply the same approach to lines 632-633 as well.
| // TODO: creating a string reader, setting it equal to a string, to read it into | ||
| // the ParserInputBuffer, seems like a convoluted way to set a string equal to a | ||
| // string, should be improved when adding a SearchParser to log_surgeon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Address the TODO comment about improving string handling.
The comment indicates that the current approach of creating a StringReader and ParserInputBuffer is convoluted and should be improved when adding a SearchParser to log_surgeon.
Would you like me to create an issue to track this technical debt for future improvement?
🤖 Prompt for AI Agents
In components/core/src/clp/GrepCore.cpp around lines 244 to 246, the current
code uses a StringReader and ParserInputBuffer to set a string equal to another
string, which is unnecessarily complex. To fix this, simplify the string
handling by directly assigning or copying the string without involving these
intermediate objects. Since this is a known technical debt, create a clear issue
or task to track this improvement for when the SearchParser is added to
log_surgeon, and reference that issue in the TODO comment.
|
|
||
| namespace clp { | ||
| QueryToken::QueryToken( | ||
| std::string const& query_string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can use using std::string in this file since std::string is used multiple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
components/core/src/clp/GrepCore.cpp (2)
228-228: Fix copy-paste error in suffix wildcard check.Line 228 incorrectly uses
begin_posinstead ofend_pos - 1when checking for suffix wildcards.- bool has_suffix_wildcard = ('*' == value[end_pos - 1]) || ('?' == value[begin_pos]); + bool has_suffix_wildcard = ('*' == value[end_pos - 1]) || ('?' == value[end_pos - 1]);
244-246: Address the TODO comment about improving string handling.The comment indicates that the current approach of creating a StringReader and ParserInputBuffer is convoluted and should be improved when adding a SearchParser to log_surgeon.
Would you like me to create an issue to track this technical debt for future improvement?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp/GrepCore.cpp(1 hunks)components/core/src/clp/QueryToken.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/GrepCore.cppcomponents/core/src/clp/QueryToken.cpp
🧠 Learnings (16)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
📚 Learning: in `components/core/src/clp/bufferedfilereader.cpp`, refactoring the nested error handling condition...
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
components/core/src/clp/GrepCore.cppcomponents/core/src/clp/QueryToken.cpp
📚 Learning: in the c++ `filedecompressor` implementations at `components/core/src/clp/clp/filedecompressor.cpp` ...
Learnt from: haiqi96
PR: y-scope/clp#619
File: components/core/src/clp/clp/decompression.cpp:313-313
Timestamp: 2024-12-05T16:32:21.507Z
Learning: In the C++ `FileDecompressor` implementations at `components/core/src/clp/clp/FileDecompressor.cpp` and `components/core/src/glt/glt/FileDecompressor.cpp`, the `temp_output_path` variable and associated logic are used to handle multiple compressed files with the same name, and should be kept. This logic is separate from the temporary output directory code removed in PR #619 and is necessary for proper file handling.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in the clp search component, the `evaluate_wildcard_filter` function should return `astevaluationres...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:378-402
Timestamp: 2025-05-07T16:56:35.687Z
Learning: In the CLP search component, the `evaluate_wildcard_filter` function should return `AstEvaluationResult::Pruned` when `node_id_value_pairs` is empty, not `AstEvaluationResult::False`. Empty node sets should be treated as "undetermined" rather than definitive non-matches.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, when handling errors in `parse_from_ir`, prefer to ma...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, when handling errors in `parse_from_ir`, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, within the `get_archive_node_id` function, validation...
Learnt from: gibber9809
PR: y-scope/clp#630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:08.691Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` function, validation and exception throwing for UTF-8 compliance of `curr_node.get_key_name()` are unnecessary and should be omitted.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: double parentheses in the codebase are intentional and required for clang-tidy to pass. these should...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in the codebase are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp...
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in the `network_reader_with_valid_http_header_kv_pairs` test case in `components/core/tests/test-net...
Learnt from: LinZhihao-723
PR: y-scope/clp#593
File: components/core/tests/test-NetworkReader.cpp:216-219
Timestamp: 2024-11-15T03:15:45.919Z
Learning: In the `network_reader_with_valid_http_header_kv_pairs` test case in `components/core/tests/test-NetworkReader.cpp`, additional error handling for JSON parsing failures is not necessary, as the current error message is considered sufficient.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: user haiqi96 requested creating a github issue to document a bug fix from pr #1136, which addressed ...
Learnt from: haiqi96
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-07-29T14:04:13.769Z
Learning: User haiqi96 requested creating a GitHub issue to document a bug fix from PR #1136, which addressed MySQL compatibility issues with invalid SQL CAST operations in the WebUI component.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: in the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible f...
Learnt from: LinZhihao-723
PR: y-scope/clp#856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
Applied to files:
components/core/src/clp/GrepCore.cpp
📚 Learning: when working with constexpr strings (string literals with static storage duration), std::string_view...
Learnt from: davemarco
PR: y-scope/clp#700
File: components/core/src/clp/streaming_archive/ArchiveMetadata.hpp:153-155
Timestamp: 2025-01-30T19:26:33.869Z
Learning: When working with constexpr strings (string literals with static storage duration), std::string_view is the preferred choice for member variables as it's more efficient and safe, avoiding unnecessary memory allocations.
Applied to files:
components/core/src/clp/QueryToken.cpp
📚 Learning: in the file `components/core/tests/test-ffi_keyvaluepairlogevent.cpp`, including `
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Applied to files:
components/core/src/clp/QueryToken.cpp
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
components/core/src/clp/QueryToken.cpp📚 Learning: member variables in c++ classes should be explicitly initialized in the constructor to prevent undef...
Learnt from: haiqi96
PR: y-scope/clp#646
File: components/core/src/clp/streaming_archive/writer/Archive.hpp:354-354
Timestamp: 2025-01-14T16:06:54.692Z
Learning: Member variables in C++ classes should be explicitly initialized in the constructor to prevent undefined behavior, as demonstrated in the Archive class where `m_use_single_file_archive` is initialized to `false`.
Applied to files:
components/core/src/clp/QueryToken.cpp
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, within the `get_archive_node_id` method, throwing a s...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` method, throwing a string literal as an exception is acceptable practice.
Applied to files:
components/core/src/clp/QueryToken.cpp
📚 Learning: only trivially destructible global objects are allowed in the coding standard, so avoid suggesting t...
Learnt from: LinZhihao-723
PR: y-scope/clp#544
File: components/core/src/clp/ffi/ir_stream/ir_unit_deserialization_methods.cpp:170-186
Timestamp: 2024-09-24T22:33:59.488Z
Learning: Only trivially destructible global objects are allowed in the coding standard, so avoid suggesting the use of non-trivially destructible global objects like `static std::unordered_map`.
Applied to files:
components/core/src/clp/QueryToken.cpp
🧬 Code Graph Analysis (1)
components/core/src/clp/QueryToken.cpp (2)
components/core/src/clp/QueryToken.hpp (1)
QueryToken(13-13)components/core/src/clp_s/SchemaTree.hpp (1)
m_type(92-92)
🔇 Additional comments (4)
components/core/src/clp/GrepCore.cpp (2)
1-20: Well-organized includes and declarations.The includes are properly grouped and the using declarations improve code readability.
33-151: Solid implementation of variable boundary detection.The method correctly handles escape characters, wildcards, and special cases like hex values and equals-sign prefixed tokens. The logic flow is clear and maintains proper state throughout.
components/core/src/clp/QueryToken.cpp (2)
1-9: Clean and minimal includes.The headers are appropriate and the
using std::stringdeclaration improves readability as suggested.
93-171: Well-structured getter methods with proper const-correctness.The getter methods are appropriately const-qualified and handle ambiguous token types consistently. The
change_to_next_possible_typemethod correctly implements circular iteration through possible types.
| class SearchToken : public log_surgeon::Token { | ||
| public: | ||
| std::set<int> m_type_ids_set; | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider encapsulating the public member variable.
While acceptable for a simple wrapper in an anonymous namespace, consider making m_type_ids_set private with appropriate accessors for better encapsulation.
🤖 Prompt for AI Agents
In components/core/src/clp/GrepCore.cpp around lines 27 to 30, the member
variable m_type_ids_set in the SearchToken class is public, which reduces
encapsulation. Change m_type_ids_set to private and add public getter and setter
methods or other appropriate accessors to control access to this set, improving
encapsulation and maintaining class invariants.
| QueryToken::QueryToken( | ||
| string const& query_string, | ||
| size_t const begin_pos, | ||
| size_t const end_pos, | ||
| bool const is_var | ||
| ) | ||
| : m_current_possible_type_ix(0) { | ||
| m_begin_pos = begin_pos; | ||
| m_end_pos = end_pos; | ||
| m_value.assign(query_string, m_begin_pos, m_end_pos - m_begin_pos); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider using member initializer list for all member variables.
While m_current_possible_type_ix is initialized in the member initializer list, m_begin_pos and m_end_pos could also be initialized there for consistency and potential performance benefits.
- : m_current_possible_type_ix(0) {
- m_begin_pos = begin_pos;
- m_end_pos = end_pos;
+ : m_current_possible_type_ix(0),
+ m_begin_pos(begin_pos),
+ m_end_pos(end_pos) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| QueryToken::QueryToken( | |
| string const& query_string, | |
| size_t const begin_pos, | |
| size_t const end_pos, | |
| bool const is_var | |
| ) | |
| : m_current_possible_type_ix(0) { | |
| m_begin_pos = begin_pos; | |
| m_end_pos = end_pos; | |
| m_value.assign(query_string, m_begin_pos, m_end_pos - m_begin_pos); | |
| QueryToken::QueryToken( | |
| string const& query_string, | |
| size_t const begin_pos, | |
| size_t const end_pos, | |
| bool const is_var | |
| ) | |
| : m_current_possible_type_ix(0), | |
| m_begin_pos(begin_pos), | |
| m_end_pos(end_pos) { | |
| m_value.assign(query_string, m_begin_pos, m_end_pos - m_begin_pos); | |
| // ... rest of constructor ... | |
| } |
🤖 Prompt for AI Agents
In components/core/src/clp/QueryToken.cpp around lines 11 to 21, the constructor
initializes m_begin_pos and m_end_pos inside the constructor body instead of the
member initializer list. Move the initialization of m_begin_pos and m_end_pos to
the member initializer list along with m_current_possible_type_ix for
consistency and better performance.
components/core/src/clp/Query.hpp
Outdated
| encoded_variable_t m_precise_var{}; | ||
| // Only used if the precise variable is a dictionary variable | ||
| VariableDictionaryEntry const* m_var_dict_entry; | ||
| variable_dictionary_id_t m_var_dict_entry{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we don't store pointer to entries anymore, how about renaming them tom something like m_var_dict_id
|
I feel that moving segment index calculation outside the procress_raw_query is not the best design, especially know that developer needs to explicitly call calculate_ids_of_matching_segments, but I see it is because we want to refactor only the common logic into GrepCore. In addition, I believe that our long-term goal is to drop segment support in CLP as well, so l would say the change proposed in this PR is fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/core/src/clp/Query.cpp(6 hunks)components/core/src/clp/Query.hpp(7 hunks)taskfiles/lint.yaml(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
🧠 Learnings (24)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
Learnt from: gibber9809
PR: y-scope/clp#504
File: components/core/src/clp_s/search/kql/CMakeLists.txt:29-29
Timestamp: 2024-10-22T15:36:04.655Z
Learning: When reviewing pull requests, focus on the changes within the PR and avoid commenting on issues outside the scope of the PR.
📚 Learning: in the clp project, when reviewing cmakelists.txt changes that introduce new compression library dep...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp project, the .clang-format file is maintained in the yscope-dev-utils submodule and copie...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp project, antlr code generation at build time is being removed by another pr. when reviewi...
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in cmakelists.txt files, anlowee prefers to explicitly list source files one by one rather than usin...
Learnt from: anlowee
PR: y-scope/clp#925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
Applied to files:
taskfiles/lint.yaml
📚 Learning: antlr-generated files under `antlr_generated/` directories should not be subject to coding style che...
Learnt from: anlowee
PR: y-scope/clp#925
File: components/core/src/clp_s/search/kql/antlr_generated/KqlParser.cpp:0-0
Timestamp: 2025-05-27T20:06:21.263Z
Learning: ANTLR-generated files under `antlr_generated/` directories should not be subject to coding style checks or manual review comments, as they are auto-generated code that would be overwritten on regeneration.
Applied to files:
taskfiles/lint.yaml
📚 Learning: the team plans to systematically improve include paths by adding `${project_source_dir}/src` to cmak...
Learnt from: LinZhihao-723
PR: y-scope/clp#873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.hpp:15-18
Timestamp: 2025-05-02T22:11:55.711Z
Learning: The team plans to systematically improve include paths by adding `${PROJECT_SOURCE_DIR}/src` to CMake targets and using direct include paths instead of deep relative ones in a future PR.
Applied to files:
taskfiles/lint.yaml
📚 Learning: double parentheses in the codebase are intentional and required for clang-tidy to pass. these should...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in the codebase are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
taskfiles/lint.yaml
📚 Learning: double parentheses in require statements (e.g., `require((expression))`) and in conditional expressi...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in REQUIRE statements (e.g., `REQUIRE((expression))`) and in conditional expressions are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in clp installation scripts, consistency across platform scripts is prioritized over defensive progr...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp search component, the `evaluate_wildcard_filter` function should return `astevaluationres...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:378-402
Timestamp: 2025-05-07T16:56:35.687Z
Learning: In the CLP search component, the `evaluate_wildcard_filter` function should return `AstEvaluationResult::Pruned` when `node_id_value_pairs` is empty, not `AstEvaluationResult::False`. Empty node sets should be treated as "undetermined" rather than definitive non-matches.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp codebase, standard headers like `` and `` in alphabetical order (as seen...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Applied to files:
taskfiles/lint.yamlcomponents/core/src/clp/Query.hpp
📚 Learning: in 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floa...
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in `components/core/src/clp/bufferedfilereader.cpp`, refactoring the nested error handling condition...
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, when handling errors in `parse_from_ir`, prefer to ma...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, when handling errors in `parse_from_ir`, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in `clp-s.cpp`, the `run_serializer` function interleaves serialization and writing of ir files, mak...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In `clp-s.cpp`, the `run_serializer` function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp project, c++20 is used, allowing the utilization of c++20 features such as class template...
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.
Applied to files:
taskfiles/lint.yaml
📚 Learning: for the `queryhandlerimpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over ...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the `QueryHandlerImpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over making methods `const` and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like `evaluate_kv_pair_log_event`.
Applied to files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
📚 Learning: `keyvaluepairlogevent::create` requires `std::shared_ptr`, so `m_auto_generated_schema_t...
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/src/clp/ffi/ir_stream/Deserializer.hpp:119-120
Timestamp: 2024-10-14T03:39:35.585Z
Learning: `KeyValuePairLogEvent::create` requires `std::shared_ptr<SchemaTree>`, so `m_auto_generated_schema_tree` and `m_user_generated_schema_tree` need to be `std::shared_ptr`.
Applied to files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
📚 Learning: in the `queryhandlerimpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is...
Learnt from: LinZhihao-723
PR: y-scope/clp#873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157
Timestamp: 2025-05-02T22:27:59.347Z
Learning: In the `QueryHandlerImpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is intentionally designed to only check for duplications within the local scope of the `create_projected_columns_and_projection_map` function. The team decided this is an acceptable use of `std::string_view` in a container since the referenced strings remain valid throughout the function's execution.
Applied to files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp_s/schematree.hpp`, it's acceptable to use `std::string_view` as keys in ...
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, it's acceptable to use `std::string_view` as keys in `m_node_map` because `SchemaNode`'s `m_key_name` remains valid even after move operations or reallocations, preventing dangling references.
Applied to files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp_s/schematree.hpp`, the `schemanode` class uses `std::unique_ptr ...
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, the `SchemaNode` class uses `std::unique_ptr<char[]> m_key_buf` and `std::string_view m_key_name` to ensure that references to `m_key_name` remain valid even after `SchemaNode` is move-constructed.
Applied to files:
components/core/src/clp/Query.cppcomponents/core/src/clp/Query.hpp
📚 Learning: the variable `ir_node_to_archive_node_unordered_map` is intentionally named to reflect its type chan...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: The variable `ir_node_to_archive_node_unordered_map` is intentionally named to reflect its type change, even if the actual variable type has been altered.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: the `queryhandlerimpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is as...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The `QueryHandlerImpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
Applied to files:
components/core/src/clp/Query.hpp
🧬 Code Graph Analysis (1)
components/core/src/clp/Query.hpp (2)
components/core/src/clp_s/search/clp_search/Query.hpp (9)
precise_dict_var(82-85)possible_dict_vars(92-95)m_is_precise_var(39-39)m_is_dict_var(41-41)m_wildcard_match_required(113-113)m_possible_logtype_ids(115-115)vars(141-141)m_vars(121-121)m_vars(123-123)components/core/src/glt/Query.hpp (13)
precise_dict_var(85-88)possible_dict_vars(95-98)segment_ids(40-40)m_is_precise_var(42-42)m_is_dict_var(44-44)m_wildcard_match_required(116-116)m_wildcard_match_required(246-246)m_possible_logtype_ids(118-118)m_possible_logtype_ids(124-126)vars(148-148)vars(244-244)m_vars(128-128)m_vars(130-130)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: build (macos-15)
- GitHub Check: build-macos (macos-15, true)
🔇 Additional comments (4)
taskfiles/lint.yaml (2)
285-287: Addition keeps the new GrepCore files in step with existing Grep exclusions
clp/GrepCore.cppandclp/GrepCore.hppare now explicitly listed alongside the legacyGrep.*files. This mirrors the current policy of silencing clang-tidy on the heavy query-parsing code paths and resolves the earlier omission.
329-331: QueryToken exclusions added for consistency
clp/QueryToken.cppandclp/QueryToken.hppfollow the same exclusion pattern as other search-token classes. No further action required.components/core/src/clp/Query.cpp (2)
189-201: Well-structured addition for explicit segment calculation.The new method properly delegates segment calculation to sub-queries, aligning with the PR's goal of making segment index calculation explicit rather than implicit in query processing.
87-87: Add explicit type to the range-based for loop variable.The loop variable
entryis missing its type specification, which could lead to compilation issues or reduced code clarity.- for (auto entry : m_possible_var_dict_ids) { + for (auto const& entry : m_possible_var_dict_ids) {⛔ Skipped due to learnings
Learnt from: LinZhihao-723 PR: y-scope/clp#873 File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157 Timestamp: 2025-05-02T22:27:59.347Z Learning: In the `QueryHandlerImpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is intentionally designed to only check for duplications within the local scope of the `create_projected_columns_and_projection_map` function. The team decided this is an acceptable use of `std::string_view` in a container since the referenced strings remain valid throughout the function's execution.Learnt from: LinZhihao-723 PR: y-scope/clp#882 File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166 Timestamp: 2025-05-11T22:22:49.286Z Learning: For the `QueryHandlerImpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over making methods `const` and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like `evaluate_kv_pair_log_event`.Learnt from: haiqi96 PR: y-scope/clp#523 File: components/core/src/clp/BufferedFileReader.cpp:96-106 Timestamp: 2024-10-24T14:45:26.265Z Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we rename this to be test-GrepCore.cpp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
taskfiles/lint.yaml (1)
285-287: New sources still exempted from clang-tidy
GrepCore.*andQueryToken.*are now part of the core search logic but are added toEXCLUDE_PATTERNS, meaning they will never be statically analysed. The earlier review already raised this for the same files; the concern still stands.Also applies to: 329-331
components/core/CMakeLists.txt (1)
582-584: Manual unit-test source list keeps growingYet another trio of files (
GrepCore.*,QueryToken.*, andtest-GrepCore.cpp) is appended to the already 700-line list. This reproduces the maintenance burden highlighted in previous reviews; consider thetarget_sources(unitTest PRIVATE $<TARGET_OBJECTS:clp_core_objects>)or opt-infile(GLOB …)approach suggested earlier.Also applies to: 632-633, 723-723
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/core/CMakeLists.txt(3 hunks)components/core/tests/test-GrepCore.cpp(3 hunks)taskfiles/lint.yaml(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/tests/test-GrepCore.cpp
🧠 Learnings (27)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
Learnt from: gibber9809
PR: y-scope/clp#504
File: components/core/src/clp_s/search/kql/CMakeLists.txt:29-29
Timestamp: 2024-10-22T15:36:04.655Z
Learning: When reviewing pull requests, focus on the changes within the PR and avoid commenting on issues outside the scope of the PR.
📚 Learning: in the clp project, when reviewing cmakelists.txt changes that introduce new compression library dep...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: in the clp project, the .clang-format file is maintained in the yscope-dev-utils submodule and copie...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1126
File: .gitignore:5-5
Timestamp: 2025-07-25T21:29:48.947Z
Learning: In the CLP project, the .clang-format file is maintained in the yscope-dev-utils submodule and copied over to the main CLP repository, so it should be ignored in .gitignore to prevent accidental commits of the copied file and maintain the single source of truth in the submodule.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in cmakelists.txt files, anlowee prefers to explicitly list source files one by one rather than usin...
Learnt from: anlowee
PR: y-scope/clp#925
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-19
Timestamp: 2025-05-26T18:39:51.727Z
Learning: In CMakeLists.txt files, anlowee prefers to explicitly list source files one by one rather than using file(GLOB ...) or file(GLOB_RECURSE ...) to automatically include files, maintaining consistency with other CMake files in the project.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: in the clp project, antlr code generation at build time is being removed by another pr. when reviewi...
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: antlr-generated files under `antlr_generated/` directories should not be subject to coding style che...
Learnt from: anlowee
PR: y-scope/clp#925
File: components/core/src/clp_s/search/kql/antlr_generated/KqlParser.cpp:0-0
Timestamp: 2025-05-27T20:06:21.263Z
Learning: ANTLR-generated files under `antlr_generated/` directories should not be subject to coding style checks or manual review comments, as they are auto-generated code that would be overwritten on regeneration.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floa...
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txtcomponents/core/tests/test-GrepCore.cpp
📚 Learning: double parentheses in the codebase are intentional and required for clang-tidy to pass. these should...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in the codebase are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
taskfiles/lint.yaml
📚 Learning: double parentheses in require statements (e.g., `require((expression))`) and in conditional expressi...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in REQUIRE statements (e.g., `REQUIRE((expression))`) and in conditional expressions are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
taskfiles/lint.yamlcomponents/core/tests/test-GrepCore.cpp
📚 Learning: in clp installation scripts, consistency across platform scripts is prioritized over defensive progr...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: in the clp search component, the `evaluate_wildcard_filter` function should return `astevaluationres...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:378-402
Timestamp: 2025-05-07T16:56:35.687Z
Learning: In the CLP search component, the `evaluate_wildcard_filter` function should return `AstEvaluationResult::Pruned` when `node_id_value_pairs` is empty, not `AstEvaluationResult::False`. Empty node sets should be treated as "undetermined" rather than definitive non-matches.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the clp codebase, standard headers like `` and `` in alphabetical order (as seen...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: in `components/core/src/clp/bufferedfilereader.cpp`, refactoring the nested error handling condition...
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txtcomponents/core/tests/test-GrepCore.cpp
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, when handling errors in `parse_from_ir`, prefer to ma...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, when handling errors in `parse_from_ir`, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Applied to files:
taskfiles/lint.yamlcomponents/core/tests/test-GrepCore.cpp
📚 Learning: in `clp-s.cpp`, the `run_serializer` function interleaves serialization and writing of ir files, mak...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/clp-s.cpp:196-265
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In `clp-s.cpp`, the `run_serializer` function interleaves serialization and writing of IR files, making it difficult to restructure it into separate functions.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txt
📚 Learning: in the clp project, c++20 is used, allowing the utilization of c++20 features such as class template...
Learnt from: LinZhihao-723
PR: y-scope/clp#486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.
Applied to files:
taskfiles/lint.yaml
📚 Learning: in the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msg...
Learnt from: LinZhihao-723
PR: y-scope/clp#570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txtcomponents/core/tests/test-GrepCore.cpp
📚 Learning: in the file `components/core/tests/test-ffi_keyvaluepairlogevent.cpp`, including `
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Applied to files:
taskfiles/lint.yaml
components/core/CMakeLists.txt
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
taskfiles/lint.yamlcomponents/core/CMakeLists.txt📚 Learning: in the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp...
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
taskfiles/lint.yamlcomponents/core/CMakeLists.txtcomponents/core/tests/test-GrepCore.cpp
📚 Learning: the team plans to systematically improve include paths by adding `${project_source_dir}/src` to cmak...
Learnt from: LinZhihao-723
PR: y-scope/clp#873
File: components/core/src/clp/ffi/ir_stream/search/test/utils.hpp:15-18
Timestamp: 2025-05-02T22:11:55.711Z
Learning: The team plans to systematically improve include paths by adding `${PROJECT_SOURCE_DIR}/src` to CMake targets and using direct include paths instead of deep relative ones in a future PR.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: when reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note th...
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: in clp installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistenc...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: the `queryhandlerimpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is as...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The `QueryHandlerImpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: in `components/core/tests/test-ffi_keyvaluepairlogevent.cpp`, helper functions like `assert_kv_pair_...
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:228-236
Timestamp: 2024-10-14T03:43:40.364Z
Learning: In `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, helper functions like `assert_kv_pair_log_event_creation_failure` return booleans to the caller, and the caller asserts using `REQUIRE` to provide clearer failure information.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: the function `assert_kv_pair_log_event_creation_failure` is correctly placed within the anonymous na...
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:85-103
Timestamp: 2024-10-14T03:42:53.145Z
Learning: The function `assert_kv_pair_log_event_creation_failure` is correctly placed within the anonymous namespace in `test-ffi_KeyValuePairLogEvent.cpp`.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: in the `network_reader_with_valid_http_header_kv_pairs` test case in `components/core/tests/test-net...
Learnt from: LinZhihao-723
PR: y-scope/clp#593
File: components/core/tests/test-NetworkReader.cpp:216-219
Timestamp: 2024-11-15T03:15:45.919Z
Learning: In the `network_reader_with_valid_http_header_kv_pairs` test case in `components/core/tests/test-NetworkReader.cpp`, additional error handling for JSON parsing failures is not necessary, as the current error message is considered sufficient.
Applied to files:
components/core/CMakeLists.txt
📚 Learning: in `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verifi...
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Applied to files:
components/core/tests/test-GrepCore.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (2)
components/core/tests/test-GrepCore.cpp (1)
90-92: Verify the overload without alexerargumentAll preceding calls supply the
lexerparameter, but this one omits it:GrepCore::get_bounds_of_next_potential_var(str, begin_pos, end_pos, is_var)Confirm that
GrepCorereally provides an overload with the signature(std::string const&, size_t&, size_t&, bool&). If not, the test won’t compile.taskfiles/lint.yaml (1)
735-735: Test file exclusion is okayExcluding the Catch2 test file from clang-tidy is consistent with the rest of the configuration.
haiqi96
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did my best effort review and the change looks good as most of them copying and paste with adding template parameters.
Co-authored-by: kirkrodrigues <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
components/core/src/clp/Query.hpp (3)
50-50: Consider renaming member variables for clarity.Based on the previous review comment, since we don't store pointers to entries anymore, consider renaming
m_var_dict_idtom_var_dict_id(which is already done) and ensure consistency throughout the codebase.Also applies to: 63-63
146-147: Improve template parameter documentation.The documentation should be more descriptive about the requirements for the template parameter.
- * @tparam EncodedVariableContainerType A random access list of `clp::encoded_variable_t`. + * @tparam EncodedVariableContainerType A random access container of `clp::encoded_variable_t`.
126-126: Consider renaming for consistency.As suggested in the previous review, rename
m_possible_logtype_idstom_possible_logtypesto maintain consistency.components/core/src/clp/GrepCore.hpp (1)
29-30: Use consistent documentation style.Change "NOTE:" to maintain consistency with other documentation markers like "TODO:".
- * NOTE: Callers are responsible for ensuring that the search string is "clean", where clean is + * NOTE: Callers are responsible for ensuring that the search string is "clean", where clean is
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp/GrepCore.hpp(1 hunks)components/core/src/clp/Query.hpp(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp/GrepCore.hppcomponents/core/src/clp/Query.hpp
🧠 Learnings (13)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
Learnt from: gibber9809
PR: y-scope/clp#504
File: components/core/src/clp_s/search/kql/CMakeLists.txt:29-29
Timestamp: 2024-10-22T15:36:04.655Z
Learning: When reviewing pull requests, focus on the changes within the PR and avoid commenting on issues outside the scope of the PR.
📚 Learning: for the `queryhandlerimpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over ...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp:164-166
Timestamp: 2025-05-11T22:22:49.286Z
Learning: For the `QueryHandlerImpl` class, reusing member variables like `m_ast_dfs_stack` is preferred over making methods `const` and moving state to function scope to avoid allocation/deallocation overhead, especially for frequently called methods like `evaluate_kv_pair_log_event`.
Applied to files:
components/core/src/clp/GrepCore.hppcomponents/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp_s/jsonparser.cpp`, within the `parse_from_ir` method, encountering error...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:735-794
Timestamp: 2024-10-07T21:35:04.362Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `parse_from_ir` method, encountering errors from `kv_log_event_result.error()` aside from `std::errc::no_message_available` and `std::errc::result_out_of_range` is anticipated behavior and does not require additional error handling or logging.
Applied to files:
components/core/src/clp/GrepCore.hpp
📚 Learning: the `queryhandlerimpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is as...
Learnt from: LinZhihao-723
PR: y-scope/clp#882
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:428-450
Timestamp: 2025-05-07T16:56:53.901Z
Learning: The `QueryHandlerImpl` class in clp/ffi/ir_stream/search is not designed to be thread-safe and is assumed to not be used across multiple threads.
Applied to files:
components/core/src/clp/GrepCore.hppcomponents/core/src/clp/Query.hpp
📚 Learning: the variable `ir_node_to_archive_node_unordered_map` is intentionally named to reflect its type chan...
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:581-627
Timestamp: 2024-10-07T21:38:35.979Z
Learning: The variable `ir_node_to_archive_node_unordered_map` is intentionally named to reflect its type change, even if the actual variable type has been altered.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serial...
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verifi...
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp...
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: double parentheses in require statements (e.g., `require((expression))`) and in conditional expressi...
Learnt from: Bill-hbrhbr
PR: y-scope/clp#0
File: :0-0
Timestamp: 2025-03-18T07:27:54.738Z
Learning: Double parentheses in REQUIRE statements (e.g., `REQUIRE((expression))`) and in conditional expressions are intentional and required for clang-tidy to pass. These should not be flagged as style issues in code reviews.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in the `queryhandlerimpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is...
Learnt from: LinZhihao-723
PR: y-scope/clp#873
File: components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp:148-157
Timestamp: 2025-05-02T22:27:59.347Z
Learning: In the `QueryHandlerImpl.cpp` file, the `unique_projected_columns` set (using `std::string_view`) is intentionally designed to only check for duplications within the local scope of the `create_projected_columns_and_projection_map` function. The team decided this is an acceptable use of `std::string_view` in a container since the referenced strings remain valid throughout the function's execution.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp_s/schematree.hpp`, the `schemanode` class uses `std::unique_ptr ...
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, the `SchemaNode` class uses `std::unique_ptr<char[]> m_key_buf` and `std::string_view m_key_name` to ensure that references to `m_key_name` remain valid even after `SchemaNode` is move-constructed.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in `components/core/src/clp_s/schematree.hpp`, it's acceptable to use `std::string_view` as keys in ...
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, it's acceptable to use `std::string_view` as keys in `m_node_map` because `SchemaNode`'s `m_key_name` remains valid even after move operations or reallocations, preventing dangling references.
Applied to files:
components/core/src/clp/Query.hpp
📚 Learning: in the clp codebase, standard headers like `` and `` in alphabetical order (as seen...
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/src/glt/SQLitePreparedStatement.hpp:4-4
Timestamp: 2025-07-01T14:49:07.333Z
Learning: In the CLP codebase, standard headers like `<cstdint>` and `<string>` in alphabetical order (as seen in components/core/src/glt/SQLitePreparedStatement.hpp) are correctly ordered and do not need adjustment.
Applied to files:
components/core/src/clp/Query.hpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: lint-check (macos-15)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-15, true)
🔇 Additional comments (7)
components/core/src/clp/Query.hpp (4)
4-4: LGTM!The addition of
<functional>header is necessary for the newstd::functionparameters in the refactored methods.
40-44: LGTM! Well-designed interface change.The use of
std::functionto retrieve segments containing dictionary IDs is a clean abstraction that decouples the Query logic from the dictionary implementation details.
213-225: LGTM! Clean separation of concerns.The new
calculate_ids_of_matching_segmentsmethod properly separates the segment calculation from query processing, allowing callers to opt-in to this computation as needed.
240-264: LGTM! Correct subsequence matching implementation.The templated
matches_varsmethod correctly implements subsequence matching, allowing variables to appear in order but not necessarily contiguously. The algorithm is efficient with O(n) complexity.components/core/src/clp/GrepCore.hpp (3)
1-21: LGTM! Well-organized header structure.The includes are properly organized and all necessary dependencies are included for the GrepCore functionality.
342-356: LGTM! Thorough escape handling implementation.The escape handler lambda correctly handles the complex cases of escaping wildcards and variable placeholders. The logic properly avoids double-escaping already escaped wildcards.
408-418: LGTM! Efficient logtype ID collection.The code properly converts logtype entries to their IDs for storage in the sub-query. This aligns well with the overall refactoring goal of using IDs instead of pointers.
| std::replace( | ||
| search_string_for_sub_queries.begin(), | ||
| search_string_for_sub_queries.end(), | ||
| '?', | ||
| '*' | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider edge case handling for wildcard replacement.
The replacement of '?' with '*' wildcards and subsequent cleanup could potentially create unexpected patterns. For example, "a?b" becomes "a**b" then "ab", which changes the semantics slightly. Consider documenting this behaviour or handling it more explicitly.
🤖 Prompt for AI Agents
In components/core/src/clp/GrepCore.hpp around lines 168 to 173, the code
replaces '?' with '*' in search strings, which can lead to unintended patterns
like consecutive '*' characters altering the original semantics. To fix this,
explicitly handle or document the behavior of such replacements, for example by
collapsing multiple '*' into a single '*' after replacement or by adding
comments explaining this transformation and its impact on pattern matching.
kirkrodrigues
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the body of the PR going to be? (Usually I put that as the first lines of the PR body (before the description).
I updated the PR description with what I'd put as the PR body @kirkrodrigues. |
|
Edited the PR body directly (you can check the edit history for details). |
Grep::process_raw_queryand associated utilities into a newGrepCoreclass.GrepCore.GrepCore::process_raw_queryflow.GrepCore::process_raw_queryexplicitly specify all wildcards in the query string.GrepCore::process_raw_queryclean up the query string.Description
This PR is the second in a series of PRs (see #1138 for the first) that is attempting to bring the core CLP parsing and search code to a state where it can be shared by both clp and clp-s. The full end to end integration can be seen here.
This PR focuses on splitting up
Grepinto the parts that will be shared (GrepCore) and the parts specific to CLP (left asGrep), and making the parts that will be shared independent of specific CLP implementations. The diff size is large since code is moved around, but the actual amount of changes is fairly small. In particular this PR:Moves
process_raw_queryand associated utility functions toGrepCoreModifies functions in
GrepCoreto use templates in order to remove dependencies on clp specific dictionary implementations -- this will be augmented with concepts in a follow-up PR in order to guarantee that the dictionaries follow a certain interface.Moves the
QueryTokenclass into its own filesQueryToken.{cpp,hpp}Refactors
QueryandSubqueryclasses to work in terms of dictionary ids (and segment ids) instead of pointers to dictionary entries in order to remove dependence on clp-specific dictionary implementations.Refactors
process_raw_queryflow in a few waysprocess_raw_query-- now callers that are interested in finding matches based on the segment index must callcalculate_ids_of_matching_segmentson the resultingQueryobject fromprocess_raw_query. This simplifies theprocess_raw_queryflow and allows callers that don't care about the segment index (clp-s) to not have to worry about the segment index calculation flow.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Chores