gibber9809 commented Jun 17, 2025

Description

This PR adds support for delta-encoding integers and uses that support to delta-encode the log_event_idx column. This leads to a significant improvement in compression ratio when log event ordering is enabled for some datasets.
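
As a rough illustration of the idea, delta encoding stores each integer as its difference from the previous one, so a mostly increasing column such as `log_event_idx` turns into a run of small, repetitive values that a general-purpose compressor handles well. The sketch below is a minimal standalone version; `delta_encode` and `delta_decode` are hypothetical helper names, not the PR's `DeltaEncodedInt64ColumnWriter`/`DeltaEncodedInt64ColumnReader`, and it assumes the differences do not overflow `int64_t`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): the first output element is the first
// value itself (its delta from 0); every later element is the difference from
// the previous value. Assumes the differences fit in int64_t.
std::vector<int64_t> delta_encode(std::vector<int64_t> const& values) {
    std::vector<int64_t> deltas;
    deltas.reserve(values.size());
    int64_t prev{0};
    for (auto const value : values) {
        deltas.push_back(value - prev);
        prev = value;
    }
    return deltas;
}

// Hypothetical helper (not from the PR): a running sum over the deltas
// reconstructs the original values.
std::vector<int64_t> delta_decode(std::vector<int64_t> const& deltas) {
    std::vector<int64_t> values;
    values.reserve(deltas.size());
    int64_t cur{0};
    for (auto const delta : deltas) {
        cur += delta;
        values.push_back(cur);
    }
    return values;
}
```

For example, the values `100, 101, 102, 105, 103` would be stored as `100, 1, 1, 3, -2`.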

The following table shows the compression ratio improvements for the open source datasets, with the compression ratio without log ordering as a baseline:

| dataset | old compression ratio | delta-encoding compression ratio | disable-log-order compression ratio | % improvement disable-log-order over delta-encoding |
|---------------|---------|---------|---------|--------|
| cockroach     | 27.120  | 28.312  | 28.361  | 0.17%  |
| mongodb       | 122.284 | 170.949 | 190.778 | 11.60% |
| elasticsearch | 125.428 | 152.630 | 159.006 | 4.18%  |
| spark         | 56.084  | 57.448  | 57.832  | 0.67%  |
| postgresql    | 36.523  | 39.124  | 40.564  | 3.68%  |

Compression speed and ordered decompression speed do not appear to change significantly compared to when delta-encoded log ordering is not used.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Benchmarked compression ratio improvements, compression speed differences, and ordered decompression speed differences
  • Manually validated decompression order for small dataset
  • Added test to check that delta-encoded log order column works as expected for forwards and backwards seeks on that column
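
Forward and backward seeks matter because a delta-encoded column cannot be indexed directly: the value at a position depends on every delta before it. One way to support both directions, sketched below under stated assumptions, is to keep a cursor and add deltas when moving forward or subtract them when moving backward. `DeltaCursor` and `get_value_at` are hypothetical names for illustration; the PR's `DeltaEncodedInt64ColumnReader` may work differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical cursor over a delta-encoded column (not the PR's reader class).
class DeltaCursor {
public:
    // Assumes `deltas` is non-empty and that deltas[0] holds the first value itself.
    explicit DeltaCursor(std::vector<int64_t> deltas)
            : m_deltas{std::move(deltas)},
              m_cur_value{m_deltas.at(0)} {}

    // Returns the decoded value at `idx`, walking from the last visited position.
    int64_t get_value_at(size_t idx) {
        if (idx >= m_deltas.size()) {
            throw std::out_of_range("index past end of column");
        }
        // Forward seek: accumulate the deltas between the cursor and the target.
        while (m_cur_idx < idx) {
            ++m_cur_idx;
            m_cur_value += m_deltas[m_cur_idx];
        }
        // Backward seek: undo the deltas between the cursor and the target.
        while (m_cur_idx > idx) {
            m_cur_value -= m_deltas[m_cur_idx];
            --m_cur_idx;
        }
        return m_cur_value;
    }

private:
    std::vector<int64_t> m_deltas;
    size_t m_cur_idx{0};
    int64_t m_cur_value;
};
```

With the deltas `100, 1, 1, 3, -2`, seeking to index 3 and then back to index 0 would yield 105 and then 100, which is the kind of behaviour a seek test can check against the undecoded input.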

Summary by CodeRabbit

  • New Features
    • Added support for delta-encoded integer columns, enabling more efficient storage and retrieval of log event index data.
  • Bug Fixes
    • Improved compatibility for log event index handling by broadening accepted column types.
  • Tests
    • Introduced new tests to validate delta encoding and log event index ordering.
    • Added a simple test log file for validation purposes.
  • Chores
    • Updated archive format patch version to 2.

coderabbitai bot commented Jun 17, 2025

## Walkthrough

Support for a new `DeltaInteger` node type was added across the CLP-S archive system. This includes new delta-encoded integer column reader and writer classes, schema and archive logic updates, and test coverage. The archive format patch version was incremented. A new test and input file validate delta encoding and log event index behaviour.

## Changes

| File(s)                                                                                      | Change Summary                                                                                              |
|---------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|
| .../core/src/clp_s/SchemaTree.hpp<br>.../core/src/clp_s/SchemaTree.cpp                      | Added `DeltaInteger` to `NodeType` enum and handled its mapping to literal type.                            |
| .../core/src/clp_s/SingleFileArchiveDefs.hpp                                                | Incremented archive patch version constant from 1 to 2.                                                     |
| .../core/src/clp_s/ColumnReader.hpp<br>.../core/src/clp_s/ColumnReader.cpp                  | Added `DeltaEncodedInt64ColumnReader` class for reading delta-encoded int64 columns.                        |
| .../core/src/clp_s/ColumnWriter.hpp<br>.../core/src/clp_s/ColumnWriter.cpp                  | Added `DeltaEncodedInt64ColumnWriter` class for writing delta-encoded int64 columns.                        |
| .../core/src/clp_s/ArchiveReader.cpp                                                        | Added handling for `DeltaInteger` in column reader creation and simplified log event index marking.          |
| .../core/src/clp_s/ArchiveWriter.cpp                                                        | Added handling for `DeltaInteger` in schema writer initialization.                                          |
| .../core/src/clp_s/SchemaReader.cpp                                                         | Supported `DeltaInteger` in timestamp extraction and JSON serialization logic.                              |
| .../core/src/clp_s/SchemaReader.hpp                                                         | Broadened log event index column type from `Int64ColumnReader*` to `BaseColumnReader*`.                     |
| .../core/src/clp_s/JsonParser.cpp                                                           | Changed `log_event_idx_node_id` metadata type from `Integer` to `DeltaInteger`.                             |
| .../core/CMakeLists.txt                                                                     | Included new test source file for delta encoding in unit tests.                                             |
| .../core/tests/test-clp_s-delta-encode-log-order.cpp                                        | Added test for delta encoding and log event index, including custom filter and value checks.                |
| .../core/tests/test_log_files/test_simple_order.jsonl                                       | Added a simple ordered JSONL file for testing delta encoding and log event index.                           |
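
The change to `SchemaReader.hpp` in the table above broadens the log event index column from a concrete reader type to the base reader type. A rough sketch of why that helps: code that only holds a base-class reference can read the column whether it is stored plainly or delta-encoded. Everything below (`SketchColumnReader`, `PlainInt64Reader`, `DeltaInt64Reader`, `get_log_event_idx`, and the `extract_int64` method) is a hypothetical, heavily simplified stand-in and not the actual clp_s interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the role played by a base column reader.
class SketchColumnReader {
public:
    virtual ~SketchColumnReader() = default;
    virtual int64_t extract_int64(size_t idx) = 0;
};

// Hypothetical reader for a column stored as plain values.
class PlainInt64Reader : public SketchColumnReader {
public:
    explicit PlainInt64Reader(std::vector<int64_t> values) : m_values{std::move(values)} {}

    int64_t extract_int64(size_t idx) override { return m_values.at(idx); }

private:
    std::vector<int64_t> m_values;
};

// Hypothetical reader for a delta-encoded column. It re-sums from the start on
// every call to keep the sketch short; a real reader would cache its position.
class DeltaInt64Reader : public SketchColumnReader {
public:
    explicit DeltaInt64Reader(std::vector<int64_t> deltas) : m_deltas{std::move(deltas)} {}

    int64_t extract_int64(size_t idx) override {
        int64_t value{0};
        for (size_t i{0}; i <= idx; ++i) {
            value += m_deltas.at(i);
        }
        return value;
    }

private:
    std::vector<int64_t> m_deltas;
};

// Because the column is held through the base type, the caller does not need
// to know which encoding backs the log event index.
int64_t get_log_event_idx(SketchColumnReader& column, size_t message_idx) {
    return column.extract_int64(message_idx);
}
```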

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Test as Test Case
    participant ArchiveWriter as ArchiveWriter
    participant ArchiveReader as ArchiveReader
    participant SchemaReader as SchemaReader
    participant DeltaWriter as DeltaEncodedInt64ColumnWriter
    participant DeltaReader as DeltaEncodedInt64ColumnReader

    Test->>ArchiveWriter: Compress JSONL with DeltaInteger
    ArchiveWriter->>DeltaWriter: Write delta-encoded int64 values
    ArchiveWriter-->>Test: Archive created

    Test->>ArchiveReader: Open archive
    ArchiveReader->>SchemaReader: Read schema, identify DeltaInteger column
    SchemaReader->>DeltaReader: Provide delta-encoded int64 column reader

    Test->>DeltaReader: Seek and extract values at various indices
    DeltaReader-->>Test: Return decoded int64 values
```

## Possibly related PRs

## Suggested reviewers

- wraymo
- kirkrodrigues

<!-- walkthrough_end -->


---

<details>
<summary>📜 Recent review details</summary>

**Configuration used: CodeRabbit UI**
**Review profile: ASSERTIVE**
**Plan: Pro**


<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 2823fa2f65602509e096aa7562c215e878375b02 and 9444ff8feec5398abd31e79f75b4d10fafe09a3b.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `components/core/tests/test-clp_s-delta-encode-log-order.cpp` (1 hunks)

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>📓 Path-based instructions (1)</summary>

<details>
<summary>`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == <expression>` rather than `!<expression>`.</summary>


> `**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == <expression>` rather than `!<expression>`.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:
- `components/core/tests/test-clp_s-delta-encode-log-order.cpp`

</details>

</details><details>
<summary>🧠 Learnings (2)</summary>

<details>
<summary>📓 Common learnings</summary>

Learnt from: davemarco
PR: #698
File: components/core/src/clp/streaming_archive/Constants.hpp:9-9
Timestamp: 2025-01-28T03:02:30.542Z
Learning: In the CLP project's archive version format (semver), the patch version uses 16 bits (uint16_t), while major version uses 8 bits and minor version uses 16 bits.


</details>
<details>
<summary>components/core/tests/test-clp_s-delta-encode-log-order.cpp (24)</summary>

Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.


Learnt from: AVMatthews
PR: #595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In components/core/tests/test-clp_s-end_to_end.cpp, the success of constructor.store() is verified through REQUIRE statements and subsequent comparisons.


Learnt from: LinZhihao-723
PR: #558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file components/core/tests/test-ffi_KeyValuePairLogEvent.cpp, including <json/single_include/nlohmann/json.hpp> is consistent with the project's coding standards.


Learnt from: LinZhihao-723
PR: #570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (components/core/tests/test-ir_encoding_methods.cpp), exception handling for msgpack::unpack can be omitted because the Catch2 testing framework captures exceptions if they occur.


Learnt from: LinZhihao-723
PR: #557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case ffi_ir_stream_serialize_schema_tree_node_id in test-ir_encoding_methods.cpp, suppressing the readability-function-cognitive-complexity warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.


Learnt from: Bill-hbrhbr
PR: #614
File: components/core/tests/test-StreamingCompression.cpp:45-54
Timestamp: 2024-12-04T15:50:37.827Z
Learning: In components/core/tests/test-StreamingCompression.cpp, within the compress function, compressing the same data repeatedly by passing the same src pointer without advancing is intentional to test the compressor with the same data multiple times.


Learnt from: haiqi96
PR: #619
File: components/core/src/clp/clp/decompression.cpp:313-313
Timestamp: 2024-12-05T16:32:21.507Z
Learning: In the C++ FileDecompressor implementations at components/core/src/clp/clp/FileDecompressor.cpp and components/core/src/glt/glt/FileDecompressor.cpp, the temp_output_path variable and associated logic are used to handle multiple compressed files with the same name, and should be kept. This logic is separate from the temporary output directory code removed in PR #619 and is necessary for proper file handling.


Learnt from: LinZhihao-723
PR: #558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:85-103
Timestamp: 2024-10-14T03:42:53.145Z
Learning: The function assert_kv_pair_log_event_creation_failure is correctly placed within the anonymous namespace in test-ffi_KeyValuePairLogEvent.cpp.


Learnt from: gibber9809
PR: #630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:13.322Z
Learning: In components/core/src/clp_s/JsonParser.cpp, validation and exception throwing are unnecessary in the get_archive_node_id method when processing nodes, and should not be added.


Learnt from: gibber9809
PR: #630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:08.691Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the get_archive_node_id function, validation and exception throwing for UTF-8 compliance of curr_node.get_key_name() are unnecessary and should be omitted.


Learnt from: LinZhihao-723
PR: #557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In components/core/src/clp/ffi/ir_stream/utils.hpp, the function size_dependent_encode_and_serialize_schema_tree_node_id assumes that the caller checks that node_id fits within the range of encoded_node_id_t before casting.


Learnt from: LinZhihao-723
PR: #593
File: components/core/tests/test-NetworkReader.cpp:216-219
Timestamp: 2024-11-15T03:15:45.919Z
Learning: In the network_reader_with_valid_http_header_kv_pairs test case in components/core/tests/test-NetworkReader.cpp, additional error handling for JSON parsing failures is not necessary, as the current error message is considered sufficient.


Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In components/core/src/clp/BufferedFileReader.cpp, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.


Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In components/core/src/clp_s/JsonParser.cpp, when handling errors in parse_from_ir, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.


Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, the SchemaNode class uses std::unique_ptr<char[]> m_key_buf and std::string_view m_key_name to ensure that references to m_key_name remain valid even after SchemaNode is move-constructed.


Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, it's acceptable to use std::string_view as keys in m_node_map because SchemaNode's m_key_name remains valid even after move operations or reallocations, preventing dangling references.


Learnt from: gibber9809
PR: #584
File: components/core/src/clp_s/SchemaTree.hpp:40-55
Timestamp: 2024-11-12T18:56:31.067Z
Learning: In components/core/src/clp_s/SchemaTree.hpp, within the SchemaNode class, the use of std::string_view for m_key_name referencing m_key_buf is intentional to ensure that references to the key name remain valid even after move construction.


Learnt from: LinZhihao-723
PR: #554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:299-307
Timestamp: 2024-10-10T05:46:35.188Z
Learning: In the C++ function get_schema_subtree_bitmap in KeyValuePairLogEvent.cpp, when a loop uses while (true) with an internal check on optional.has_value(), and comments explain that this structure is used to silence clang-tidy warnings about unchecked optional access, this code is acceptable and should not be refactored to use while (optional.has_value()).


Learnt from: haiqi96
PR: #523
File: components/core/src/clp/BufferedFileReader.hpp:158-158
Timestamp: 2024-10-24T14:46:00.664Z
Learning: In components/core/src/clp/BufferedFileReader.hpp, the BufferedFileReader::try_read function explicitly calls BufferReader::try_read, so documentation should reference BufferReader::try_read.


Learnt from: LinZhihao-723
PR: #856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.


Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:735-794
Timestamp: 2024-10-07T21:35:04.362Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the parse_from_ir method, encountering errors from kv_log_event_result.error() aside from std::errc::no_message_available and std::errc::result_out_of_range is anticipated behavior and does not require additional error handling or logging.


Learnt from: LinZhihao-723
PR: #554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:109-111
Timestamp: 2024-10-10T15:19:52.408Z
Learning: In KeyValuePairLogEvent.cpp, for the class JsonSerializationIterator, it's acceptable to use raw pointers for member variables like m_schema_tree_node, and there's no need to replace them with references or smart pointers in this use case.


Learnt from: LinZhihao-723
PR: #554
File: components/core/src/clp/ffi/KeyValuePairLogEvent.cpp:102-102
Timestamp: 2024-10-10T15:21:14.506Z
Learning: In KeyValuePairLogEvent.cpp, the get_next_child_schema_tree_node() method in JsonSerializationIterator is always called after checking has_next_child_schema_tree_node(), ensuring proper iterator usage.


Learnt from: AVMatthews
PR: #543
File: components/core/src/clp_s/JsonParser.cpp:756-765
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In components/core/src/clp_s/JsonParser.cpp, within the parse_from_ir() function, reaching the end of log events in a given IR is not considered an error case. The errors std::errc::no_message_available and std::errc::result_out_of_range are expected signals to break the deserialization loop and proceed accordingly.


</details>

</details>

</details>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms (10)</summary>

* GitHub Check: build-macos (macos-13, true)
* GitHub Check: build-macos (macos-14, true)
* GitHub Check: ubuntu-jammy-lint
* GitHub Check: centos-stream-9-dynamic-linked-bins
* GitHub Check: ubuntu-jammy-static-linked-bins
* GitHub Check: centos-stream-9-static-linked-bins
* GitHub Check: ubuntu-jammy-dynamic-linked-bins
* GitHub Check: lint-check (ubuntu-latest)
* GitHub Check: lint-check (macos-latest)
* GitHub Check: build (macos-latest)

</details>

<details>
<summary>🔇 Additional comments (1)</summary><blockquote>

<details>
<summary>components/core/tests/test-clp_s-delta-encode-log-order.cpp (1)</summary>

`63-118`: **The test implementation correctly validates delta encoding functionality.**

The test logic is sound and properly validates the delta-encoded column reader:
- Uses parameterized testing with different start indices
- Correctly compresses and reads back archive data
- Properly identifies and casts the delta-encoded column reader
- Validates forward and backward seeks work correctly

The core functionality is well-tested and the implementation looks correct.

</details>

</blockquote></details>

</details>
<!-- tips_start -->

---

<details>
<summary>🪧 Tips</summary>

### Chat

There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai?utm_source=oss&utm_medium=github&utm_campaign=y-scope/clp&utm_content=1021):

- Review comments: Directly reply to a review comment made by CodeRabbit. Examples:
  - `I pushed a fix in commit <commit_id>, please review it.`
  - `Explain this complex logic.`
  - `Open a follow-up GitHub issue for this discussion.`
- Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples:
  - `@coderabbitai explain this code block.`
  - `@coderabbitai modularize this function.`
- PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
  - `@coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.`
  - `@coderabbitai read src/utils.ts and explain its main purpose.`
  - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.`
  - `@coderabbitai help me debug CodeRabbit configuration file.`

### Support

Need help? Create a ticket on our [support page](https://www.coderabbit.ai/contact-us/support) for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

### CodeRabbit Commands (Invoked using PR comments)

- `@coderabbitai pause` to pause the reviews on a PR.
- `@coderabbitai resume` to resume the paused reviews.
- `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
- `@coderabbitai full review` to do a full review from scratch and review all the files again.
- `@coderabbitai summary` to regenerate the summary of the PR.
- `@coderabbitai generate docstrings` to [generate docstrings](https://docs.coderabbit.ai/finishing-touches/docstrings) for this PR.
- `@coderabbitai generate sequence diagram` to generate a sequence diagram of the changes in this PR.
- `@coderabbitai resolve` resolve all the CodeRabbit review comments.
- `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository.
- `@coderabbitai help` to get help.

### Other keywords and placeholders

- Add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed.
- Add `@coderabbitai summary` to generate the high-level summary at a specific location in the PR description.
- Add `@coderabbitai` anywhere in the PR title to generate the title automatically.

### CodeRabbit Configuration File (`.coderabbit.yaml`)

- You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository.
- Please see the [configuration documentation](https://docs.coderabbit.ai/guides/configure-coderabbit) for more information.
- If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json`

### Documentation and Community

- Visit our [Documentation](https://docs.coderabbit.ai) for detailed information on how to use CodeRabbit.
- Join our [Discord Community](http://discord.gg/coderabbit) to get help, request features, and share feedback.
- Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

</details>

<!-- tips_end -->

@gibber9809 gibber9809 marked this pull request as ready for review June 19, 2025 16:10
@gibber9809 gibber9809 requested review from a team and wraymo as code owners June 19, 2025 16:10
@gibber9809 gibber9809 changed the title from "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for log_event_idx column." to "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for log_event_idx column; Remove support for disabling log order." Jun 19, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3ffa53 and 961cca6.

📒 Files selected for processing (19)
  • components/core/CMakeLists.txt (1 hunks)
  • components/core/src/clp_s/ArchiveReader.cpp (3 hunks)
  • components/core/src/clp_s/ArchiveWriter.cpp (1 hunks)
  • components/core/src/clp_s/ColumnReader.cpp (2 hunks)
  • components/core/src/clp_s/ColumnReader.hpp (1 hunks)
  • components/core/src/clp_s/ColumnWriter.cpp (1 hunks)
  • components/core/src/clp_s/ColumnWriter.hpp (1 hunks)
  • components/core/src/clp_s/CommandLineArguments.cpp (0 hunks)
  • components/core/src/clp_s/CommandLineArguments.hpp (0 hunks)
  • components/core/src/clp_s/JsonParser.cpp (4 hunks)
  • components/core/src/clp_s/JsonParser.hpp (0 hunks)
  • components/core/src/clp_s/SchemaReader.cpp (4 hunks)
  • components/core/src/clp_s/SchemaReader.hpp (2 hunks)
  • components/core/src/clp_s/SchemaTree.cpp (1 hunks)
  • components/core/src/clp_s/SchemaTree.hpp (1 hunks)
  • components/core/src/clp_s/SingleFileArchiveDefs.hpp (1 hunks)
  • components/core/src/clp_s/clp-s.cpp (0 hunks)
  • components/core/tests/test-clp_s-delta-encode-log-order.cpp (1 hunks)
  • components/core/tests/test_log_files/test_simple_order.jsonl (1 hunks)
💤 Files with no reviewable changes (4)
  • components/core/src/clp_s/clp-s.cpp
  • components/core/src/clp_s/CommandLineArguments.cpp
  • components/core/src/clp_s/CommandLineArguments.hpp
  • components/core/src/clp_s/JsonParser.hpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.

  • components/core/src/clp_s/SchemaTree.hpp
  • components/core/src/clp_s/SchemaTree.cpp
  • components/core/src/clp_s/SingleFileArchiveDefs.hpp
  • components/core/src/clp_s/ColumnWriter.cpp
  • components/core/src/clp_s/ArchiveWriter.cpp
  • components/core/src/clp_s/ArchiveReader.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/SchemaReader.cpp
  • components/core/src/clp_s/SchemaReader.hpp
  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/src/clp_s/JsonParser.cpp
  • components/core/tests/test-clp_s-delta-encode-log-order.cpp
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: conventional-commits
🔇 Additional comments (20)
components/core/tests/test_log_files/test_simple_order.jsonl (1)

1-3: Correct test fixture for ordered indices.

The JSONL file provides a simple, ordered sequence of idx values as intended for delta-encoding validation. No issues found.

components/core/src/clp_s/SchemaTree.hpp (1)

44-44: Enumeration extension follows guidelines.

DeltaInteger has been correctly inserted immediately before Unknown, preserving enum ordering constraints and ensuring backward compatibility of encoded values.
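
To illustrate why adding a member just before a sentinel keeps previously encoded values stable, here is a minimal sketch. The enum `NodeTypeSketch`, the members `Float` and `Boolean`, and all numeric values are assumptions made for illustration; the actual `NodeType` members and their values are not shown in this review.

```cpp
#include <cstdint>

// Hypothetical enum; NOT the real clp_s::NodeType. Only the names Integer,
// DeltaInteger, and Unknown appear in the review; the rest is assumed.
enum class NodeTypeSketch : uint8_t {
    Integer,         // 0 (existing member, value unchanged)
    Float,           // 1 (assumed existing member, value unchanged)
    Boolean,         // 2 (assumed existing member, value unchanged)
    DeltaInteger,    // 3 (new member takes the next free value)
    Unknown = 0xFF   // sentinel pinned to an explicit value, so additions do not shift it
};

// Returns the on-disk encoding of a node type in this sketch.
constexpr auto to_encoded(NodeTypeSketch type) -> uint8_t {
    return static_cast<uint8_t>(type);
}

// Existing members keep the values that older archives would have stored.
static_assert(0 == to_encoded(NodeTypeSketch::Integer));
static_assert(2 == to_encoded(NodeTypeSketch::Boolean));
static_assert(0xFF == to_encoded(NodeTypeSketch::Unknown));
```

Under these assumptions, inserting the new member anywhere other than directly before the sentinel would renumber the members after it and change the meaning of values already written to disk.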

components/core/src/clp_s/SchemaTree.cpp (1)

14-15: DeltaInteger mapped to Integer literal type.

The added case NodeType::DeltaInteger returning LiteralType::IntegerT is correct and consistent with integer handling.

components/core/CMakeLists.txt (1)

634-634: New unit test added to build.

Inclusion of test-clp_s-delta-encode-log-order.cpp in the unitTest target is correct, ensuring the delta-encoding behavior is verified.

components/core/src/clp_s/SingleFileArchiveDefs.hpp (1)

14-14: Archive patch version incremented.

Updating cArchivePatchVersion from 1 to 2 accurately reflects the format change required for introducing DeltaInteger.

components/core/src/clp_s/ArchiveWriter.cpp (1)

316-318: LGTM: Proper integration of delta integer support.

The addition of the DeltaInteger case follows the established pattern for handling different node types in the schema writer initialization.

components/core/src/clp_s/ColumnWriter.cpp (1)

26-29: LGTM: Storage implementation follows established pattern.

The store method correctly writes the delta-encoded values using the same approach as other column writers.

components/core/src/clp_s/ArchiveReader.cpp (3)

191-193: LGTM: Proper DeltaInteger support in column reader factory.

The addition of the DeltaInteger case follows the established pattern for creating column readers based on node type.


244-246: LGTM: Consistent DeltaInteger support for unordered columns.

The unordered column handling correctly includes the same DeltaInteger case, maintaining consistency across ordered and unordered column creation paths.


333-335: LGTM: Simplified log event index column marking.

The simplified logic for marking the log event index column is cleaner and works correctly with the generalized BaseColumnReader* type.

components/core/src/clp_s/ColumnWriter.hpp (1)

66-82: LGTM: Well-designed DeltaColumnWriter class declaration.

The class follows the established pattern for column writers with proper inheritance, virtual destructor, and appropriate private members for delta encoding functionality.

components/core/src/clp_s/SchemaReader.hpp (2)

209-211: LGTM: Proper generalization for delta encoding support.

Changing the parameter type from Int64ColumnReader* to BaseColumnReader* appropriately supports the new DeltaColumnReader while maintaining backwards compatibility.


324-324: LGTM: Consistent type generalization for log event index column.

The member variable type change aligns with the method parameter change, providing consistent support for any column reader type as the log event index column.

components/core/src/clp_s/JsonParser.cpp (1)

513-514: LGTM: Clean transition to delta-encoded log event index.

The changes correctly remove the conditional logic and consistently use NodeType::DeltaInteger for the log event index field. The implementation properly adds the log event index value to each message.

Also applies to: 583-587

components/core/src/clp_s/SchemaReader.cpp (2)

32-36: LGTM: Proper timestamp handling for delta-encoded integers.

The implementation correctly adds support for DeltaInteger in timestamp extraction, following the same pattern as the existing Integer case with appropriate casting to DeltaColumnReader.


436-436: LGTM: Consistent JSON serialization support for delta integers.

The changes properly extend JSON serialization to handle DeltaInteger nodes alongside Integer nodes, ensuring both types are treated consistently during template generation.

Also applies to: 521-521, 630-630

components/core/src/clp_s/ColumnReader.cpp (1)

19-25: LGTM: Proper initialization of delta decoding state.

The load method correctly initializes the delta decoding state by setting the current index to 0 and current value to the first element when messages are present.

components/core/src/clp_s/ColumnReader.hpp (1)

94-119: LGTM: Well-designed delta column reader class.

The DeltaColumnReader class declaration follows established patterns with proper inheritance, virtual method overrides, and appropriate private members for maintaining delta decoding state.

components/core/tests/test-clp_s-delta-encode-log-order.cpp (2)

63-113: LGTM: Comprehensive test for delta encoding functionality.

The test effectively validates the delta encoding implementation by testing archive compression/decompression, column reader type verification, and delta decoding correctness with various access patterns. The use of parametrized testing with different starting indices is excellent for ensuring robustness.


24-46: LGTM: Clean test utility for accessing column readers.

The SimpleFilterClass provides a clean way to access the underlying column readers from a SchemaReader, enabling effective testing of the delta column reader functionality.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
components/core/src/clp_s/ColumnReader.cpp (1)

27-41: Bounds checking issue still exists.

The delta decoding logic lacks proper bounds checking that could lead to buffer overflows, as previously identified.

The method still accesses m_values[m_cur_idx + 1] and m_values[m_cur_idx] without validating that these indices are within bounds. Apply the previously suggested fix to add bounds checking.
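
As a rough illustration of the concern, a seekable delta decoder can validate the requested index once, before walking the deltas forwards or backwards. This is a minimal sketch and is neither the PR's code nor the previously suggested fix: the class name `DeltaDecoderSketch`, its constructor, and `m_cur_value` are assumptions, and only `m_values`, `m_cur_idx`, and `get_value_at_idx` echo names mentioned in the review.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical decoder; the encoded layout assumed here is: first element is the
// base value, every later element is the difference from its predecessor.
class DeltaDecoderSketch {
public:
    explicit DeltaDecoderSketch(std::vector<int64_t> encoded_values)
            : m_values{std::move(encoded_values)} {
        if (false == m_values.empty()) {
            m_cur_value = m_values[0];
        }
    }

    // Returns the decoded value at `idx`, seeking from the current position.
    auto get_value_at_idx(size_t idx) -> int64_t {
        // A single check up front keeps every access below within bounds.
        if (idx >= m_values.size()) {
            throw std::out_of_range("index is past the end of the delta-encoded column");
        }
        // Forward seek: apply the delta stored at each position we step onto.
        while (m_cur_idx < idx) {
            ++m_cur_idx;
            m_cur_value += m_values[m_cur_idx];
        }
        // Backward seek: undo the delta stored at the position we step off.
        while (m_cur_idx > idx) {
            m_cur_value -= m_values[m_cur_idx];
            --m_cur_idx;
        }
        return m_cur_value;
    }

private:
    std::vector<int64_t> m_values;
    size_t m_cur_idx{0};
    int64_t m_cur_value{0};
};
```

Because `idx` is validated first and the loops only move `m_cur_idx` towards `idx`, the reads of `m_values[m_cur_idx]` stay within the vector in this sketch, including for an empty column, where every index is rejected.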

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 961cca6 and 3745c1b.

📒 Files selected for processing (8)
  • components/core/src/clp_s/ArchiveReader.cpp (3 hunks)
  • components/core/src/clp_s/ArchiveWriter.cpp (1 hunks)
  • components/core/src/clp_s/ColumnReader.cpp (2 hunks)
  • components/core/src/clp_s/ColumnReader.hpp (1 hunks)
  • components/core/src/clp_s/ColumnWriter.cpp (1 hunks)
  • components/core/src/clp_s/ColumnWriter.hpp (1 hunks)
  • components/core/src/clp_s/SchemaReader.cpp (4 hunks)
  • components/core/tests/test-clp_s-delta-encode-log-order.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.

  • components/core/src/clp_s/ArchiveWriter.cpp
  • components/core/src/clp_s/ColumnWriter.cpp
  • components/core/src/clp_s/ArchiveReader.cpp
  • components/core/src/clp_s/ColumnWriter.hpp
  • components/core/src/clp_s/ColumnReader.cpp
  • components/core/src/clp_s/SchemaReader.cpp
  • components/core/src/clp_s/ColumnReader.hpp
  • components/core/tests/test-clp_s-delta-encode-log-order.cpp
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-13, false)
  • GitHub Check: build-macos (macos-13, true)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-14, false)
🔇 Additional comments (15)
components/core/src/clp_s/ArchiveWriter.cpp (1)

316-318: LGTM: Clean integration of DeltaInteger support

The addition of the DeltaInteger case follows the established pattern in the switch statement and correctly instantiates a DeltaEncodedInt64ColumnWriter for delta-encoded integer columns.

components/core/src/clp_s/ColumnWriter.cpp (2)

14-24: LGTM: Correct delta encoding implementation

The delta encoding logic is implemented correctly:

  • First value stored directly and m_cur is initialized
  • Subsequent values store the difference (next - m_cur) and update m_cur
  • Returns consistent sizeof(int64_t) for memory accounting

This follows standard delta encoding principles and should provide the compression benefits mentioned in the PR objectives.
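
The encoding steps listed above can be sketched as follows. This is a simplified illustration under assumptions, not the PR's `DeltaEncodedInt64ColumnWriter`: the class name `DeltaEncoderSketch`, the method names, and the in-memory `std::vector` buffer are hypothetical, and only `m_cur` and the `sizeof(int64_t)` return value come from the review text.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoder showing the technique only; signed overflow of the
// difference for extreme inputs is not handled in this sketch.
class DeltaEncoderSketch {
public:
    // Buffers `value` in delta-encoded form and returns the bytes accounted for.
    auto add_value(int64_t value) -> size_t {
        if (m_values.empty()) {
            // The first value is stored directly and becomes the running base.
            m_values.push_back(value);
        } else {
            // Later values are stored as the difference from the previous value.
            m_values.push_back(value - m_cur);
        }
        m_cur = value;
        return sizeof(int64_t);
    }

    [[nodiscard]] auto get_encoded_values() const -> std::vector<int64_t> const& {
        return m_values;
    }

private:
    int64_t m_cur{};
    std::vector<int64_t> m_values;
};
```

For a monotonically increasing column such as a log event index, most stored deltas are small and repetitive, which is what lets a general-purpose compressor shrink the column further.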


26-29: LGTM: Store method correctly handles delta-encoded data

The store method appropriately writes the delta-encoded values as raw bytes to the compressor, following the same pattern as other column writers.

components/core/src/clp_s/ArchiveReader.cpp (3)

191-193: LGTM: Consistent DeltaInteger support in ordered columns

The addition of the DeltaInteger case correctly creates a DeltaEncodedInt64ColumnReader instance, following the same pattern as other column types.


244-246: LGTM: Consistent DeltaInteger support in unordered columns

The DeltaInteger case is correctly handled in unordered columns, maintaining consistency with the ordered column handling.


333-335: LGTM: Simplified log event index column marking

The removal of the dynamic cast check and direct use of column_reader is cleaner and works correctly since the method signature was generalized to accept BaseColumnReader*.

components/core/src/clp_s/ColumnWriter.hpp (1)

66-82: LGTM: Well-structured DeltaEncodedInt64ColumnWriter class

The class declaration follows established patterns from other column writers and correctly includes:

  • Proper inheritance from BaseColumnWriter
  • Brace initialization for m_cur{} ensuring it starts at 0
  • Appropriate member variables for delta encoding implementation
  • Consistent virtual method overrides

The structure aligns well with the existing codebase conventions.

components/core/src/clp_s/SchemaReader.cpp (4)

32-36: LGTM: Correct timestamp extraction for DeltaInteger columns

The implementation correctly casts to DeltaEncodedInt64ColumnReader* and extracts the int64_t value for timestamp processing, following the same pattern as regular Integer columns.


436-441: LGTM: Consistent DeltaInteger handling in structured array templates

The addition of DeltaInteger alongside Integer in the case statement correctly treats both types identically for JSON serialization purposes, since they both represent integer values with different storage encodings.


521-526: LGTM: Consistent DeltaInteger handling in structured object templates

The DeltaInteger case correctly follows the same logic as Integer for JSON field generation, maintaining consistency across the serialization logic.


630-635: LGTM: Consistent DeltaInteger handling in JSON template generation

The handling of DeltaInteger alongside Integer ensures consistent JSON serialization behavior across all template generation methods, correctly treating them as integer fields.

components/core/src/clp_s/ColumnReader.cpp (3)

19-25: LGTM! Initialization logic is correct.

The load method properly initializes the delta-encoded column reader with the first value as the base and sets up tracking variables.


43-47: LGTM! Delegation pattern is correctly implemented.

The extract_value method properly delegates to the bounds-checked get_value_at_idx method.


58-63: LGTM! String conversion logic is correct.

The extract_string_value_into_buffer method correctly converts the delta-decoded value to string format.

components/core/src/clp_s/ColumnReader.hpp (1)

94-125: Well-designed class with proper documentation.

The DeltaEncodedInt64ColumnReader class follows the established pattern of other column readers with:

  • Proper inheritance and virtual method overrides
  • Clear documentation for the private helper method
  • Appropriate member variables for delta decoding state

The class interface is consistent with the existing codebase architecture.

@gibber9809 gibber9809 changed the title from "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for log_event_idx column; Remove support for disabling log order." to "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for log_event_idx column." Jun 23, 2025
@gibber9809 gibber9809 changed the title from "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for log_event_idx column." to "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for the log_event_idx column." Jun 23, 2025
wraymo previously approved these changes Jun 23, 2025
Contributor

@wraymo wraymo left a comment

This PR looks good to me.

Comment on lines +32 to +36
    } else if (m_timestamp_column->get_type() == NodeType::DeltaInteger) {
        m_get_timestamp = [this]() {
            return std::get<int64_t>(static_cast<DeltaEncodedInt64ColumnReader*>(m_timestamp_column)
                                             ->extract_value(m_cur_message));
        };
Contributor

Will we apply this encoding to an integer timestamp column in a future PR?

Contributor Author

Yeah, I think that that will probably be worthwhile. We can experiment with that when we implement timestamp normalization.

@wraymo
Contributor

wraymo commented Jun 23, 2025

For the title, what about

feat(clp-s): Add delta-encoding support for integer columns; Use it for log_event_idx column.

@gibber9809 gibber9809 changed the title from "feat(clp-s): Add support for delta-encoding integer columns; Use delta-encoding for the log_event_idx column." to "feat(clp-s): Add delta-encoding support for integer columns; Use it for the log_event_idx column." Jun 23, 2025
Comment on lines 74 to 112
    std::vector<clp_s::Path> archive_paths;
    REQUIRE(clp_s::get_input_archives_for_raw_path(
            std::string{cTestDeltaEncodeOrderArchiveDirectory},
            archive_paths
    ));
    REQUIRE(1 == archive_paths.size());
    clp_s::ArchiveReader archive_reader;
    REQUIRE_NOTHROW(archive_reader.open(archive_paths.back(), clp_s::NetworkAuthOption{}));
    REQUIRE_NOTHROW(archive_reader.read_dictionaries_and_metadata());
    REQUIRE_NOTHROW(archive_reader.open_packed_streams());
    auto mpt = archive_reader.get_schema_tree();
    auto log_event_idx_node_id = mpt->get_metadata_field_id(clp_s::constants::cLogEventIdxName);
    REQUIRE(-1 != log_event_idx_node_id);
    std::vector<std::shared_ptr<clp_s::SchemaReader>> schema_readers;
    REQUIRE_NOTHROW(schema_readers = archive_reader.read_all_tables());
    REQUIRE(1 == schema_readers.size());
    auto schema_reader = schema_readers.back();
    REQUIRE(cNumEntries == schema_reader->get_num_messages());
    SimpleFilterClass simple_filter_class;
    schema_reader->initialize_filter(&simple_filter_class);
    clp_s::BaseColumnReader* log_event_idx_reader{nullptr};
    for (auto* column_reader : simple_filter_class.get_column_readers()) {
        if (log_event_idx_node_id == column_reader->get_id()) {
            log_event_idx_reader = column_reader;
            break;
        }
    }
    REQUIRE(nullptr != log_event_idx_reader);
    REQUIRE(clp_s::NodeType::DeltaInteger == log_event_idx_reader->get_type());
    REQUIRE(nullptr != dynamic_cast<clp_s::DeltaEncodedInt64ColumnReader*>(log_event_idx_reader));
    // Test forwards and backwards seeks on `DeltaEncodedInt64ColumnReader`.
    size_t i{start_index};
    for (size_t num_iterations{0ULL}; num_iterations < cNumEntries; ++num_iterations) {
        int64_t val{};
        REQUIRE_NOTHROW(val = std::get<int64_t>(log_event_idx_reader->extract_value(i)));
        REQUIRE(val == static_cast<int64_t>(i));
        i = (i + 1) % cNumEntries;
    }
    REQUIRE_NOTHROW(archive_reader.close());
Member

This seems like it could be broken up into blocks for readability.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
components/core/tests/test-clp_s-delta-encode-log-order.cpp (3)

22-22: Documentation needed for test constant.

The constant cNumEntries{3} should include a comment explaining its significance and relationship to the test data file.


25-27: Improve class documentation.

The comment could be more concise by removing the redundant "This class is" phrase.


63-114: Consider enhancing test coverage and readability.

The test provides good basic coverage but could benefit from:

  1. Breaking the test logic into smaller, more focused blocks for better readability
  2. Adding edge case testing (empty datasets, single entries, boundary conditions)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00239a4 and 2823fa2.

📒 Files selected for processing (1)
  • components/core/tests/test-clp_s-delta-encode-log-order.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: Prefer `false == <expression>` rather than `!<expression>`.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • components/core/tests/test-clp_s-delta-encode-log-order.cpp
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#698
File: components/core/src/clp/streaming_archive/Constants.hpp:9-9
Timestamp: 2025-01-28T03:02:30.542Z
Learning: In the CLP project's archive version format (semver), the patch version uses 16 bits (uint16_t), while major version uses 8 bits and minor version uses 16 bits.
components/core/tests/test-clp_s-delta-encode-log-order.cpp (19)
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-clp_s-end_to_end.cpp:109-110
Timestamp: 2024-11-29T22:50:17.206Z
Learning: In `components/core/tests/test-clp_s-end_to_end.cpp`, the success of `constructor.store()` is verified through `REQUIRE` statements and subsequent comparisons.
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:14-14
Timestamp: 2024-10-14T03:42:10.355Z
Learning: In the file `components/core/tests/test-ffi_KeyValuePairLogEvent.cpp`, including `<json/single_include/nlohmann/json.hpp>` is consistent with the project's coding standards.
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/tests/test-ir_encoding_methods.cpp:1216-1286
Timestamp: 2024-10-13T09:27:43.408Z
Learning: In the unit test case `ffi_ir_stream_serialize_schema_tree_node_id` in `test-ir_encoding_methods.cpp`, suppressing the `readability-function-cognitive-complexity` warning is acceptable due to the expansion of Catch2 macros in C++ tests, and such test cases may not have readability issues.
Learnt from: LinZhihao-723
PR: y-scope/clp#570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Learnt from: LinZhihao-723
PR: y-scope/clp#558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:85-103
Timestamp: 2024-10-14T03:42:53.145Z
Learning: The function `assert_kv_pair_log_event_creation_failure` is correctly placed within the anonymous namespace in `test-ffi_KeyValuePairLogEvent.cpp`.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#614
File: components/core/tests/test-StreamingCompression.cpp:45-54
Timestamp: 2024-12-04T15:50:37.827Z
Learning: In `components/core/tests/test-StreamingCompression.cpp`, within the `compress` function, compressing the same data repeatedly by passing the same `src` pointer without advancing is intentional to test the compressor with the same data multiple times.
Learnt from: gibber9809
PR: y-scope/clp#630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:13.322Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, validation and exception throwing are unnecessary in the `get_archive_node_id` method when processing nodes, and should not be added.
Learnt from: haiqi96
PR: y-scope/clp#619
File: components/core/src/clp/clp/decompression.cpp:313-313
Timestamp: 2024-12-05T16:32:21.507Z
Learning: In the C++ `FileDecompressor` implementations at `components/core/src/clp/clp/FileDecompressor.cpp` and `components/core/src/glt/glt/FileDecompressor.cpp`, the `temp_output_path` variable and associated logic are used to handle multiple compressed files with the same name, and should be kept. This logic is separate from the temporary output directory code removed in PR #619 and is necessary for proper file handling.
Learnt from: LinZhihao-723
PR: y-scope/clp#593
File: components/core/tests/test-NetworkReader.cpp:216-219
Timestamp: 2024-11-15T03:15:45.919Z
Learning: In the `network_reader_with_valid_http_header_kv_pairs` test case in `components/core/tests/test-NetworkReader.cpp`, additional error handling for JSON parsing failures is not necessary, as the current error message is considered sufficient.
Learnt from: LinZhihao-723
PR: y-scope/clp#557
File: components/core/src/clp/ffi/ir_stream/utils.hpp:0-0
Timestamp: 2024-10-18T02:31:18.595Z
Learning: In `components/core/src/clp/ffi/ir_stream/utils.hpp`, the function `size_dependent_encode_and_serialize_schema_tree_node_id` assumes that the caller checks that `node_id` fits within the range of `encoded_node_id_t` before casting.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/BufferedFileReader.cpp:96-106
Timestamp: 2024-10-24T14:45:26.265Z
Learning: In `components/core/src/clp/BufferedFileReader.cpp`, refactoring the nested error handling conditions may not apply due to the specific logic in the original code.
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:769-779
Timestamp: 2024-10-07T21:16:41.660Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, when handling errors in `parse_from_ir`, prefer to maintain the current mix of try-catch and if-statements because specific messages are returned back up in some cases.
Learnt from: gibber9809
PR: y-scope/clp#630
File: components/core/src/clp_s/JsonParser.cpp:702-703
Timestamp: 2024-12-10T16:03:08.691Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `get_archive_node_id` function, validation and exception throwing for UTF-8 compliance of `curr_node.get_key_name()` are unnecessary and should be omitted.
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:91-94
Timestamp: 2024-11-12T18:47:03.828Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, the `SchemaNode` class uses `std::unique_ptr<char[]> m_key_buf` and `std::string_view m_key_name` to ensure that references to `m_key_name` remain valid even after `SchemaNode` is move-constructed.
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:171-171
Timestamp: 2024-11-12T18:46:20.933Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, it's acceptable to use `std::string_view` as keys in `m_node_map` because `SchemaNode`'s `m_key_name` remains valid even after move operations or reallocations, preventing dangling references.
Learnt from: gibber9809
PR: y-scope/clp#584
File: components/core/src/clp_s/SchemaTree.hpp:40-55
Timestamp: 2024-11-12T18:56:31.067Z
Learning: In `components/core/src/clp_s/SchemaTree.hpp`, within the `SchemaNode` class, the use of `std::string_view` for `m_key_name` referencing `m_key_buf` is intentional to ensure that references to the key name remain valid even after move construction.
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:756-765
Timestamp: 2024-10-08T15:52:50.753Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `parse_from_ir()` function, reaching the end of log events in a given IR is not considered an error case. The errors `std::errc::no_message_available` and `std::errc::result_out_of_range` are expected signals to break the deserialization loop and proceed accordingly.
Learnt from: AVMatthews
PR: y-scope/clp#543
File: components/core/src/clp_s/JsonParser.cpp:735-794
Timestamp: 2024-10-07T21:35:04.362Z
Learning: In `components/core/src/clp_s/JsonParser.cpp`, within the `parse_from_ir` method, encountering errors from `kv_log_event_result.error()` aside from `std::errc::no_message_available` and `std::errc::result_out_of_range` is anticipated behavior and does not require additional error handling or logging.
🧬 Code Graph Analysis (1)
components/core/tests/test-clp_s-delta-encode-log-order.cpp (1)
components/core/src/clp_s/ColumnReader.hpp (25)
  • reader (35-35)
  • reader (80-80)
  • reader (103-103)
  • reader (136-136)
  • reader (159-159)
  • reader (191-191)
  • reader (241-241)
  • reader (278-278)
  • cur_message (46-46)
  • cur_message (54-54)
  • cur_message (62-65)
  • cur_message (63-63)
  • cur_message (84-86)
  • cur_message (88-88)
  • cur_message (107-109)
  • cur_message (111-111)
  • cur_message (140-142)
  • cur_message (144-144)
  • cur_message (163-165)
  • cur_message (167-167)
  • cur_message (197-199)
  • cur_message (201-201)
  • cur_message (204-204)
  • cur_message (211-211)
  • DeltaInteger (105-105)
⏰ Context from checks skipped due to timeout of 90000ms (11)
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-14, true)
🔇 Additional comments (1)
components/core/tests/test-clp_s-delta-encode-log-order.cpp (1)

106-112: LGTM: Seek testing logic is well-designed.

The parameterised testing with different start indices combined with the modular arithmetic ensures comprehensive coverage of forward/backward seeking functionality in the delta-encoded column reader.
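To make the forward/backward seeking concrete, here is a minimal sketch of a delta-encoded integer column with a cursor-based reader. `delta_encode` and `DeltaReader` are hypothetical names chosen for illustration; they are not the classes added in this PR, and the real reader may differ in storage layout and interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Delta-encodes `values`: element 0 is stored as-is (a delta from 0) and element i is stored as
// values[i] - values[i - 1].
std::vector<int64_t> delta_encode(std::vector<int64_t> const& values) {
    std::vector<int64_t> deltas;
    deltas.reserve(values.size());
    int64_t prev{0};
    for (auto const value : values) {
        deltas.push_back(value - prev);
        prev = value;
    }
    return deltas;
}

// Hypothetical reader that keeps a cursor (index and decoded value) so that a seek only has to
// apply the deltas between the cursor and the requested index.
class DeltaReader {
public:
    explicit DeltaReader(std::vector<int64_t> deltas) : m_deltas{std::move(deltas)} {
        if (!m_deltas.empty()) {
            m_cur_value = m_deltas[0];
        }
    }

    // Returns the decoded value at `idx`. Precondition: idx < number of encoded values.
    int64_t get(std::size_t idx) {
        // Forward seek: add each delta on the way up to `idx`.
        while (m_cur_idx < idx) {
            ++m_cur_idx;
            m_cur_value += m_deltas[m_cur_idx];
        }
        // Backward seek: undo each delta on the way down to `idx`.
        while (m_cur_idx > idx) {
            m_cur_value -= m_deltas[m_cur_idx];
            --m_cur_idx;
        }
        return m_cur_value;
    }

private:
    std::vector<int64_t> m_deltas;
    std::size_t m_cur_idx{0};
    int64_t m_cur_value{0};
};
```

A check in the spirit of the test discussed above would pick a start index, then visit `(start + i) % n` for every `i` in `[0, n)`. The wrap-around from the last index back to 0 forces a backward seek, while the remaining steps exercise forward seeks.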

@gibber9809 gibber9809 requested a review from kirkrodrigues June 26, 2025 15:26
@kirkrodrigues kirkrodrigues changed the title to feat(clp-s): Add delta-encoding support for integer columns; Use it for the log_event_idx column. Jun 26, 2025
Member

@kirkrodrigues kirkrodrigues left a comment


Approving the changes I requested; deferring to @wraymo's approval for everything else.

@kirkrodrigues kirkrodrigues merged commit f2deb21 into y-scope:main Jun 26, 2025
21 checks passed
quinntaylormitchell pushed a commit to quinntaylormitchell/clp that referenced this pull request Jul 4, 2025