@yanghua
Collaborator

@yanghua yanghua commented Nov 18, 2025

As a step toward distinguishing row IDs from row addresses, we should rename RowIdSelection to RowAddrSelection so that its purpose is clearer.

@yanghua yanghua marked this pull request as ready for review November 18, 2025 03:43
@jackye1995
Contributor

Trying to reopen the PR to fix CI.

@jackye1995 jackye1995 closed this Nov 18, 2025
@jackye1995 jackye1995 reopened this Nov 18, 2025
@yanghua yanghua force-pushed the rename-RowIdSelection branch from 130a0c7 to f10a1ce Compare November 19, 2025 07:51
@yanghua yanghua force-pushed the rename-RowIdSelection branch from f10a1ce to cb9e613 Compare November 20, 2025 12:28
/// fragment k is selected. If there is a pair (k, Partial(v)) then the
/// fragment k has the selected rows in v.
- inner: BTreeMap<u32, RowIdSelection>,
+ inner: BTreeMap<u32, RowAddrSelection>,
Contributor

Why is this appropriate? The docs above say that it can be row ids. Do stable row ids not use this structure at all?

/// These row ids may either be stable-style (where they can be an incrementing
/// u64 sequence) or address style, where they are a fragment id and a row offset.
/// When address style, this supports setting entire fragments as selected,
/// without needing to enumerate all the ids in the fragment.
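
The address-style case described in the quoted doc comment can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the project's actual code. It assumes a row address packs the fragment ID into the upper 32 bits of a `u64` and the row offset into the lower 32 bits, and it uses the standard library's `BTreeSet<u32>` as a stand-in for `RoaringBitmap`. The names `AddrSelection`, `AddrMap`, and `split_addr` are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Selection within one fragment (hypothetical stand-in for the type in this PR).
#[derive(Debug, Clone, PartialEq)]
enum AddrSelection {
    /// Every row in the fragment is selected.
    Full,
    /// Only these row offsets are selected (BTreeSet stands in for RoaringBitmap).
    Partial(BTreeSet<u32>),
}

/// Assumed layout: upper 32 bits = fragment id, lower 32 bits = row offset.
fn split_addr(addr: u64) -> (u32, u32) {
    ((addr >> 32) as u32, addr as u32)
}

/// Map from fragment id to the selection inside that fragment.
#[derive(Debug, Default)]
struct AddrMap {
    inner: BTreeMap<u32, AddrSelection>,
}

impl AddrMap {
    /// Select a whole fragment without enumerating its rows.
    fn insert_fragment(&mut self, fragment_id: u32) {
        self.inner.insert(fragment_id, AddrSelection::Full);
    }

    /// Select one row address; a no-op if its fragment is already Full.
    fn insert(&mut self, addr: u64) {
        let (frag, offset) = split_addr(addr);
        let entry = self
            .inner
            .entry(frag)
            .or_insert_with(|| AddrSelection::Partial(BTreeSet::new()));
        if let AddrSelection::Partial(offsets) = entry {
            offsets.insert(offset);
        }
    }

    /// Check whether a row address is selected.
    fn contains(&self, addr: u64) -> bool {
        let (frag, offset) = split_addr(addr);
        match self.inner.get(&frag) {
            None => false,
            Some(AddrSelection::Full) => true,
            Some(AddrSelection::Partial(offsets)) => offsets.contains(&offset),
        }
    }
}
```

Keying the map by a `u32` fragment ID only works if the value being stored can be split into a fragment and an offset. A stable row ID that is just an incrementing `u64` sequence has no such structure, which is the concern raised in this thread.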

/// fragment k is selected. If there is a pair (k, Partial(v)) then the
/// fragment k has the selected rows in v.
- inner: BTreeMap<u32, RowIdSelection>,
+ inner: BTreeMap<u32, RowAddrSelection>,
Contributor

I think we should use the term row ID for this one? This structure is used more to accommodate stable row IDs, and for row addresses we could technically use RoaringTreemap directly?

Collaborator Author

@yanghua yanghua Nov 21, 2025

It seems that neither RowIdSelection nor RowAddrSelection can cover both semantics, row ID and row address. I have a full plan in mind (sorry, I did not write it down before):

It would be better to split the struct's responsibilities so that (stable) row IDs and row addresses are handled by distinct types.

  • for row address:

    • RowIdTreeMap -> RowAddrMap (BTreeMap<u32, RowAddrSelection>)
    • RowIdSelection -> RowAddrSelection (Full | Partial(RoaringBitmap))
    • SearchResult (scalar.rs) -> RowAddrSearchResult (Exact(RowAddrMap) | AtMost(RowAddrMap))
    • RowIdMask -> RowAddrMask
  • for (stable) row id:

    • add rowid::RowIdSet (RoaringTreemap, 64-bit set)
    • add rowid::RowIdMask
    • add RowIdResolver (row_id -> row_addr)
  • for both:

    • add RowSetOps for RowIdSet and RowAddrMap
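
One way the split proposed in the list above could look is sketched below. It is only an illustration under stated assumptions. `BTreeSet<u64>` stands in for `RoaringTreemap`. The flat `RowAddrSet` is a hypothetical simplification of the proposed `RowAddrMap`. The method sets of `RowSetOps` and `RowIdResolver`, and the `MapResolver` and `resolve_all` helpers, are invented for the example and are not taken from the PR.

```rust
use std::collections::{BTreeSet, HashMap};

/// Stable row IDs: an opaque 64-bit set (BTreeSet stands in for RoaringTreemap).
#[derive(Debug, Default, Clone, PartialEq)]
struct RowIdSet(BTreeSet<u64>);

/// Row addresses, kept as a distinct type so the two cannot be mixed up.
/// (Hypothetical flat simplification of the proposed RowAddrMap.)
#[derive(Debug, Default, Clone, PartialEq)]
struct RowAddrSet(BTreeSet<u64>);

/// Hypothetical set operations shared by both kinds of set.
trait RowSetOps: Sized {
    fn insert(&mut self, value: u64) -> bool;
    fn contains(&self, value: u64) -> bool;
    fn union(&self, other: &Self) -> Self;
}

impl RowSetOps for RowIdSet {
    fn insert(&mut self, value: u64) -> bool {
        self.0.insert(value)
    }
    fn contains(&self, value: u64) -> bool {
        self.0.contains(&value)
    }
    fn union(&self, other: &Self) -> Self {
        RowIdSet(self.0.union(&other.0).copied().collect())
    }
}

impl RowSetOps for RowAddrSet {
    fn insert(&mut self, value: u64) -> bool {
        self.0.insert(value)
    }
    fn contains(&self, value: u64) -> bool {
        self.0.contains(&value)
    }
    fn union(&self, other: &Self) -> Self {
        RowAddrSet(self.0.union(&other.0).copied().collect())
    }
}

/// Hypothetical resolver: maps a stable row id to its current row address.
trait RowIdResolver {
    fn resolve(&self, row_id: u64) -> Option<u64>;
}

/// Toy resolver backed by an in-memory table.
struct MapResolver(HashMap<u64, u64>);

impl RowIdResolver for MapResolver {
    fn resolve(&self, row_id: u64) -> Option<u64> {
        self.0.get(&row_id).copied()
    }
}

/// Translate stable IDs into addresses, dropping IDs that do not resolve.
fn resolve_all(ids: &RowIdSet, resolver: &impl RowIdResolver) -> RowAddrSet {
    RowAddrSet(ids.0.iter().filter_map(|id| resolver.resolve(*id)).collect())
}
```

With separate types, passing a `RowIdSet` where a `RowAddrSet` is expected becomes a compile error, and the only route from one to the other is through an explicit resolver.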

WDYT? cc @wjones127

Contributor

That's okay. Could you put this design in an issue?

Remember that you're collaborating with other developers and they need to be able to follow what you are doing.

Collaborator Author

Thanks for the suggestion. Have filed a ticket: #5326

3 participants