
fix: shift KV/Form graph cell page numbers during DoclingDocument.concatenate #521

Merged
cau-git merged 1 commit into main from
cau/fix-kv-concatenate-on-first-page
Feb 23, 2026

Conversation

Member

@cau-git cau-git commented Feb 20, 2026

DoclingDocument.concatenate was updating DocItem.prov.page_no with page_delta, but it did not update GraphCell.prov.page_no inside KeyValueItem/FormItem graphs.
As a result, after concatenation, key-value/form graph cells from later documents could still point to page 1, causing incorrect visualization/page placement.

What changed

  • In _DocIndex.index (docling_core/types/doc/document.py), when copying a KeyValueItem or FormItem, we now also shift cell.prov.page_no for every graph cell by page_delta.
  • Added regression test test_concatenate_shifts_graph_cell_pages_for_keyvalue_and_form in test/test_docling_doc.py to verify concatenating two 1-page docs yields:
    • first KV/Form graph cells on page 1
    • second KV/Form graph cells on page 2
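
The change and the regression check described above can be sketched with simplified stand-in classes. This is a minimal illustration, not the code from docling_core/types/doc/document.py: `Prov`, `GraphCell`, `GraphItem`, `copy_with_page_shift`, and `concatenate` are hypothetical names, the real models are richer, and treating a cell's provenance as optional is an assumption made here.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Prov:
    """Hypothetical stand-in for a provenance record."""
    page_no: int


@dataclass
class GraphCell:
    """Hypothetical stand-in for a key-value/form graph cell."""
    text: str
    prov: Optional[Prov] = None  # assumption: provenance may be missing


@dataclass
class GraphItem:
    """Hypothetical stand-in for a KeyValueItem or FormItem."""
    prov: List[Prov] = field(default_factory=list)
    cells: List[GraphCell] = field(default_factory=list)


def copy_with_page_shift(item: GraphItem, page_delta: int) -> GraphItem:
    """Copy an item and shift both item-level and cell-level page numbers."""
    new_item = copy.deepcopy(item)
    for prov in new_item.prov:
        prov.page_no += page_delta
    # The step the PR describes adding: shift every graph cell as well.
    for cell in new_item.cells:
        if cell.prov is not None:
            cell.prov.page_no += page_delta
    return new_item


def concatenate(docs: List[Tuple[int, List[GraphItem]]]) -> List[GraphItem]:
    """Merge (page_count, items) pairs, offsetting pages of later documents."""
    merged: List[GraphItem] = []
    page_delta = 0
    for page_count, items in docs:
        merged.extend(copy_with_page_shift(item, page_delta) for item in items)
        page_delta += page_count
    return merged
```

Without the loop over `new_item.cells`, the second document's cells in this sketch would keep `page_no == 1` even though the item itself moves to page 2, which mirrors the mismatch the PR describes.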

Why

This ensures graph-cell provenance remains consistent with concatenated page numbering, fixing KV/Form items appearing on the wrong page after merge.

@github-actions
Contributor

DCO Check Passed

Thanks @cau-git, all your commits are properly signed off. 🎉


mergify bot commented Feb 20, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
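
To see which titles the pattern above accepts, it can be tried out locally. The following is a rough sketch using Python's `re` module. The helper name `is_conventional` is hypothetical, and Mergify's own regex engine may differ in details from Python's.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


if __name__ == "__main__":
    for title in (
        "fix: shift KV/Form graph cell page numbers during DoclingDocument.concatenate",
        "feat(doc)!: example with scope and breaking-change marker",
        "Fix: capitalised type is rejected",
    ):
        print(is_conventional(title), title)
```

The pattern is case-sensitive and requires the colon directly after the type, optional scope, and optional `!`.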

🟢 Require two reviewer for test updates

Wonderful, this rule succeeded.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

@cau-git cau-git changed the title from "fix: When concatenating docs, adjust page numbers the GraphCell elems" to "fix: shift KV/Form graph cell page numbers during DoclingDocument.concatenate" Feb 20, 2026
@cau-git cau-git marked this pull request as ready for review February 20, 2026 16:49
@cau-git cau-git requested a review from vagenas February 20, 2026 16:49

dosubot bot commented Feb 20, 2026

Related Documentation

Checked 17 published document(s) in 1 knowledge base(s). No updates required.



codecov bot commented Feb 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Member

@dolfim-ibm dolfim-ibm left a comment


lgtm

@cau-git cau-git merged commit 6a04db7 into main Feb 23, 2026
12 checks passed
@cau-git cau-git deleted the cau/fix-kv-concatenate-on-first-page branch February 23, 2026 09:35
Matteo-Omenetti pushed a commit that referenced this pull request Mar 11, 2026
…catenate (#521)

fix: When concatenating docs, adjust page numbers the GraphCell elements appear on

Signed-off-by: Christoph Auer <[email protected]>

4 participants