fix: accept relative URIs in PdfHyperlink without validation failure #520

Merged
PeterStaar-IBM merged 1 commit into docling-project:main from Ultizan:fix/pdf-hyperlink-relative-uri
Feb 23, 2026

Conversation

@Ultizan (Contributor) commented Feb 18, 2026

PDF hyperlinks may contain relative paths, internal bookmarks, or fragment-only references that are not valid absolute URLs. The strict AnyUrl validation on PdfHyperlink.uri caused the entire page preprocess stage to fail when such URIs were encountered, resulting in empty documents and lost content.

Change uri type to Union[AnyUrl, str] with a field_validator that attempts AnyUrl parsing first (preserving structured metadata like scheme/host/path) and falls back to str for non-absolute URIs.
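
The fallback described above can be sketched roughly as follows. This is an illustrative reconstruction, not the merged diff: the class name `HyperlinkSketch`, the module-level `_ANY_URL` adapter, and the choice of an "after" validator are assumptions, and only the `uri` field of the real `PdfHyperlink` model is mirrored. It assumes pydantic v2.

```python
from typing import Union

from pydantic import AnyUrl, BaseModel, TypeAdapter, ValidationError, field_validator

# Hypothetical helper: a reusable adapter for strict absolute-URL parsing.
_ANY_URL = TypeAdapter(AnyUrl)


class HyperlinkSketch(BaseModel):
    """Hypothetical stand-in for PdfHyperlink; only the uri field is modeled."""

    uri: Union[AnyUrl, str]

    @field_validator("uri", mode="after")
    @classmethod
    def _prefer_structured_url(cls, value):
        # Values that are already parsed URL objects pass through untouched.
        if not isinstance(value, str):
            return value
        try:
            # Absolute URIs become structured URL objects (scheme, host, path).
            return _ANY_URL.validate_python(value)
        except ValidationError:
            # Relative paths, bookmarks, and fragment-only references
            # are kept as plain strings instead of failing validation.
            return value


absolute = HyperlinkSketch(uri="https://example.com/docs/page.html")
fragment = HyperlinkSketch(uri="#section-2")
relative = HyperlinkSketch(uri="../appendix/tables.pdf")
```

In this sketch `absolute.uri` exposes `scheme` and `host`, while `fragment.uri` and `relative.uri` remain ordinary strings, so a single non-absolute link no longer raises during model construction.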

Signed-off-by: Ultizan <ultizan@gmail.com>
@github-actions (Contributor)

DCO Check Passed

Thanks @Ultizan, all your commits are properly signed off. 🎉

mergify bot commented Feb 18, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

🟢 Require two reviewers for test updates

Wonderful, this rule succeeded.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

dosubot bot commented Feb 18, 2026

Related Documentation

Checked 17 published document(s) in 1 knowledge base(s). No updates required.

codecov bot commented Feb 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@dolfim-ibm (Member) left a comment

lgtm

@PeterStaar-IBM (Member) left a comment

lgtm!

@PeterStaar-IBM merged commit 6032c7c into docling-project:main on Feb 23, 2026
11 checks passed