PDF not having hierarchical structure.

### Bug
When processing a .docx file with the simple pipeline, the doctags output correctly preserves the hierarchical structure of headings (section_header_level_1 through section_header_level_4).

However, when the same document is converted to .pdf and processed using pdfpipeline(), the doctags output only contains section_header_level_1. All nested heading levels (2–4) are flattened or lost.

Does this mean hybrid chunking for Docling documents is not working here? (Is only section_header_level_1 being used for chunking?)
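
For anyone trying to reproduce this, here is a minimal sketch of how the two doctags outputs could be compared. It only uses the standard library and assumes the heading tags appear as `<section_header_level_N>` in the exported string. The helper names (`heading_level_counts`, `missing_levels`) and the two sample strings are hypothetical and written by hand for illustration; they are not real Docling output.

```python
import re
from collections import Counter

# Matches opening tags such as <section_header_level_2>.
# Closing tags start with "</" and are therefore not counted.
_HEADING_TAG = re.compile(r"<section_header_level_(\d+)>")


def heading_level_counts(doctags: str) -> Counter:
    """Count how many headings of each level appear in a doctags string."""
    return Counter(int(level) for level in _HEADING_TAG.findall(doctags))


def missing_levels(reference: str, candidate: str) -> list:
    """Return heading levels present in `reference` but absent from `candidate`."""
    ref = heading_level_counts(reference)
    cand = heading_level_counts(candidate)
    return sorted(level for level in ref if cand[level] == 0)


# Hand-written stand-ins for the two exports (hypothetical, not real output).
docx_like = (
    "<section_header_level_1>Intro</section_header_level_1>"
    "<section_header_level_2>Scope</section_header_level_2>"
    "<section_header_level_3>Details</section_header_level_3>"
)
pdf_like = (
    "<section_header_level_1>Intro</section_header_level_1>"
    "<section_header_level_1>Scope</section_header_level_1>"
    "<section_header_level_1>Details</section_header_level_1>"
)

print(heading_level_counts(pdf_like))       # every heading is counted as level 1
print(missing_levels(docx_like, pdf_like))  # levels lost in the PDF-style output
```

In a real run, the two strings would come from converting the .docx and the .pdf with Docling and exporting each resulting document to doctags, in place of the hand-written samples.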


...

### Docling version
Docling version: 2.32.0
...

### Python version
Python 3.10.12

...



PDF not having hierarchical structure. #2121
