Handle PPTX shapes where position is None #1161
Thanks for looking into this!
It looks like there is now a stack overflow/infinite recursion error on the test file. Moreover, I think that skipping shapes is problematic: is it possible those shapes are tables, pictures, etc.? Maybe have the sort key be the tuple `(float('inf') if shape.top is None else shape.top, float('inf') if shape.left is None else shape.left)`, so that shapes with missing coordinates always get listed last (but, because it's a stable sort, they appear in the order they were indexed in the file).
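To make the suggested sort key concrete, here is a minimal sketch. `FakeShape` is a hypothetical stand-in for a pptx shape (only `top` and `left` are modelled) and `position_key` is an illustrative name, not code from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeShape:
    """Hypothetical stand-in for a pptx shape; only position is modelled."""
    name: str
    top: Optional[int]
    left: Optional[int]


def position_key(shape):
    # A missing coordinate is treated as +infinity, so it sorts after any real value.
    return (
        float("inf") if shape.top is None else shape.top,
        float("inf") if shape.left is None else shape.left,
    )


shapes = [
    FakeShape("title placeholder", None, None),
    FakeShape("body", 200, 50),
    FakeShape("picture", 100, 300),
    FakeShape("content placeholder", None, None),
    FakeShape("caption", 100, 20),
]

# sorted() is stable, so shapes with equal keys keep their original relative order.
ordered = sorted(shapes, key=position_key)
names = [s.name for s in ordered]
print(names)
```

Positioned shapes come out top to bottom, then left to right, and the two placeholders without coordinates follow in the order they had in the input list.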
From my research, the only shapes with

To your point about listing the shapes last: it seems almost all the shapes with missing coordinates are slide titles using placeholder shapes inserted from a slide template. I've seen some instances where a placeholder for slide content is also missing coordinates. Based on that, I would actually be inclined to list them first.
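Listing the coordinate-less shapes first only requires flipping the sentinel to negative infinity. The sketch below assumes a hypothetical `StubShape` stand-in and an illustrative `front_key` helper; it is not necessarily how the PR implements it.

```python
from typing import NamedTuple, Optional


class StubShape(NamedTuple):
    """Hypothetical stand-in for a pptx shape; only position is modelled."""
    name: str
    top: Optional[int]
    left: Optional[int]


def front_key(shape):
    # A missing coordinate is treated as -infinity, so it sorts before any real value.
    return (
        float("-inf") if shape.top is None else shape.top,
        float("-inf") if shape.left is None else shape.left,
    )


slide = [
    StubShape("body", 200, 50),
    StubShape("title placeholder", None, None),
    StubShape("picture", 100, 300),
    StubShape("content placeholder", None, None),
]

# Stable sort: the two placeholders keep their original relative order at the front.
front_first = [s.name for s in sorted(slide, key=front_key)]
print(front_first)
```

With this key, a title placeholder that has no coordinates is emitted before the positioned content, which matches the observation that such shapes are usually slide titles.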
I fixed the recursion error; it was a mistake in copying the sorting code between grouped shapes and normal shapes.
Hi, is there an ETA for merging this PR? Some of our PowerPoints currently cannot be converted with markitdown because of this bug |
* Handle shapes where position is None
* Fixed recursion error, and place no-coord shapes at front

My previous PR for sorting PowerPoint shapes relied on the `top` and `left` attributes of the pptx shapes; however, I didn't account for shapes having `None` as their `top`, `left`, `height` or `width` attributes. We'll need to think about what to do with these shapes, as sometimes they can still contain text, although they don't exist on the slide.

Currently, I've filtered out shapes with `None` in their positional attributes; however, I'm open to changing how they're dealt with.

I think the options are: