Skip to content

Linear scanning in CompileUnit.iter_DIEs()#625

Merged
eliben merged 2 commits intoeliben:mainfrom
sevaa:fix_iterflat
Sep 1, 2025
Merged

Linear scanning in CompileUnit.iter_DIEs()#625
eliben merged 2 commits intoeliben:mainfrom
sevaa:fix_iterflat

Conversation

@sevaa
Copy link
Contributor

@sevaa sevaa commented Aug 30, 2025

CompileUnit.iter_DIEs() is unnecessarily convoluted in terms of parse logic. Its purpose it to return the DIEs as they appear in the info section, but instead it performs recursive tree traversal via CompileUnit.iter_DIE_children(), which contains the more complicated logic of going over immediate children of a DIE. On a siblingless DIE tree (which is rare in practice but allowed by the format), this will contain extra work of parsing the whole subtree only to throw away the children of children.

Physically, DIEs are stored linearly. Logically, they are a tree. iter_DIEs() is expected to return a flat collection, but it does so by building a tree (out of a flat collection) and then flattening it. This PR changes iter_DIEs() to parse sequentially instead.

As a matter of performance, it shaves 5-7% from the linear scan time using iter_DIEs(). Tested on one rather artificial example; the speedup on real life DWARF might vary. Does not fix #623, but somewhat helps them.

As a downside, it duplicates the caching and the tree building logic already present in CompileUnit.iter_DIE_children().


On a side note, one reason I got started with this, I have an extensive collection of crash reports with a very similar condition that I've only recently tracked down to a root cause - the DW_AT_sibling on levels below the first had a bogus value. All affected binaries are PowerPC ELF files, so my working theory is that this is - or was - a bug in a PPC compiler. No way to reproduce, but I've been chasing this for a while. #580 was about that too.

Point is, the current version of iter_DIEs() relies on DW_AT_sibling because it's built on top of tree traversal. The one in this PR does not, so it's not susceptible to the issue (the immediate children traversal function still is). Far it be from me to introduce special cases for malformed DWARF, but I get my wins where I can :)


On a side side note, you can shave good 30% off scan time if you skip setting the parent/terminator, caching, and the possibility of supplemental DWARF - true firehose. That would look like this:

    def iter_DIEs_really_fast(self):
        stm = self.dwarfinfo.debug_info_sec.stream
        pos = self.cu_die_offset
        end_pos = self.cu_offset + self.size
        while pos < end_pos:
            die = DIE(self, stm, pos)
            yield die
            pos += die.size

I'm of half the mind to throw this in just for the benefit of @chall1123 with #623 :)

Copy link
Owner

@eliben eliben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd also be good to add the extra context from the issue to the PR description / commit info, and link to the issue

parent = die

if die.tag == 'DW_TAG_imported_unit' and self.dwarfinfo.supplementary_dwarfinfo:
# Falls back to subtree traversal in the supplemental DWARF. Any way to streamline that too?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what's happening here. Aren't we supposed to yield from _iter_die_subtree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean instead of
supp_die.cu._iter_DIE_subtree(supp_die)
it should have been
yield supp_die.cu._iter_DIE_subtree(supp_die)
? That was my first stab too, but apparently when you do that (at least in Python 3.13), a generator object is yielded to the caller, instead of its contents.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yield from?

Otherwise, I don't see what this call is even doing. _iter_DIE_subtree yields stuff, it's not usually called for side effects.

If you remove this call, do tests still pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yield from is right. And the autotest didn't catch that. It does now.

@sevaa
Copy link
Contributor Author

sevaa commented Aug 30, 2025

It'd also be good to add the extra context from the issue to the PR description / commit info, and link to the issue

What other context and what issue? The ticket about the crashes caused by bogus DW_AT_sibling values is in my own private bug tracker, not on Github. I was going to look around if there was a known compiler ticket about this, but I haven't yet.

@eliben
Copy link
Owner

eliben commented Aug 31, 2025

It'd also be good to add the extra context from the issue to the PR description / commit info, and link to the issue

What other context and what issue? The ticket about the crashes caused by bogus DW_AT_sibling values is in my own private bug tracker, not on Github. I was going to look around if there was a known compiler ticket about this, but I haven't yet.

I mean the context from this comment: #623 (comment)

@eliben eliben merged commit 0d51efa into eliben:main Sep 1, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

function iter_DIEs is verify slow, 1200000 dies takes 45 seconds

2 participants