Uncaught exceptions on fuzzed but valid ELF files

I've read the earlier, similar issues about fuzzing (#529) and corrupted ELF files (#482), and I followed the rule of thumb laid out by sevaa there:

> In general, our preferred rule of thumb is - can the GNU or the LLVM tools (e. g. readelf) parse the binary with no errors? If they do, and pyelftools throws an error, it's issue with pyelftools. Else, it's the issue with the binary.

Using the fuzzer Atheris, I found 19 uncaught exceptions. I saved every offending ELF file, but only when `llvm-readelf` exited with return code 0 on that file.

To be precise, this is the check I used:

```Python
# Fragment of the fuzz harness (needs `import subprocess`); `e`, `data`,
# `temp_file_path`, `log_crash` and `CRASH_DIR` are defined by the surrounding code.
readelf_cmd = [
    'llvm-readelf', '--addrsig', '--arch-specific', '--bb-addr-map', '--demangle', '--dependent-libraries',
    '--dyn-relocations', '--dyn-symbols', '--dynamic-table', '--cg-profile', '--histogram', '--elf-linker-options',
    '--section-groups', '--expand-relocs', '--file-header', '--gnu-hash-table', '--hash-symbols',
    '--elf-output-style=JSON', '--pretty-print', '--hash-table', '--headers', '--needed-libs', '--notes',
    '--program-headers', '--relocations', '--sections', '--section-data', '--section-mapping',
    '--section-relocations', '--section-symbols', '--stackmap', '--stack-sizes', '--symbols', '--unwind',
    '--version-info', temp_file_path]
result = subprocess.run(readelf_cmd, capture_output=True, text=False)
if result.returncode == 0:
    # Log the crash only if llvm-readelf succeeded
    log_crash(e, data, CRASH_DIR)
```

I passed as many options as possible so that `llvm-readelf` parses as much of the ELF file as it can, which should make the comparison as fair as possible.
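For anyone who wants to reproduce the comparison outside my harness, here is a minimal, self-contained sketch of the same idea. The function name `reference_tool_accepts` and the `require_quiet_stderr` parameter are hypothetical and not part of my actual fuzzer. The optional stderr check rests on an assumption I have not verified for every case: that a tool may print warnings while still exiting with code 0.

```python
import os
import subprocess
import tempfile


def reference_tool_accepts(data: bytes, command: list[str],
                           require_quiet_stderr: bool = False) -> bool:
    """Return True if `command <temp file>` exits with status 0 for `data`.

    `command` is the tool plus its options, e.g.
    ["llvm-readelf", "--file-header", "--sections"].
    """
    # Write the candidate bytes to a temporary file the external tool can open.
    fd, path = tempfile.mkstemp(suffix=".elf")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        result = subprocess.run(command + [path], capture_output=True)
        if result.returncode != 0:
            return False
        # Optional stricter mode: also reject runs that wrote anything to stderr.
        if require_quiet_stderr and result.stderr.strip():
            return False
        return True
    finally:
        # Always clean up the temporary file, even if the tool cannot be run.
        os.remove(path)
```

A pyelftools exception would then only be recorded when `reference_tool_accepts(data, [...])` returns `True` for the same input.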

For each crash I saved the offending ELF file together with a JSON record containing more information (such as the stack trace), for example:

```JSON
{
    "exception_type": "<class 'UnicodeDecodeError'>",
    "exception_message": "'utf-8' codec can't decode byte 0xff in position 4: invalid start byte",
    "traceback": "Traceback (most recent call last):\n  File \"/root/pyelftools/fuzzing.py\", line 111, in TestOneInput\n    readelf.main()\n  File \"/root/pyelftools/scripts/readelf.py\", line 1955, in main\n    readelf.display_arch_specific()\n  File \"/root/pyelftools/scripts/readelf.py\", line 800, in display_arch_specific\n    self._display_arch_specific_arm()\n  File \"/root/pyelftools/scripts/readelf.py\", line 1830, in _display_arch_specific_arm\n    self._display_attributes(attr_sec, describe_attr_tag_arm)\n  File \"/root/pyelftools/scripts/readelf.py\", line 1821, in _display_attributes\n    for attr in ss.iter_attributes():\n  File \"/root/pyelftools/elftools/elf/sections.py\", line 335, in iter_attributes\n    for attribute in self._make_attributes():\n  File \"/root/pyelftools/elftools/elf/sections.py\", line 360, in _make_attributes\n    yield self.attribute(self.structs, self.stream)\n          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/elf/sections.py\", line 495, in __init__\n    self.value = struct_parse(structs.Elf_ntbs('value',\n                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/common/utils.py\", line 36, in struct_parse\n    return struct.parse_stream(stream)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/construct/core.py\", line 190, in parse_stream\n    return self._parse(stream, Container())\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/construct/core.py\", line 261, in _parse\n    return self.subcon._parse(stream, context)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/construct/core.py\", line 276, in _parse\n    return self._decode(self.subcon._parse(stream, context), context)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/construct/adapters.py\", line 238, in _decode\n    return StringAdapter._decode(self, b''.join(obj[:-1]), context)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/root/pyelftools/elftools/construct/adapters.py\", line 153, in _decode\n    obj = obj.decode(self.encoding)\n          ^^^^^^^^^^^^^^^^^^^^^^^^^\nUnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 4: invalid start byte\n",
    "input_data": "7f454c46010101000000000000000000010028000100000000000000000000003c010000000000053400000000002800080007004174000000616561626900016a00000043342e30380040000472616e64ffffffffffffff3f6370750006010753080109010a010b010c010d010e010f0110011101120213011401150116011701180119011a011b011c011d011e011f012001676e75002201240126012a012c0141060b0042014403000000000000000000000000000000000000000000000000000000000000000300010000000000000000000000000003000200000000000000000000000000030003000000000000000000000000000300040000002e73796d746162002e737472746162002e7368737472746162002e74657874002e64617461002e627373002e41524d2e6174747269627574657300000000000000000000000000000000000000000000000000000000000000000000000000000000000000001b00000001000000060000000000000034000000000000000000000000000000010000000000000021000000010000000300000000000000340000000000000000000000000000000100000000000000270000000800000003000000000000003400000000000000000000000000000001000000000000002c00000003000070000000000000000034000000750000000000000000000000010000000000000001000000020000000000000000000000ac000000500000000600000005000000040000001000000009000000030000000000000000000000fc000000010000000000000000000000010000000000000011000000030000000000000000000000fd0000003c00000000000000000000000e00000000000000",
    "crashing_file_path": "elftools/construct/adapters.py"
}
```
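The `input_data` field holds the raw bytes of the fuzzed file as a hex string (it starts with `7f454c46`, the ELF magic), so a record like this can be turned back into a binary. Below is a minimal sketch; the function name `write_reproducer` and the file names in the usage note are hypothetical.

```python
import json
from pathlib import Path


def write_reproducer(record_path, output_path) -> bytes:
    """Decode the hex `input_data` field of a crash record into a binary file."""
    record = json.loads(Path(record_path).read_text())
    # `input_data` is assumed to be the hex encoding of the whole input file.
    data = bytes.fromhex(record["input_data"])
    Path(output_path).write_bytes(data)
    return data
```

For example, `write_reproducer("crash.json", "crash.elf")` writes a file that can then be passed to `scripts/readelf.py` or to `llvm-readelf`.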

Given that `llvm-readelf` parses these files without error, are you interested in investigating these crashes further? If so, what would be the most convenient way for me to provide them?
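One option would be to share a compact summary first, grouped by exception type and crashing file, so duplicates are easy to spot. A rough sketch follows, assuming one `*.json` record per crash in a single directory; the function name `summarize_crashes` and that directory layout are assumptions for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_crashes(crash_dir) -> Counter:
    """Count crash records per (exception_type, crashing_file_path) pair."""
    counts = Counter()
    # Assumes every crash was saved as its own .json record in `crash_dir`.
    for record_path in sorted(Path(crash_dir).glob("*.json")):
        record = json.loads(record_path.read_text())
        key = (record["exception_type"], record["crashing_file_path"])
        counts[key] += 1
    return counts
```

Calling `summarize_crashes(CRASH_DIR).most_common()` would list the most frequent crash sites first.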

I don't mean to simply drop the dirty work on you here. I've looked into the crashes, but I lack a thorough enough understanding of the ELF format to make a well-informed decision. I hope that comparing with `llvm-readelf` makes sense; if you'd like me to run any other comparison before recording a file as a crash, let me know! Also, if you're interested in this fuzzing setup, I'd be happy to help you set it up.

Uncaught exceptions on fuzzed but valid ELF files #612