Refactor PEFile and PEHeader to use ReadOnlySpan exclusively with zero-copy buffer sharing #2317
Co-authored-by: brianrob <[email protected]>
brianrob
left a comment
@copilot please address this comment.
public PEHeader(ReadOnlySpan<byte> peFileData)
{
    // We need to copy the span data to a byte array since we can't store spans as fields
    m_buffer = peFileData.ToArray();
I would like to avoid copying the buffer here. Would it make more sense to fold PEHeader into PEFile so that the copy doesn't need to occur?
Addressed by eliminating the buffer copy. Added an internal PEHeader(byte[] buffer, int offset, int length) constructor that references PEBuffer's existing byte array without copying. PEFile now uses PEBuffer.GetBufferInfo() to get the buffer parameters and constructs PEHeader with zero allocation overhead. The public PEHeader(ReadOnlySpan<byte>) constructor remains available for external callers that provide their own buffers. (commit fbbf797)
Co-authored-by: brianrob <[email protected]>
brianrob
left a comment
@copilot please address these comments.
// Internal method to get buffer parameters for zero-copy PEHeader construction
internal void GetBufferInfo(int filePos, int size, out byte[] buffer, out int offset, out int length)
{
Rename this to EnsureRead.
Renamed to EnsureRead in commit ec49a3a.
    }
    return &m_buffPtr[filePos - m_buffPos];
}
public ReadOnlySpan<byte> FetchSpan(int filePos, int size)
Rename PEBuffer to PEBufferedReader.
Renamed to PEBufferedReader in commit ec49a3a.
// Internal method to get buffer parameters for zero-copy PEHeader construction
internal void GetBufferInfo(int filePos, int size, out byte[] buffer, out int offset, out int length)
{
Rather than returning 3 out parameters, please return a struct called PEBufferedSlice that contains properties for Buffer, Offset and Length. Please also add a method to the struct called AsSpan() that returns a ReadOnlySpan<byte> representing the slice.
Created PEBufferedSlice struct with Buffer, Offset, Length properties and AsSpan() method in commit ec49a3a.
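For readers following the thread, here is a minimal sketch of what a slice type with Buffer, Offset, and Length properties and an AsSpan() method might look like. It is an illustration only: the constructor, its validation, and the readonly modifier are assumptions, and the actual struct in commit ec49a3a may differ.

```csharp
using System;

// Hypothetical sketch of the slice type requested in review; the real
// PEBufferedSlice in commit ec49a3a may differ in validation and shape.
public readonly struct PEBufferedSlice
{
    public PEBufferedSlice(byte[] buffer, int offset, int length)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        // Compare against buffer.Length - length to avoid overflow in offset + length.
        if (offset < 0 || length < 0 || offset > buffer.Length - length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        Buffer = buffer;
        Offset = offset;
        Length = length;
    }

    public byte[] Buffer { get; }
    public int Offset { get; }
    public int Length { get; }

    // Returns a view over the shared array; no bytes are copied.
    public ReadOnlySpan<byte> AsSpan() => new ReadOnlySpan<byte>(Buffer, Offset, Length);
}
```

Because the struct only holds a reference to the reader's array, a PEHeader that stores it shares the reader's buffer instead of calling ToArray(), which is the copy the earlier review comment asked to avoid.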
/// <summary>
/// Returns a PEHeader for ReadOnlySpan of bytes in memory. Validates buffer bounds.
/// </summary>
public PEHeader(ReadOnlySpan<byte> peFileData)
This constructor is unused. Please remove it.
Removed the unused constructor in commit ec49a3a.
/// <summary>
/// Returns a PEHeader that references an existing buffer without copying. Validates buffer bounds.
/// </summary>
internal PEHeader(byte[] buffer, int offset, int length)
This should take the new struct as a parameter. Then, just save the struct instead of m_buffer, m_bufferOffset, and m_bufferLength.
Updated to take PEBufferedSlice and store it directly in commit ec49a3a.
}

IMAGE_DOS_HEADER dosHdr;
fixed (byte* bufferPtr = m_buffer)
You should be able to call AsSpan() here and then operate with the ReadOnlySpan instead of using unsafe here. That will ensure that we don't read off the end of the buffer.
Updated to use AsSpan() and MemoryMarshal.Read instead of unsafe pointer access in commit ec49a3a.
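As an illustration of the span-based read discussed in this thread, the sketch below parses a DOS header with MemoryMarshal.Read instead of a fixed pointer. DosHeaderSketch and SpanHeaderReader are hypothetical names, and the struct is a trimmed stand-in for IMAGE_DOS_HEADER that declares only two fields; the PR's own code is not shown here and may be structured differently.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical, trimmed stand-in for IMAGE_DOS_HEADER: only the two fields
// used below are declared, at their offsets within the 64-byte DOS header.
[StructLayout(LayoutKind.Explicit, Size = 64)]
public struct DosHeaderSketch
{
    [FieldOffset(0)] public ushort e_magic;   // "MZ" signature, 0x5A4D
    [FieldOffset(60)] public int e_lfanew;    // file offset of the PE signature
}

public static class SpanHeaderReader
{
    // Reads the DOS header from a span. MemoryMarshal.Read throws
    // ArgumentOutOfRangeException when the span is shorter than the struct,
    // so a truncated buffer cannot be read past its end.
    // Fields are read in host byte order; this assumes a little-endian host.
    public static DosHeaderSketch ReadDosHeader(ReadOnlySpan<byte> data)
    {
        DosHeaderSketch header = MemoryMarshal.Read<DosHeaderSketch>(data);
        if (header.e_magic != 0x5A4D)
            throw new InvalidOperationException("Not a PE file: missing MZ signature.");
        return header;
    }
}
```

With a pointer, a 10-byte buffer would be read as if it held 64 bytes; with the span, the same call fails with an exception before any field is examined.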
…PEBufferedReader, use AsSpan Co-authored-by: brianrob <[email protected]>
All feedback addressed in commit ec49a3a. Created PEBufferedSlice struct, renamed to PEBufferedReader, renamed method to EnsureRead, removed unused constructor, and updated to use AsSpan() with MemoryMarshal.Read.
    throw new InvalidOperationException("GetDirectory only available with span-based PEHeader.");
}

private IMAGE_DOS_HEADER* dosHeader;
@copilot, we shouldn't have two different paths with different fields that only work based on the constructor that was used. Instead, please move everything to be span based and remove the constructor that takes a raw pointer.
Removed all dual-path logic in commit 7f1edfe. Eliminated the pointer-based constructor and all pointer fields (dosHeader, ntHeader, sections). Everything now uses the span-based approach exclusively with PEBufferedSlice and MemoryMarshal for safe, bounds-checked access.
…out PEHeader Co-authored-by: brianrob <[email protected]>
@copilot, please add some tests to TraceEvent.Tests to validate that the changes in the PEFile class work.
Co-authored-by: brianrob <[email protected]>
Co-authored-by: brianrob <[email protected]>
Created test applications in
Both build and run successfully. The tester demonstrates:
Run with:
cd src/TestApps/LargePEHeaderTest
dotnet run --project LargePEHeaderGenerator.csproj
cd Tester && dotnet run --project Tester.csproj ../LargeHeaderTest.exe
(commit fb99913)
@copilot I've just pushed a commit that fixes TestBothImplementations.csproj to run both implementations (new and old). With the generated large exe, both of them succeed. Please fix the large exe so that it fails to load because it fails the check in OldPEFile.cs line 365. You can do this by making sure that imageHeaderOffset > 512.
…tation failure Co-authored-by: brianrob <[email protected]>
Fixed in commit a04994b. The generator now creates a PE file with:
The test output shows:
Run with:
marklio
left a comment
This looks pretty reasonable. While this looks correct, I'm curious why you didn't just move to the built-in PE reader (or Marklio.Metadata ;) )
    return ref MemoryMarshal.Cast<byte, IMAGE_SECTION_HEADER>(span)[0];
}

private ref readonly IMAGE_OPTIONAL_HEADER32 OptionalHeader32Span
Why does this have a Span suffix?
This is an artifact of an earlier iteration of the change when copilot was trying to build this class to have two constructors - one that took a pointer and one that took a span. I've removed the suffix.
I did not know about either of these. :) There is also a benefit to having fewer dependencies because PerfView then must carry them all in order to support its "autoextract" capability.
Full disclosure, I hate our built-in PE reader :), which is why I continue to use and support Marklio.Metadata. :) I did a big span conversion a few years ago, so a lot of this is very familiar. It supports memory-mapped files, streams, and lots of different "I have a bunch of bytes that are pieces of binaries" scenarios.
Overview
This PR completely refactors PEFile and PEHeader to use ReadOnlySpan<byte> exclusively instead of raw unsafe pointers, providing automatic bounds checking to prevent reading outside allocated buffers. The implementation uses zero-copy buffer sharing for optimal performance and eliminates all dual-path logic for a cleaner, more maintainable codebase.
Motivation
The existing implementation uses unsafe pointers (byte*, void*) to read PE file headers, which has several risks:
Using ReadOnlySpan<byte> provides:
Key Design Pattern - Progressive Reads
- m_sectionsOffset for use by PEHeaderSize property
- Header.PEHeaderSize > 1024 and re-reads with correct size if needed
Safety Guarantees
Performance
Compatibility
Testing
Comprehensive Test Suite
Added 10 comprehensive tests in src/TraceEvent/TraceEvent.Tests/Utilities/PEFileTests.cs:
All tests pass (9/10 on Linux, all 10 on Windows)
Test Applications - Demonstrating the Improvement
Added standalone test applications in src/TestApps/LargePEHeaderTest/ that clearly demonstrate the limitations of the old implementation:
Generated PE File Characteristics
Test Results
Running TestBothImplementations.csproj:
Old Implementation (OldPEFile.cs):
Fails the check: if (!(sizeof(IMAGE_DOS_HEADER) <= imageHeaderOffset && imageHeaderOffset <= 512))
New Implementation (PEFile with ReadOnlySpan):
Running the Tests
The test applications clearly demonstrate that:
- imageHeaderOffset > 512 bytes
Implementation Details
PEBufferedReader (renamed from PEBuffer)
- FetchSpan(int filePos, int size) returning ReadOnlySpan<byte>
- EnsureRead(int filePos, int size) returning PEBufferedSlice struct for zero-copy construction
- Fetch() method returning byte* for backward compatibility
PEBufferedSlice (new struct)
- AsSpan() method returning ReadOnlySpan<byte>
PEHeader (fully span-based)
- PEHeader(void*)
- PEHeader(PEBufferedSlice slice) for zero-allocation performance
- MemoryMarshal.Read<T>() and MemoryMarshal.Cast<T>() exclusively for safe structure access
- GetBufferSpan(int offset, int length) helper for bounds-checked span slicing
PEFile (public API unchanged)
- EnsureRead() and zero-copy PEHeader constructor
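The progressive-read pattern mentioned in the description (read an initial 1024 bytes, then re-read if the header turns out to be larger) could be sketched roughly as follows. Everything here is hypothetical: ProgressiveHeaderRead, HeaderSizeProbe, and ReadAt are illustrative names, and the probe delegate stands in for whatever PEFile uses to compute Header.PEHeaderSize.

```csharp
using System;
using System.IO;

// Hypothetical callback that inspects the first bytes of the file and
// reports how many bytes the full header needs. A custom delegate is used
// because ReadOnlySpan<byte> is not usable as a Func<> type argument on
// older C# versions.
public delegate int HeaderSizeProbe(ReadOnlySpan<byte> firstBytes);

public static class ProgressiveHeaderRead
{
    // Reads initialSize bytes, asks the probe for the real header size,
    // and re-reads from the start only when the first read was too small.
    public static byte[] Read(Stream stream, HeaderSizeProbe probe, int initialSize = 1024)
    {
        byte[] buffer = ReadAt(stream, 0, initialSize);
        int needed = probe(buffer);
        if (needed > buffer.Length)
            buffer = ReadAt(stream, 0, needed);
        return buffer;
    }

    // Reads up to size bytes at position; returns a shorter array at end of file.
    private static byte[] ReadAt(Stream stream, long position, int size)
    {
        stream.Seek(position, SeekOrigin.Begin);
        byte[] buffer = new byte[size];
        int total = 0;
        while (total < size)
        {
            int n = stream.Read(buffer, total, size - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        if (total < size) Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

The common case costs one small read; only files with unusually large headers, such as the generated test file with imageHeaderOffset > 512, pay for a second one.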