ElementLocation optimisations #10029

drewnoakes · 2024-04-18T00:52:40Z

Null annotate file
Remove some redundant code
Consolidate validation
Reduce branching during construction
Documentation updates
Generally reduce the number of diagnostics displayed in the IDE in this file

For Int32, GetHashCode just returns the value directly.

The SmallElementLocation class exists because very few element locations require 32 bits to store the line/column values. It uses ushort (2 bytes) instead of int (4 bytes) for each value, in an attempt to reduce memory usage. However the CLR aligns ushort fields on classes at four-byte boundaries on most (all?) architectures, meaning the optimisation has no effect. This commit explicitly packs the two values into four bytes to ensure that four bytes is saved per instance.
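To illustrate what "explicitly packs the two values into four bytes" could look like, here is a minimal sketch. The type name `PackedLineColumn` and its members are hypothetical and are not taken from the PR; the thread below also concludes that the CLR already places adjacent `ushort` fields next to each other, so manual packing of this kind turned out to be unnecessary.

```csharp
using System;

// Hypothetical helper (not MSBuild's actual type): stores line and column
// as the two 16-bit halves of a single 32-bit field.
public readonly struct PackedLineColumn
{
    private readonly uint _packed;

    public PackedLineColumn(ushort line, ushort column)
    {
        // Line occupies the high 16 bits, column the low 16 bits.
        _packed = ((uint)line << 16) | (uint)column;
    }

    public int Line => (int)(_packed >> 16);

    public int Column => (int)(_packed & 0xFFFF);
}
```

Both values round-trip for the full `ushort` range, since each half is masked or shifted back out independently.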

The caller performs this validation already, and no other code can call this. Avoid some indirection and branching.

The compiler will generate slightly better code from this switch statement, in cases where either line or column is zero.

There was inconsistent handling of validation between implementations. This moves it all into the `Create` method so that it can be handled in one place, consistently.

ladipro · 2024-04-18T09:20:04Z

> However the CLR aligns ushort fields on classes at four-byte boundaries on most (all?) architectures, meaning the optimisation has no effect.

This is surprising. The alignment requirement of primitives tends to be their size, so two-byte integers will typically be aligned at two-byte boundaries. Inspecting the x64 NetFx MSBuild with SOS supports it:

On which architecture did you see them unnecessarily padded?

drewnoakes · 2024-04-18T09:56:42Z

Yes I was also surprised. I tested here, which is x64. Maybe the approach taken in that code is incorrect.

drewnoakes · 2024-04-18T10:03:08Z

Here's a diff between field positions that shows four bytes offset between consecutive ushort fields.

drewnoakes · 2024-04-18T10:16:49Z

> Inspecting the x64 NetFx MSBuild with SOS supports it:

The highlighted bit of your screenshot shows the string name field is two bytes, which looks suspicious to me.

ladipro · 2024-04-18T10:20:22Z

I believe the test gives misleading results because classes are always aligned at a pointer-size boundary. A class with two 2-byte integers will be padded with 4 extra bytes, making it use the same amount of space as a class with two 4-byte integers. If I tweak the test to use four fields instead of two, the sizes come out different.
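The rounding effect described above can be sketched as simple arithmetic. This is an estimate under stated assumptions, not a measurement: it assumes 16 bytes of per-object overhead on a 64-bit CLR (the figure quoted later in this thread) and sizes rounded up to an 8-byte boundary. The name `ObjectSizeEstimate` is hypothetical.

```csharp
using System;

public static class ObjectSizeEstimate
{
    // Assumptions for a 64-bit CLR: 16 bytes of per-object overhead and
    // object sizes rounded up to a pointer-size (8-byte) boundary.
    private const int ObjectOverhead = 16;
    private const int Alignment = 8;

    // fieldBytes is the total size of the instance fields, assuming the
    // fields themselves are laid out without internal padding.
    public static int HeapBytes(int fieldBytes)
    {
        int raw = ObjectOverhead + fieldBytes;
        return (raw + Alignment - 1) / Alignment * Alignment;
    }
}
```

Under these assumptions, two `ushort` fields (4 bytes) and two `int` fields (8 bytes) both come out at 24 bytes, whereas four `ushort` fields (8 bytes) give 24 and four `int` fields (16 bytes) give 32, which is why the two-field test could not show a difference.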

ladipro · 2024-04-18T10:21:07Z

> > Inspecting the x64 NetFx MSBuild with SOS supports it:
>
> The highlighted bit of your screenshot shows the string name field is two bytes, which looks suspicious to me.

I believe this output uses hex numbers.

drewnoakes · 2024-04-18T10:22:45Z

> Here's a diff between field positions that shows four bytes offset between consecutive ushort fields.

Oh I used the wrong type here. Changing it to C2 works correctly.

drewnoakes · 2024-04-18T10:25:50Z

Thank you for double checking. I've closed the original issue.

There are some changes here to reduce CPU as well which you might consider, given these seem to be highly used types/methods. I'll update the PR tomorrow. Let me know if any of the other changes seem problematic, and I'll back them out too.

danmoseley · 2024-04-18T20:49:35Z

(Sometimes felt like the MSBuild engine is basically string manipulation, file IO, map lookups, and caches...) These could certainly be cached, but then you have interning and cache invalidation to deal with, which has been challenging in several places, such as the XML cache and the original string interning cache.

drewnoakes · 2024-04-18T23:47:59Z

Pulling the 8-byte string reference out makes some sense. I wondered if we could find a way to remove it completely, if the locations are always used within the context of a document, or an element that can reach a document.

Converting to a struct is also possible, though we'd need to allocate the maximum size for line/column, as the small/regular approach used here relies on polymorphism which doesn't work with structs.

In both cases, I assumed we couldn't really consider them as possibilities given this is public API. If there's some leeway there then we can revisit.

The CLR does in fact pack these fields adjacent to one another, so we don't have to do this in code ourselves.

ladipro · 2024-04-19T07:29:02Z

Ah, that's true, they're public. Definitely limits the options but "compressing" the file reference should still be possible.

drewnoakes · 2024-04-19T09:00:08Z

If the string cache is a global singleton then yes it'd be straightforward. Would we consider such a cache? I think it'd be per-process, grow-only, with no eviction. The assumption being that there's a limited number of files involved in build operations.
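A minimal sketch of such a per-process, grow-only cache with no eviction follows. It uses a plain lock for simplicity, which is an assumption of this sketch rather than what the PR does; the name `FilePathIndex` and the ordinal comparer are likewise hypothetical choices.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-process cache: maps file paths to small integer indexes
// and back. Entries are never removed.
public static class FilePathIndex
{
    private static readonly object s_gate = new object();
    private static readonly Dictionary<string, int> s_indexByPath =
        new Dictionary<string, int>(StringComparer.Ordinal);
    private static readonly List<string> s_pathByIndex = new List<string>();

    public static int GetOrAdd(string path)
    {
        lock (s_gate)
        {
            if (!s_indexByPath.TryGetValue(path, out int index))
            {
                // New path: its index is its position in the list.
                index = s_pathByIndex.Count;
                s_pathByIndex.Add(path);
                s_indexByPath.Add(path, index);
            }

            return index;
        }
    }

    public static string GetPath(int index)
    {
        lock (s_gate)
        {
            return s_pathByIndex[index];
        }
    }
}
```

A location object would then store the `int` (or a narrower value when it fits) and call `GetPath` only when the file name is actually needed.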

If there's interest, I'll push an update to explore the idea.

ladipro · 2024-04-19T09:09:38Z

I think it depends on the impact. I would probably first measure how much we can save relative to all MSBuild allocations.

drewnoakes · 2024-04-19T10:20:54Z

dotnet new console temp
msbuild /getProperty:MyProperty temp.csproj

For evaluation of an empty console app, it looks like SmallElementLocation is just over 2% of allocated bytes.

We believe SmallElementLocation objects are currently 24 bytes each:

- 8 header
- 8 string reference
- 4 line
- 4 column

Instead, we could have 16 bytes to cover almost everything:

Small
- 8 header
- 2 bytes index
- 2 column
- 4 line

So roughly a 0.7% reduction for evaluation. But please check my numbers!

ladipro · 2024-04-19T12:08:28Z

Thank you. The overhead of being an object is 16 bytes on 64-bit so I think it would go from 32 to 24 bytes actually taken up on the GC heap but I guess the delta is the important number and that's still -8 bytes.
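The estimate can be reproduced with a few lines of arithmetic. This is only a sketch: it takes the "just over 2% of allocated bytes" figure from the measurement above at face value, and `SavingEstimate` is a hypothetical name.

```csharp
using System;

public static class SavingEstimate
{
    // share: fraction of all allocated bytes attributed to the type.
    // Returns the fraction of all allocated bytes saved by shrinking each
    // instance from bytesBefore to bytesAfter.
    public static double RelativeSaving(double share, int bytesBefore, int bytesAfter)
    {
        return share * (bytesBefore - bytesAfter) / bytesBefore;
    }
}
```

With a 2% share, going from 24 to 16 bytes gives about 0.67% (the "roughly 0.7%" above), while the corrected sizes of 32 and 24 bytes give 0.5%. The absolute saving is 8 bytes per instance either way.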

For evaluation, I suspect some of the allocations seen during command-line execution of MSBuild.exe are one-time initialization that wouldn't happen again on subsequent evaluations. So the relative reduction in e.g. VS scenarios would probably be slightly more. I'd say it's worth it - assuming that the implementation does not add too much complexity and doesn't slow down construction of these objects.

Adds new subtypes for `ElementLocation` that pack to multiples of eight bytes, to avoid wasting space at runtime on padding between instances of this class in memory. The primary gain here comes from being able to use a smaller value for the `File` value. With this change, there's a lock-free cache of file paths which are then stored by index. When the index is small, as it usually will be, it can be packed more efficiently (e.g. in 2 bytes) than a string reference (8 bytes on 64-bit architectures). See code comment for more details. Also remove file IO from unit tests so they run faster.

ladipro · 2024-04-23T06:08:27Z

src/Build/ElementLocation/ElementLocation.cs

-            // Line and column are good enough
-            return Line.GetHashCode() ^ Column.GetHashCode();
+            // We don't include the file path in the hash
+            return (Line * 397) ^ Column;


nit: This should be unchecked for perf and in very extreme cases also for correctness.
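A sketch of what the `unchecked` form might look like is shown below; `LocationHash.Combine` is a hypothetical stand-alone helper, not the method in the PR. In a project compiled with overflow checking enabled, the multiplication could otherwise throw `OverflowException` for large line numbers.

```csharp
using System;

public static class LocationHash
{
    // Combines line and column into a hash code. The unchecked block makes
    // the multiplication wrap on overflow regardless of compiler settings.
    public static int Combine(int line, int column)
    {
        unchecked
        {
            return (line * 397) ^ column;
        }
    }
}
```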

ladipro · 2024-04-23T06:28:38Z

src/Build/ElementLocation/ElementLocation.cs

+            // When combinedValue is negative, it implies that either line or column were negative.
+            ErrorUtilities.VerifyThrow(combinedValue > -1, "Use zero for unknown");
+
+            // TODO store the last run's value and check if this is for the same file. If so, skip the dictionary lookup (tree walk).


I wonder if storing the last file in a thread-static variable wouldn't amortize the cost of full lookup/add enough that we could use a conventional data structure with a simple lock.
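One way to read this suggestion is sketched below: a thread-static "last file" entry in front of an ordinary dictionary guarded by a lock. All names here (`CachedFileIndex`, `t_lastPath`, `t_lastIndex`) are hypothetical, and the sketch assumes that consecutive lookups on a thread are usually for the same file.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class CachedFileIndex
{
    private static readonly object s_gate = new object();
    private static readonly Dictionary<string, int> s_indexByPath =
        new Dictionary<string, int>(StringComparer.Ordinal);

    // Per-thread memo of the most recent lookup.
    [ThreadStatic] private static string? t_lastPath;
    [ThreadStatic] private static int t_lastIndex;

    public static int GetOrAdd(string path)
    {
        // Fast path: same file as the previous call on this thread, no lock.
        if (string.Equals(t_lastPath, path, StringComparison.Ordinal))
        {
            return t_lastIndex;
        }

        int index;
        lock (s_gate)
        {
            if (!s_indexByPath.TryGetValue(path, out index))
            {
                index = s_indexByPath.Count;
                s_indexByPath.Add(path, index);
            }
        }

        t_lastPath = path;
        t_lastIndex = index;
        return index;
    }
}
```

Because the memo is per thread, it needs no synchronization of its own; the lock is only taken when the file changes.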

JanKrivanek · 2025-01-14T11:59:54Z

Relevant for #11160 - as this has potential to improve incremental eval perf

Copilot

Pull Request Overview

This PR optimizes the ElementLocation class by implementing various memory and performance improvements. The changes focus on reducing object sizes, consolidating validation logic, and improving code maintainability through null annotations and better documentation.

- Introduces multiple specialized ElementLocation implementations with optimized memory layouts based on value ranges
- Consolidates validation logic and reduces branching during object construction
- Adds null annotations and removes redundant code throughout the files

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/Build/ElementLocation/XmlDocumentWithLocation.cs | Adds new constructor overload for unit testing with explicit file path |
| src/Build/ElementLocation/ElementLocation.cs | Major refactoring with new memory-optimized storage classes and file index caching system |
| src/Build.UnitTests/Construction/ElementLocation_Tests.cs | Updates unit tests to support new ElementLocation implementations and adds comprehensive test coverage |

Copilot · 2025-07-30T15:44:35Z

src/Build/ElementLocation/ElementLocation.cs

+                        while (array[array.Length - 1] is null)
+                        {
+                            Thread.SpinWait(100);
+                        }


This busy-wait loop with Thread.SpinWait(100) could lead to high CPU usage. Consider using a more efficient synchronization mechanism like SpinWait.SpinUntil() or a proper wait handle.
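For reference, a wait of this kind written with the `System.Threading.SpinWait` struct might look like the sketch below. `SlotWaiter.WaitForSlot` is a hypothetical helper, not code from the PR. `SpinOnce` starts with short busy spins and begins yielding the thread after repeated calls, so a long wait does not monopolize a core.

```csharp
#nullable enable
using System;
using System.Threading;

public static class SlotWaiter
{
    // Waits until another thread has published a non-null value into
    // array[index], then returns that value.
    public static string WaitForSlot(string?[] array, int index)
    {
        var spinner = new SpinWait();
        string? value;

        while ((value = Volatile.Read(ref array[index])) is null)
        {
            spinner.SpinOnce();
        }

        return value;
    }
}
```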

Copilot · 2025-07-30T15:44:35Z

src/Build/ElementLocation/ElementLocation.cs

+                        }
+
+                        // Otherwise, loop around again. We can't just return exchanged here,
+                        // as theoretically the array might have been grown more than once.


The infinite while loop without any backoff strategy could lead to excessive CPU usage during contention. Consider adding a yield or exponential backoff to reduce spinning overhead.

Suggested change

-                        // as theoretically the array might have been grown more than once.
+                        // as theoretically the array might have been grown more than once.
+                        Thread.Sleep(backoff);
+                        backoff = Math.Min(backoff * 2, 100); // Exponential backoff with a cap

Copilot · 2025-07-30T15:44:35Z

src/Build/ElementLocation/ElementLocation.cs

+                // Data race! Spin.
+                array = Volatile.Read(ref s_fileByIndex);
            }
+
+            return array[index];
        }



Another busy-wait loop that could consume excessive CPU. The comment acknowledges this is a data race, but the mitigation strategy is inefficient. Consider using proper synchronization or at least adding a yield between iterations.

Copilot · 2025-07-30T15:44:36Z

src/Build/ElementLocation/ElementLocation.cs

+            // TODO store the last run's value and check if this is for the same file. If so, skip the dictionary lookup (tree walk).
+            int fileIndex = GetOrAddFileIndex(filePath);
+


[nitpick] This TODO comment indicates a known performance optimization opportunity. The current implementation performs dictionary lookups for every file access, which could be expensive for frequently accessed files.

Suggested change

-            // TODO store the last run's value and check if this is for the same file. If so, skip the dictionary lookup (tree walk).
-            int fileIndex = GetOrAddFileIndex(filePath);
+            // Cache the last accessed file path and its index to optimize repeated accesses.
+            if (_lastFilePath == filePath)
+            {
+                return _lastFileIndex;
+            }
+            int fileIndex = GetOrAddFileIndex(filePath);
+            // Update the cache with the current file path and index.
+            _lastFilePath = filePath;
+            _lastFileIndex = fileIndex;

SimaTian

I'm comfortable with the packing section - the logic there is clear and understandable.
I'm uncertain about the synchronization part - I currently don't see which parts need to be synchronized, why, or what the implications are.

AR-May

We looked at the PR together with @drewnoakes and left some comments for future investigations/fixes.

I created exp/ and perf/ branches for this PR, let's check the performance impact in the current state.

AR-May · 2025-09-25T09:54:56Z

src/Build/ElementLocation/ElementLocation.cs

            {
-                locationString = file;
+                // At least one value needs int.
+                if (fileIndex <= short.MaxValue && column <= short.MaxValue)


short -> ushort?
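The difference between the two bounds can be shown with a small range check. This is a sketch with a hypothetical helper name: `short.MaxValue` is 32767, while a two-byte unsigned field can hold values up to `ushort.MaxValue` (65535), so comparing against `short.MaxValue` rejects about half of the values that would fit.

```csharp
using System;

public static class RangeChecks
{
    // True when value fits in an unsigned 16-bit field (0..65535).
    // Casting to uint turns negative inputs into large values, so a single
    // comparison covers both bounds.
    public static bool FitsInUInt16(int value)
    {
        return unchecked((uint)value) <= ushort.MaxValue;
    }
}
```

For example, 40000 passes `FitsInUInt16` but would fail a `<= short.MaxValue` test.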

AR-May · 2025-09-25T09:55:43Z

src/Build/ElementLocation/ElementLocation.cs

+            // FullElementLocation          24        8 (max 2b)      8 (max 2b)      8 (max 2b)
+
+            // Check for empty first
+            if (fileIndex is 0 && line is 0 && column is 0)


This condition is potentially redundant, as there is a similar one above. We should check for that.

If it is not redundant, combinedValue could be combined with the file index earlier, so that we only need one check for zero.

AR-May · 2025-09-25T10:03:30Z

src/Build/ElementLocation/ElementLocation.cs

-                this.line = line;
-                this.column = column;
-            }
+                    _ = ImmutableInterlocked.TryAdd(ref s_indexByFile, file, index);


We need to think again about the case where the addition is not successful. If the addition is not successful, we should look up and return the correct index of the existing element.
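A minimal sketch of handling the failed-add case follows. It is not the PR's code: `FileIndexMap`, `s_nextIndex` and the way a candidate index is reserved are assumptions made for illustration. It relies on the map being grow-only, so that a failed `TryAdd` implies the key is already present.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading;

public static class FileIndexMap
{
    private static ImmutableDictionary<string, int> s_indexByFile =
        ImmutableDictionary.Create<string, int>(StringComparer.Ordinal);
    private static int s_nextIndex;

    public static int GetOrAddIndex(string file)
    {
        if (Volatile.Read(ref s_indexByFile).TryGetValue(file, out int existing))
        {
            return existing;
        }

        // Reserve a candidate index. If another thread wins the race below,
        // this candidate is simply left unused (an assumption of the sketch).
        int candidate = Interlocked.Increment(ref s_nextIndex) - 1;

        if (ImmutableInterlocked.TryAdd(ref s_indexByFile, file, candidate))
        {
            return candidate;
        }

        // Another thread added the file first: return its index, not ours.
        return Volatile.Read(ref s_indexByFile)[file];
    }
}
```

The trade-off is that a lost race leaves a gap in the index sequence; any array indexed by these values would need to tolerate unused slots.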

AR-May · 2025-09-25T10:15:06Z

src/Build/ElementLocation/ElementLocation.cs

+            while (index >= array.Length)
            {
-                get { return column; }
+                // Data race! Spin.
+                array = Volatile.Read(ref s_fileByIndex);
            }


The only way we could have an index here is for that index to have been committed to the array, so we don't need to worry about a data race here. Any bounds check failure would be a program error, not a data race.

AR-May · 2025-09-25T10:18:18Z

src/Build/ElementLocation/ElementLocation.cs

+
+                        // Need to grow the array
+
+                        // Wait for the last value to be non-null, so that we have all values to copy


There's a race here. It's not sufficient to assume that the writes happen in order. Even if the last element is written, some of the previous entries might not be updated yet. We should instead have an atomically updated count of how many we've written and check that.
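The "atomically updated count" idea can be sketched as follows. To keep the example small, writers are serialized with a lock, which is a simplification and differs from the lock-free approach in the PR; only readers are lock-free. The type name `AppendOnlyTable` is hypothetical.

```csharp
#nullable enable
using System;
using System.Threading;

public sealed class AppendOnlyTable
{
    private readonly object _gate = new object();
    private string[] _items = new string[4];
    private int _count; // number of entries that are fully written

    public int Add(string value)
    {
        lock (_gate)
        {
            int index = _count;
            string[] items = _items;

            if (index == items.Length)
            {
                // Every entry below _count is complete, so copying is safe.
                var bigger = new string[items.Length * 2];
                Array.Copy(items, bigger, index);
                Volatile.Write(ref _items, bigger);
                items = bigger;
            }

            items[index] = value;
            // Publish the entry only after it has been written.
            Volatile.Write(ref _count, index + 1);
            return index;
        }
    }

    public bool TryGet(int index, out string? value)
    {
        // Read the count first: an index below it is guaranteed to be
        // present in the current array or any later one.
        if (index >= 0 && index < Volatile.Read(ref _count))
        {
            value = Volatile.Read(ref _items)[index];
            return true;
        }

        value = null;
        return false;
    }
}
```

Readers never inspect individual slots for null; they trust only the published count, which avoids assuming that slot writes become visible in order.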

AR-May · 2025-09-25T10:24:40Z

src/Build/ElementLocation/ElementLocation.cs

            return Create(file, 0, 0);
        }

+        private static string[] s_fileByIndex = new string[32];


We may consider increasing this number to 512.

AR-May · 2025-09-25T10:26:57Z

src/Build/ElementLocation/ElementLocation.cs

+            // than that threshold.
+
+            // Handle cases that fit in 0xFF and 0XFFFF.
+            if (combinedValue <= byte.MaxValue)


We should collect some stats on how much each element location class is used. This will help us figure out which conditions should go first in the code below.

drewnoakes added 5 commits April 18, 2024 09:42

Remove redundant GetHashCode calls

776ec7a

For Int32, GetHashCode just returns the value directly.

Remove redundant null check

37def11

Update comment

cb6ec14

Null annotate ElementLocation

093bee7

drewnoakes added Area: Performance Performance-Scenario-General labels Apr 18, 2024

drewnoakes added 13 commits April 18, 2024 11:04

Remove redundant validation

bb30e4f

The caller performs this validation already, and no other code can call this. Avoid some indirection and branching.

Simplify LocationString construction

57e0d5b

The compiler will generate slightly better code from this switch statement, in cases where either line or column is zero.

Consolidate validation

80c1ea2

There was inconsistent handling of validation between implementations. This moves it all into the `Create` method so that it can be handled in one place, consistently.

Simplify names (IDE0001)

3ea9a86

Use auto properties

075ce08

Use inheritdoc to avoid duplication

2f1c078

Make field readonly

28cfb19

Use constants

2acc628

Reduce branching when testing line/column values

596c574

Use pattern matching

d461522

Use standard API doc prefix

5ef2a4b

Use primary constructor

2b81d60

Seal private nested classes

fc82d47

drewnoakes force-pushed the element-location-perf branch from 92e2897 to fc82d47 Compare April 18, 2024 02:20

drewnoakes marked this pull request as draft April 18, 2024 10:25

drewnoakes added 2 commits April 19, 2024 09:49

Improve hash function

be277f6

Revert field packing

2653348

The CLR does in fact pack these fields adjacent to one another, so we don't have to do this in code ourselves.

drewnoakes marked this pull request as ready for review April 19, 2024 05:07

Inline field

efb2f9a

drewnoakes added 8 commits April 22, 2024 12:31

Reset the file index before running ElementLocation tests

e5ab818

Simplify test a bit

c47008b

Add test that shows file index packing

be049a8

Add comment

3fb49a8

More info in assertion message

c3de378

Fix assertion

a3af7bf

Update test code

eeed871

ladipro reviewed Apr 23, 2024

rainersigwald added this to the VS 17.13 milestone Sep 11, 2024

maridematte removed the Performance-Scenario-General label Jun 2, 2025

ghost assigned AR-May Jul 7, 2025

Merge branch 'main' into element-location-perf

52cad1e

Copilot AI review requested due to automatic review settings July 30, 2025 15:43

Copilot AI reviewed Jul 30, 2025

SimaTian reviewed Jul 31, 2025

Merge branch 'main' into element-location-perf

a37dcd4

AR-May reviewed Sep 25, 2025
