JIT emits unnecessary movsxd instructions when calling into Span indexer #12218

@GrabYourPitchforks

Description

When passing a non-constant value into the Span<T> and ReadOnlySpan<T> indexer, the JIT will emit an unnecessary movsxd instruction on x64. The repro is fairly simple:

for (int i = 0; i < ints.Length; i++)
{
    retVal += ints[i];
}

Current codegen:

00007ffd`2a2e7291 85c9            test    ecx,ecx
00007ffd`2a2e7293 7e0f            jle     <AFTER_LOOP>
00007ffd`2a2e7295 4d63d1          movsxd  r10,r9d
00007ffd`2a2e7298 42030492        add     eax,dword ptr [rdx+r10*4]
00007ffd`2a2e729c 41ffc1          inc     r9d
00007ffd`2a2e729f 443bc9          cmp     r9d,ecx
00007ffd`2a2e72a2 7cf1            jl      00007ffd`2a2e7295

I prototyped the change below in a local branch, modifying the logic in importer.cpp so that the span indexer zero-extends the index instead of sign-extending it, and then ran a benchmark. The modified code took approximately one-third less time to run. This optimization may be worth investigating if we believe that developers iterate over spans in hot loops. (Admittedly, any more complex logic within the loop would almost certainly overwhelm these benchmark results.)

            // Element access
            GenTree*             indexIntPtr = gtNewCastNode(TYP_U_IMPL, indexClone, true /* fromUnsigned */, TYP_U_IMPL);   // <-- modified line
            GenTree*             sizeofNode  = gtNewIconNode(elemSize);
            GenTree*             mulNode     = gtNewOperNode(GT_MUL, TYP_U_IMPL, indexIntPtr, sizeofNode);   // <-- modified line

| Method  | Toolchain | SpanLength | Mean          | Error       | StdDev      | Ratio | RatioSD |
|---------|-----------|------------|---------------|-------------|-------------|-------|---------|
| SumInts | baseline  | 48         | 2,921.32 us   | 53.786 us   | 44.914 us   | 1.00  | 0.00    |
| SumInts | modified  | 48         | 1,964.96 us   | 38.825 us   | 43.154 us   | 0.67  | 0.02    |
| SumInts | baseline  | 512        | 35,429.46 us  | 574.800 us  | 537.669 us  | 1.00  | 0.00    |
| SumInts | modified  | 512        | 23,219.06 us  | 457.335 us  | 698.398 us  | 0.67  | 0.02    |
| SumInts | baseline  | 2048       | 139,664.62 us | 1,799.241 us | 1,683.011 us | 1.00 | 0.00    |
| SumInts | modified  | 2048       | 93,175.18 us  | 1,838.916 us | 3,586.665 us | 0.66 | 0.04    |
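
To illustrate why swapping sign extension for zero extension should be safe here, the following C++ sketch compares the two widenings of a 32-bit index. It assumes the index has already passed the span's bounds check before the element address is computed. The helper names (`sign_extend`, `zero_extend`, `in_bounds`, `sum_ints`) are hypothetical and are not taken from the JIT sources. For any index that passes the check, both widenings give the same 64-bit value. On x64, a write to a 32-bit register already clears the upper 32 bits, so the zero-extended form typically needs no extra instruction.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit index the way movsxd does: copy the sign bit into the upper 32 bits.
static std::uint64_t sign_extend(std::int32_t index) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(index));
}

// Widen a 32-bit index by filling the upper 32 bits with zeros.
static std::uint64_t zero_extend(std::int32_t index) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(index));
}

// Hypothetical stand-in for a span bounds check: a single unsigned comparison
// rejects both negative indices and indices >= length (length is non-negative).
static bool in_bounds(std::int32_t index, std::int32_t length) {
    return static_cast<std::uint32_t>(index) < static_cast<std::uint32_t>(length);
}

// Mirrors the loop in the repro: every index used is in bounds, so the
// zero-extended index addresses the same element as the sign-extended one.
static std::int32_t sum_ints(const std::int32_t* data, std::int32_t length) {
    std::int32_t total = 0;
    for (std::int32_t i = 0; i < length; i++) {
        assert(in_bounds(i, length));
        assert(sign_extend(i) == zero_extend(i));
        total += data[zero_extend(i)];
    }
    return total;
}
```

The two widenings differ only for negative indices (for example, `-1`), and those never reach the element access because the bounds check rejects them first.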

/cc @dotnet/jit-contrib

category:cq
theme:basic-cq
skill-level:expert
cost:medium
impact:medium

Metadata

Assignees

No one assigned

    Labels

    area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
    enhancement: Product code improvement that does NOT require public API changes/additions
    optimization
