Skip to content

Conversation

@tannergooding
Copy link
Member

This resolves #122272

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 9, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BoyBaykiller
Copy link

This existing transform to remove double negation doesn't seem to apply here.

// Remove double negation/not.
if (op1->OperIs(oper))
{
JITDUMP("Remove double negation/not\n")
GenTree* op1op1 = op1->gtGetOp1();
DEBUG_DESTROY_NODE(tree);
DEBUG_DESTROY_NODE(op1);
return op1op1;
}

Probably some ordering issue?

double NegateMulTest(double num)
{
    num *= -1;
    num *= -1;
    num *= -1;

    return num;
}
G_M43396_IG02:  ;; offset=0x0000
       vxorps   xmm0, xmm0, xmmword ptr [reloc @RWD00]
       vxorps   xmm0, xmm0, xmmword ptr [reloc @RWD00]
       vxorps   xmm0, xmm0, xmmword ptr [reloc @RWD00]

Meanwhile:

double NegateMulOld(double num)
{
    num = -num;
    num = -num;
    num = -num;

    return num;
}
G_M22709_IG02:  ;; offset=0x0000
       vxorps   xmm0, xmm0, xmmword ptr [reloc @RWD00]

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements additional floating-point and SIMD morph optimizations by moving and extending transformation logic. The changes include optimizations for division-to-multiplication conversion (when divisor is a power of 2), multiplication by -1 (converting to negation), and multiplication by 2 (converting to addition).

Key changes:

  • Relocated the scalar floating-point division-to-multiplication optimization from early in GT_DIV case to after GT_NEG transformation, ensuring it applies the reciprocal optimization correctly after sign transformations
  • Added new scalar floating-point optimizations for multiplication by -1 and 2 in fgOptimizeMultiply
  • Extended SIMD GT_DIV and GT_MUL cases with similar optimizations including division-to-multiplication for power-of-2 divisors, multiplication by -1 to negation, and multiplication by 2 to addition
Comments suppressed due to low confidence (1)

src/coreclr/jit/morph.cpp:10561

  • The DEBUG_DESTROY_NODE is missing for the original mul node. This case creates a new add node and returns it but doesn't destroy the original multiplication node, similar to the issue on line 9846. For consistency with line 10535 which properly destroys mul when returning op1, this case should also include DEBUG_DESTROY_NODE(mul) before returning add.
            DEBUG_DESTROY_NODE(op2);
            op2          = fgMakeMultiUse(&op1);
            GenTree* add = gtNewOperNode(GT_ADD, mul->TypeGet(), op1, op2);
            add->SetMorphed(this, /* doChildren */ true);
            return add;

@tannergooding
Copy link
Member Author

tannergooding commented Dec 11, 2025

This existing transform to remove double negation doesn't seem to apply here.

Yes, there's a lack of attempting to "re-morph" here once the new operation is recognized. I added in a "hack" specifically to handle this case. Ideally there's a larger refactoring of fgMorphSmpOp to make this easier to reuse, much as exists for fgMorphHWIntrinsic

(re-morph is in quotes because we don't actually try to morph twice and there's various transforms we don't want to repeat, but there is some post-order processing we can defer back to after the oper type changes if we have the right helper methods).

@tannergooding
Copy link
Member Author

tannergooding commented Dec 11, 2025

CC. @dotnet/jit-contrib, @EgorBo

No TP regression and some minor positive diffs: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1227641&view=ms.vss-build-web.run-extensions-tab

Few simplifications for each case added, such as negating instead of multiplying or adding instead of multiplying by two.

There does look to be a missing AVX-512 containment opportunity around vxorps xmm0, xmm1, xmmword ptr [reloc] which could be vxorps xmm0, xmm1, dword ptr [reloc] {1to4} instead, but that's unrelated to this PR

There are also a couple small size regressions on x64 due to different register selection changing codegen output minimally or causing an extra prologue spill and epilogue restore. There's also one case where some code is doing division and multiplication by the same power of 2 value and so converting to multiplication by a reciprocal causes an extra constant to exist and be used. Handling this is probably difficult without some kind of interplay with CSE/VN to see number of uses of a given value (so a decision can be made on whether its more expensive to do a second load from L1 or to spend extra cycles doing division). -- With the cases in the diff all still being profitable ones, so the PR looks good overall

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of nits, but fine as is.

@tannergooding tannergooding merged commit 4d88303 into dotnet:main Jan 7, 2026
123 checks passed
@tannergooding tannergooding deleted the fix-122272 branch January 7, 2026 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[JIT] Reduce FP multiplication by -1 to plain negation

3 participants