Clean up some of the math intrinsic logic and support constant folding #47321

tannergooding · 2021-01-22T05:35:14Z

This cleans up the NI_System_Math_* related logic to make it easier to determine what entries were being missed by alphabetizing the IDs and adding an assert for the range.

As part of that, this ensures all the existing public Math "intrinsic" APIs (minus FusedMultiplyAdd and SinCos, which have special requirements) are going through CSE and enables constant folding for them when IsReadyToRun is false.
This was done under the premise that, for a given run these APIs are deterministic and it is only harmful to constant fold in AOT like scenarios where the CRT used to compile and that which is being run against might differ.

The PMI diffs for --frameworks shows some minimal gains:

Summary of Code Size diffs: (Lower is better) Total bytes of base: 51165736 Total bytes of diff: 51165724 Total bytes of delta: -12 (-0.00% of base) diff is an improvement. Top file regressions (bytes): 12 : System.Runtime.Numerics.dasm (0.02% of base) Top file improvements (bytes): -21 : System.Speech.dasm (-0.00% of base) -3 : Microsoft.VisualBasic.Core.dasm (-0.00% of base) 3 total files with Code Size differences (2 improved, 1 regressed), 267 unchanged. Top method regressions (bytes): 12 ( 1.39% of base) : System.Runtime.Numerics.dasm - Complex:Asin_Internal(double,double,byref,byref,byref) Top method improvements (bytes): -21 (-17.36% of base) : System.Speech.dasm - ConvertTextFrag:ScaleNumber(float,int,int):int -3 (-0.55% of base) : Microsoft.VisualBasic.Core.dasm - Financial:NPer(double,double,double,double,int):double Top method regressions (percentages): 12 ( 1.39% of base) : System.Runtime.Numerics.dasm - Complex:Asin_Internal(double,double,byref,byref,byref) Top method improvements (percentages): -21 (-17.36% of base) : System.Speech.dasm - ConvertTextFrag:ScaleNumber(float,int,int):int -3 (-0.55% of base) : Microsoft.VisualBasic.Core.dasm - Financial:NPer(double,double,double,double,int):double 3 total methods with Code Size differences (2 improved, 1 regressed), 263270 unchanged.

The win in System.Speech is due to a call being inlined:

- vmovsd qword ptr [rsp+20H], xmm0 - vmovsd xmm0, qword ptr [reloc @RWD08] - call Math:Log(double):double - vmovsd xmm1, qword ptr [rsp+20H] - vdivsd xmm0, xmm1, xmm0 + vdivsd xmm0, xmm0, qword ptr [reloc @RWD08]

The regression in System.Runtime.Numerics looks to be because Log was not previously CSE'd and so now there is a new temp introduced:

- vmovsd xmm6, qword ptr [rax] + vmovsd xmm1, qword ptr [rax] + vmovsd qword ptr [rsp+38H], xmm1 call Math:Log(double):double - vaddsd xmm6, xmm0, xmm6 + vmovsd xmm1, qword ptr [rsp+38H] + vaddsd xmm6, xmm1, xmm0

It isn't exactly clear why this is needed for this scenario, particularly when the value isn't reused again elsewhere (the managed code is: https://source.dot.net/#System.Runtime.Numerics/System/Numerics/Complex.cs,594)

We see larger gains in --benchmarks, this time constant folding methods like Exp due to values that are constant post inlining:

Total bytes of base: 505742 Total bytes of diff: 505475 Total bytes of delta: -267 (-0.05% of base) diff is an improvement. Top file improvements (bytes): -88 : Benchstones\BenchF\Trap\Trap\Trap.dasm (-10.73% of base) -59 : Benchstones\BenchF\Simpsn\Simpsn\Simpsn.dasm (-7.49% of base) -46 : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm (-3.61% of base) -40 : Benchstones\BenchF\Romber\Romber\Romber.dasm (-1.83% of base) -19 : Benchstones\BenchF\BenchMrk\BenchMrk\BenchMrk.dasm (-2.01% of base) -15 : Benchstones\BenchF\BenchMk2\BenchMk2\BenchMk2.dasm (-1.60% of base) 6 total files with Code Size differences (6 improved, 0 regressed), 76 unchanged. Top method improvements (bytes): -88 (-34.92% of base) : Benchstones\BenchF\Trap\Trap\Trap.dasm - Trap:Bench():bool -59 (-21.61% of base) : Benchstones\BenchF\Simpsn\Simpsn\Simpsn.dasm - Simpsn:Bench():bool -40 (-4.44% of base) : Benchstones\BenchF\Romber\Romber\Romber.dasm - Romber:Bench():bool -25 (-43.86% of base) : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm - NewtE:FX(double,double):double -21 (-24.42% of base) : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm - NewtE:F(double,double):double -19 (-4.19% of base) : Benchstones\BenchF\BenchMrk\BenchMrk\BenchMrk.dasm - BenchMrk:Bench():bool -15 (-3.36% of base) : Benchstones\BenchF\BenchMk2\BenchMk2\BenchMk2.dasm - BenchMk2:Bench():bool Top method improvements (percentages): -25 (-43.86% of base) : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm - NewtE:FX(double,double):double -88 (-34.92% of base) : Benchstones\BenchF\Trap\Trap\Trap.dasm - Trap:Bench():bool -21 (-24.42% of base) : Benchstones\BenchF\NewtE\NewtE\NewtE.dasm - NewtE:F(double,double):double -59 (-21.61% of base) : Benchstones\BenchF\Simpsn\Simpsn\Simpsn.dasm - Simpsn:Bench():bool -40 (-4.44% of base) : Benchstones\BenchF\Romber\Romber\Romber.dasm - Romber:Bench():bool -19 (-4.19% of base) : Benchstones\BenchF\BenchMrk\BenchMrk\BenchMrk.dasm - BenchMrk:Bench():bool -15 (-3.36% of base) : Benchstones\BenchF\BenchMk2\BenchMk2\BenchMk2.dasm - BenchMk2:Bench():bool

tannergooding · 2021-01-22T05:37:56Z

I broke this up by commit to make it easier to review.

No diffs for --tests

Total bytes of base: 139700628 Total bytes of diff: 139700628 Total bytes of delta: 0 (0.00% of base) 0 total files with Code Size differences (0 improved, 0 regressed), 3512 unchanged. 0 total methods with Code Size differences (0 improved, 0 regressed), 478331 unchanged.

Local tests for scenarios like Matrix4x4.CreateRotationX(MathF.PI / 2); // 90 degrees shows that it can now be fully optimized when inlined.

tannergooding · 2021-01-22T05:40:51Z

src/coreclr/jit/valuenum.cpp

It would be "nice" if these could call COMDouble::Acos, etc. However I think that's difficult due to layering and it wasn't obvious what would be needed to make that happen.

tannergooding · 2021-01-22T18:10:10Z

CC. @jkotas, @briansull, @dotnet/jit-contrib

src/coreclr/jit/importer.cpp

…Single and COMDouble types

…ed on IsTargetIntrinsic

…needed

…and SinCos

…is false

tannergooding · 2021-01-28T16:45:36Z

Rebased to pick up unrelated test fixes so CI will pass again

jkotas · 2021-01-28T16:52:03Z

The non-JIT part of change looks good to me.

AndyAyersMS · 2021-01-28T17:04:11Z

@briansull can you take a look at the jit changes?

briansull

I reviewed the JIT changes, Looks good

src/coreclr/jit/codegenxarch.cpp

ghost · 2021-01-29T19:24:35Z

Hello @tannergooding!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (`@msftbot`) and give me an instruction to get started! Learn more here.

tannergooding · 2021-01-29T19:24:41Z

Thanks for the reviews!

ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 22, 2021

tannergooding commented Jan 22, 2021

View reviewed changes

tannergooding force-pushed the math branch from c9de813 to 3cb2d04 Compare January 22, 2021 05:55

This was referenced Jan 22, 2021

Safe handle use after close in System.Net.Security.Tests #46770

Closed

ClientAndServer_OneOrBothUseDefault_Ok - fails with NullReferenceException #2353

Closed

danmoseley reviewed Jan 23, 2021

View reviewed changes

src/coreclr/jit/importer.cpp Outdated Show resolved Hide resolved

tannergooding force-pushed the math branch from 25c59c9 to 34bc1b0 Compare January 26, 2021 17:15

tannergooding added 10 commits January 28, 2021 08:37

Removing unecessary casts to float/double from the classlibnative COM…

230e3df

…Single and COMDouble types

Reordering NI_System_Math_* usages to make it easier to spot errors

86e6cd5

Removing unecessary support for Cos/Sin as "target intrinsics"

e2c5847

Adding asserts and changing target cost for math intrinsics to be bas…

7ba5da3

…ed on IsTargetIntrinsic

Don't include PAL math wrappers where no additional functionality is …

d99e691

…needed

Adding valuenum support for the remaining Math intrinsics, minus FMA …

eb84139

…and SinCos

Allow constant folding for all math intrinsics when not IsReadyToRun …

ce0afc2

…is false

Apply formatting patch

0f579b3

Adding THROW_DECL to match functions to satisfy GCC

5288742

noodes to nodes

9517552

tannergooding force-pushed the math branch from 34bc1b0 to 9517552 Compare January 28, 2021 16:45

Removing ScaleB named intrinsics

636529b

JulieLeeMSFT requested a review from briansull January 29, 2021 06:12

briansull approved these changes Jan 29, 2021

View reviewed changes

src/coreclr/jit/codegenxarch.cpp Outdated Show resolved Hide resolved

Update src/coreclr/jit/codegenxarch.cpp

9e6e928

tannergooding added the auto-merge label Jan 29, 2021

ghost merged commit e8f09ed into dotnet:master Jan 29, 2021

ghost locked as resolved and limited conversation to collaborators Feb 28, 2021

tannergooding deleted the math branch July 1, 2025 14:39

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Clean up some of the math intrinsic logic and support constant folding #47321

Clean up some of the math intrinsic logic and support constant folding #47321

Uh oh!

tannergooding commented Jan 22, 2021

tannergooding commented Jan 22, 2021 •

edited

Loading

tannergooding Jan 22, 2021

tannergooding commented Jan 22, 2021

Uh oh!

tannergooding commented Jan 28, 2021

jkotas commented Jan 28, 2021

AndyAyersMS commented Jan 28, 2021

briansull left a comment

Uh oh!

ghost commented Jan 29, 2021

tannergooding commented Jan 29, 2021

Labels

5 participants

Clean up some of the math intrinsic logic and support constant folding #47321

Clean up some of the math intrinsic logic and support constant folding #47321

Uh oh!

Conversation

tannergooding commented Jan 22, 2021

tannergooding commented Jan 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

tannergooding Jan 22, 2021

Choose a reason for hiding this comment

tannergooding commented Jan 22, 2021

Uh oh!

tannergooding commented Jan 28, 2021

jkotas commented Jan 28, 2021

AndyAyersMS commented Jan 28, 2021

briansull left a comment

Choose a reason for hiding this comment

Uh oh!

ghost commented Jan 29, 2021

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

tannergooding commented Jan 29, 2021

Labels

5 participants

tannergooding commented Jan 22, 2021 •

edited

Loading

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (`@msftbot`) and give me an instruction to get started! Learn more here.