This ports DoubleToNum and supporting code to be a managed implementation. #19999

tannergooding · 2018-09-17T08:29:05Z

This ports the rest of the formatting code to managed code. It is currently a mostly naive port of the native code, with only a few minor fix-ups.

I ran the Roslyn RealParser suite locally and also ran a basic benchmark on both float and double, covering 267,386,880 values across the input range (including both denormal and normal inputs).

Benchmarking was done with Tiered Jitting disabled.

For the Roslyn Real Parser Tests

This was mostly to validate that I didn't mess up any code during the port. Their own parsing code accounts for most of the time in each run, but overall this shows a 1.52% regression in elapsed time.

[Native and Managed results: images not preserved]

For float.ToString() on 267,386,880 values between float.MinValue and float.MaxValue

Grisu3.DigitGen dominates the time in both runs. This showed a 9.89% regression in elapsed time. float.ToString() defaults to 7 digits.

[Native and Managed results: images not preserved]

For double.ToString() on 267,386,880 values between double.MinValue and double.MaxValue

Grisu3.DigitGen again takes the most time here. This showed a 16.43% regression in elapsed time. double.ToString() defaults to 15 digits.

[Native and Managed results: images not preserved]

tannergooding · 2018-09-17T08:36:04Z

Looking at the perf gaps, some come from things the C++ compiler gives you for free that we don't have enabled yet (like a combined DivMod), some from places where the native code uses a more optimized routine (like memcpy and memset), and some from intrinsics that we don't have yet (like BitScanReverse).

These should, however, be easy enough to fix up.

For example:

[Native and Managed comparison: images not preserved]

tannergooding · 2018-09-17T08:37:03Z

src/System.Private.CoreLib/shared/System/Number.BigInteger.cs

+ _length = (upper == 0) ? 1 : 2;
+ }
+
+ public static uint BitScanReverse(uint mask)


Native code just calls the intrinsic instead.
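
To illustrate what a managed fallback has to do when no intrinsic is available, here is a minimal sketch of a software bit scan. It is not the code from this PR; the class name `BitOps` and the choice to return 0 for a zero mask are assumptions made for the example (the hardware instruction leaves that case undefined).

```csharp
using System;

public static class BitOps
{
    // Returns the zero-based index of the highest set bit in mask.
    // For mask == 0 this sketch returns 0; that choice is an assumption.
    public static uint BitScanReverse(uint mask)
    {
        uint index = 0;

        // Binary search over the bit positions: halve the window each step.
        if ((mask & 0xFFFF0000) != 0) { mask >>= 16; index += 16; }
        if ((mask & 0xFF00) != 0) { mask >>= 8; index += 8; }
        if ((mask & 0xF0) != 0) { mask >>= 4; index += 4; }
        if ((mask & 0xC) != 0) { mask >>= 2; index += 2; }
        if ((mask & 0x2) != 0) { index += 1; }

        return index;
    }
}
```

This takes five compare-and-shift steps where the native intrinsic is typically a single instruction, which is the kind of gap described above.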

tannergooding · 2018-09-17T08:37:58Z

src/System.Private.CoreLib/shared/System/Number.BigInteger.cs

+ private int _length;
+ private BlocksBuffer _blocks;
+
+ public BigInteger(uint value)


Native code can configure the default constructor to efficiently initialize this, rather than requiring us to special-case value == 0.

tannergooding · 2018-09-17T08:41:11Z

src/System.Private.CoreLib/shared/System/Number.BigInteger.cs

+ _length += (int)(blocksToShift);
+
+ // Zero the remaining low blocks
+ Buffer.ZeroMemory((byte*)(_blocks.GetPointer()), (blocksToShift * sizeof(uint)));


This was not being done properly in native code (size was blocksToShift, rather than blocksToShift * sizeof(uint)).

Without Grisu3 enabled as the first code-path, the Roslyn RealParser tests exposed this issue.
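
The byte-count versus element-count distinction can be shown in safe code. The following is only a sketch under assumptions: it uses a `uint[]` and `MemoryMarshal.AsBytes` rather than the pointer-based `Buffer.ZeroMemory` call in the diff, and the names `BlockShift` and `ShiftLeftBlocks` are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BlockShift
{
    // Shifts the first `length` blocks up by `blocksToShift` positions
    // and zeroes the vacated low blocks. Assumes the array is large enough.
    public static void ShiftLeftBlocks(uint[] blocks, ref int length, int blocksToShift)
    {
        // Array.Copy counts elements and handles overlapping ranges.
        Array.Copy(blocks, 0, blocks, blocksToShift, length);
        length += blocksToShift;

        // A byte-oriented clear needs a byte count:
        // blocksToShift * sizeof(uint), not blocksToShift.
        Span<byte> bytes = MemoryMarshal.AsBytes(blocks.AsSpan());
        bytes.Slice(0, blocksToShift * sizeof(uint)).Clear();
    }
}
```

Passing `blocksToShift` alone as the byte count would clear only a quarter of the low blocks and leave stale data in the rest.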

tannergooding · 2018-09-17T08:41:59Z

src/System.Private.CoreLib/shared/System/Number.DiyFp.cs

+
+ public void Multiply(ref DiyFp rhs)
+ {
+ ulong lf = _f;


This is just a 64x64=128 multiply, so using an intrinsic is possible.
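
For reference, a 64x64=128 multiply that keeps only the rounded upper 64 bits is commonly written with four 32x32 partial products when no intrinsic is available. The sketch below shows that general technique; it is not claimed to match the method in the diff, and the names `Mul128` and `MultiplyHighRounded` are hypothetical.

```csharp
using System;

public static class Mul128
{
    // Returns the upper 64 bits of the 128-bit product x * y,
    // rounded half up based on the discarded lower 64 bits.
    public static ulong MultiplyHighRounded(ulong x, ulong y)
    {
        const ulong Mask32 = 0xFFFFFFFF;

        ulong a = x >> 32;      // high half of x
        ulong b = x & Mask32;   // low half of x
        ulong c = y >> 32;      // high half of y
        ulong d = y & Mask32;   // low half of y

        ulong ac = a * c;
        ulong bc = b * c;
        ulong ad = a * d;
        ulong bd = b * d;

        // Sum the parts that land in bits 32..63 of the product, then add
        // 2^31 so the carry out of this sum rounds the discarded low half.
        ulong tmp = (bd >> 32) + (ad & Mask32) + (bc & Mask32);
        tmp += 1UL << 31;

        return ac + (ad >> 32) + (bc >> 32) + (tmp >> 32);
    }
}
```

An intrinsic would replace the four multiplies and the carry bookkeeping with a single widening multiply.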

tannergooding · 2018-09-17T08:44:53Z

src/System.Private.CoreLib/shared/System/Number.Grisu3.cs

+ decimalExponent = CachedPowerDecimalExponents[index];
+ }
+
+ private static bool DigitGen(ref DiyFp mp, int precision, char* digits, out int length, out int k)


This is the function that is causing the most perf diff between the managed and native implementations.

We could likely close the gap by caching fields/properties that we look up multiple times in locals, and by using Math.DivRem (or better, a proper intrinsic) rather than separate divide and remainder operations.
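
As a rough illustration of the Math.DivRem suggestion, the sketch below peels decimal digits off an integer with one combined call per digit instead of a separate `/` and `%`. It is a simplified stand-in, not DigitGen itself; `DigitExtraction` and `WriteDigits` are hypothetical names and the input is assumed to be non-negative.

```csharp
using System;

public static class DigitExtraction
{
    // Writes the decimal digits of a non-negative value into buffer,
    // most significant first, and returns the number of digits written.
    public static int WriteDigits(int value, char[] buffer)
    {
        // Find the largest power of ten that does not exceed value.
        int divisor = 1;
        while (value / divisor >= 10)
        {
            divisor *= 10;
        }

        int count = 0;
        int remaining = value;

        while (divisor > 0)
        {
            // One call yields both the quotient (the digit) and the remainder.
            int digit = Math.DivRem(remaining, divisor, out remaining);
            buffer[count++] = (char)('0' + digit);
            divisor /= 10;
        }

        return count;
    }
}
```

Whether this actually beats separate operators depends on how the JIT compiles Math.DivRem, which is why a proper intrinsic is mentioned as the better option.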

tannergooding · 2018-09-17T08:46:58Z

bcltype/bignum.cpp was ported as part of this, which will make it easier to test porting the Roslyn RealParser.

tannergooding · 2018-09-17T17:32:21Z

After a small cleanup to DigitGen, the perf regression (for doubles) is down to 11.18% (from 16.43%)

tannergooding · 2018-09-17T17:33:29Z

CC. @danmosemsft, @jkotas

danmoseley · 2018-09-17T17:50:32Z

Do you propose we commit this as is or work to close the gap further? I suggest the latter?

jkotas · 2018-09-17T17:53:54Z

src/System.Private.CoreLib/shared/System/Number.BigInteger.cs

+ }
+
+ [StructLayout(LayoutKind.Sequential, Size = (MaxBlockCount * sizeof(uint)))]
+ private struct BlocksBuffer


fixed uint _buffer[MaxBlockCount] ?

Won't that cause a perf hit due to UnsafeValueTypeAttribute?

You can measure it.

You are using unsafe stack-allocated buffers, so you do want to pay for the GS cookie checks. I assume that the C/C++ code paid for them too.

We should move to a C# 7.3 compiler version (we are still on 2.8.0-beta4 right now) so that we can index a fixed-buffer without pinning: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-7-3#indexing-fixed-fields-does-not-require-pinning

dotnet/buildtools#2163

This is blocked more generally on: #19878 (comment).

jkotas · 2018-09-17T18:11:38Z

> Do you propose we commit this as is or work to close the gap further

The regression is relatively small (~10%, depending on how you measure it). I think it is fine to take this into master since it moves us in the right direction and unblocks other work. (It also pays back the shortcut that was taken earlier, when the faster floating-point formatting algorithms were implemented.)

tannergooding · 2018-09-17T19:58:32Z

Fixing up Buffer::ZeroMemory (which is currently a naive while loop) brings it down to 64.579s (or a 5.84% regression).
-- This can be a separate PR.

danmoseley · 2018-09-17T20:59:06Z

> I think it is fine to take this into master since this moves us in the right direction

sounds good to me

danmoseley · 2018-09-17T21:34:18Z

BitScanReverse and BitScanReverse64 are now dead in pal.h

tannergooding · 2018-09-17T22:27:16Z

> BitScanReverse and BitScanReverse64 are now dead in pal.h

Removed.

tannergooding · 2018-09-17T22:29:38Z

The current numbers, with everything currently up in this PR (5.68% regression):

tannergooding · 2018-09-17T22:31:11Z

Everything except for moving BigInteger to fixed-size buffers has been addressed. Moving to fixed-size buffers is just pending a BuildTools update to move us to C# 7.3 (so we don't have to scatter fixed statements around the code, or add "hacks" to work around it).

danmoseley · 2018-09-17T22:44:05Z

@tannergooding what do you plan to do about tests? You found at least one bug we weren't catching. Does it make sense to port the Roslyn ones into CoreFX?

tannergooding · 2018-09-17T22:53:27Z

> what do you plan to do about tests? You found at least one bug we weren't catching. Does it make sense to port the Roslyn ones into CoreFX?

Until parsing is also fixed, we can't port the Roslyn test-bed directly. I'm working on pulling out some of the tests, however, so that we do have more coverage on important areas.

tannergooding · 2018-09-19T20:16:32Z

@jkotas, are you fine with this being merged and with me logging a bug/following up with a second PR to move to fixed-sized buffers after the build-tools change (bringing in C# 7.3 support) goes in, or would you rather this just wait and be done all at once?

jkotas · 2018-09-19T20:59:20Z

I would add an unused fixed field to struct BlocksBuffer so that it gets marked properly as a structure containing unchecked buffers.

Otherwise, I am fine with this being merged and with logging a bug and following up with a second PR later.

tannergooding · 2018-09-19T21:37:38Z

> I would add an unused fixed field to struct BlocksBuffer so that it gets marked properly as a structure containing unchecked buffers.

Fixed.

jkotas · 2018-09-19T22:30:27Z

src/System.Private.CoreLib/shared/System/Number.Formatting.cs

+
+ private static long ExtractFractionAndBiasedExponent(double value, out int exponent)
+ {
+ long fraction = GetMantissa(value);


You are likely losing some cycles by converting double to long twice in a row.

Fixed.
-- I wish we had the ability to specify a function was const so that the JIT could just do the right/expected thing here.
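
A minimal sketch of the single-conversion approach: read the bits of the double once, then derive both the fraction and the exponent from that one value. This is an illustration under assumptions rather than the code that was merged; `DoubleBits` and `Decompose` are hypothetical names, and the exponent returned here is adjusted so that value == fraction * 2^exponent for finite positive inputs.

```csharp
using System;

public static class DoubleBits
{
    // Splits a double into an integer fraction and a power-of-two exponent
    // using a single double-to-long bit conversion.
    public static long Decompose(double value, out int exponent)
    {
        long bits = BitConverter.DoubleToInt64Bits(value); // converted once

        long fraction = bits & 0x000FFFFFFFFFFFFF;          // low 52 bits
        exponent = (int)(bits >> 52) & 0x7FF;               // 11 exponent bits

        if (exponent != 0)
        {
            // Normal value: restore the implicit leading bit, then remove
            // the bias (1023) plus the 52 fraction bits.
            fraction |= 1L << 52;
            exponent -= 1075;
        }
        else
        {
            // Denormal value: no implicit bit, fixed minimum exponent.
            exponent = -1074;
        }

        return fraction;
    }
}
```

For example, 1.0 decomposes into a fraction of 2^52 with an exponent of -52.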

tannergooding · 2018-09-20T17:06:31Z

@jkotas, any other feedback you would like me to address? I think I've gotten it all at this point.

jkotas · 2018-09-20T17:31:32Z

src/System.Private.CoreLib/shared/System/Number.DiyFp.cs

+ _e = e;
+ }
+
+ public ulong f


Nit: There is no value in these being properties. It just makes the JIT work harder.

I'll track this as part of the fixed buffer cleanup after the build-tools update.

Logged https://github.com/dotnet/coreclr/issues/20077

jkotas · 2018-09-20T17:33:10Z

Do the perf numbers still look good with all the feedback incorporated?

jkotas

LGTM

tannergooding · 2018-09-20T18:07:30Z

> Do the perf numbers still look good with all the feedback incorporated?

Yes. The current run shows the best numbers so far (only a 3.93% regression from the original 61.02s):

Note that there is some noise here: I have also seen it take up to 65.83s, which is a 7.89% regression.
I plan on looking at benchview after this pumps back to CoreFX as well, for another comparison point.

jkotas · 2018-09-20T20:39:38Z

@tannergooding CoreCLR/CoreFX convention is to merge using "Squash and Merge". https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/contributing.md#merging-pull-requests-for-contributors-with-write-access (next time...)

tannergooding · 2018-09-20T20:52:04Z

> CoreCLR/CoreFX convention is to merge using "Squash and Merge". https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/contributing.md#merging-pull-requests-for-contributors-with-write-access (next time...)

Can we set that to be the default then? We have multiple repos across dotnet and many of them do not agree on this behavior. As such, I normally just use whatever merge option is configured as the default (which is currently Rebase and Merge).

jkotas · 2018-09-20T21:00:31Z

Where do you set it as default?

tannergooding · 2018-09-20T21:09:10Z

It should be under Settings; it looks like they've changed it around since the last time I looked.
I believe https://help.github.com/articles/configuring-commit-squashing-for-pull-requests/ has more details

jkotas · 2018-09-20T21:13:35Z

Right, that lets you choose what is allowed, not what should be the default.

tannergooding mentioned this pull request Sep 17, 2018

Updating Buffer.ZeroMemory to call SpanHelpers.ClearWithoutReferences #20014

Merged

Porting bcltype/bignum.cpp to managed code as shared/System/Number.BigInteger.cs

70d24cc

tannergooding added 9 commits September 19, 2018 14:27

Porting the Dragon4 algorithm to managed code.

780c56e

Removing the Dragon4 and DoubleToNumber native implementation.

014f635

Porting bcltype/diyfp.cpp to managed code as shared/System/Number.DiyFp.cs

4b01978

Porting the Grisu3 algorithm to managed code.

8cab64e

Removing the Grisu3 native implementation.

1c3c68b

Making Number.Grisu3.DigitGen slightly more efficient.

b1fada5

Removing bcltype/fp.h from native code.

6fb39fd

Fixing some naming conventions and removing dead code.

24bdfae

Removing BitScanReverse from pal.h

c27ad27

Moving GetExponent/Mantissa and make BigInteger used fixed-sized buffer

b5e3986

jkotas approved these changes Sep 20, 2018

tannergooding merged commit 97e8b44 into dotnet:master Sep 20, 2018

tannergooding mentioned this pull request Sep 20, 2018

Porting NumberToDouble to managed code. #20080

Merged

EgorBo mentioned this pull request Sep 23, 2018

[WIP][corlib] Managed NumberToDouble and DoubleToNumber implementations from CoreCLR mono/mono#10763

Closed

EgorBo mentioned this pull request Oct 3, 2018

cherry-pick Number changes to mono/corefx from coreclr (shared) mono/corefx#146

Closed

tannergooding mentioned this pull request Oct 18, 2019

Address bugs in BigInteger #27280

Merged

tannergooding mentioned this pull request Jan 31, 2020

Cleanup the managed DoubleToNum implementation after build-tools update dotnet/runtime#11130

Closed
