Skip to content

Conversation

@9il
Copy link
Member

@9il 9il commented Dec 10, 2020

No description provided.

@9il 9il changed the title WIP: Ryu Ryu Dec 13, 2020
@9il 9il merged commit ac24185 into master Dec 13, 2020
@9il 9il deleted the ryu branch December 13, 2020 10:59
@cassioneri
Copy link

cassioneri commented Dec 17, 2020

When working on my PR [1], I've also considered a supposedly optimisation which, I believe, is what you mention in [2] and concerns the earlier exit from multipleOfPowerOf5's loop [3].

I agree that earlier exit, when we already know that the function should return, avoids needless calculations. It's also natural to believe this is an optimisation. However, this this is not obvious since it requires a test which translates to a conditional jump. I'm not saying that this is definitely not an optimisation. I'm saying this needs more investigation and evidence to draw this conclusion. My experimentation suggested that, actually, this is a pessimisation. For this reason, I've dropped my attempt.

To be more concrete, please, look at [4] and [5].

[4] compares the original code with my alternative (alternative1) and yours (alternative2). We can see the assembly code for the original implementation contains 1 conditional jump (line 9), mine contains 3 (lines 16, 22 and 24) and yours contains 4 (lines 37, 3945 and 47).

[5] shows a (non scientific) benchmark comparing the three implementations. The results are aligned with the idea that conditional jumps degrade performance.

Obviously, [4] and [5] needs to be taken with a pinch of salt since the code is not written in D, it doesn't work on 128-bits unsigned integers and (likely) was compiled with a different compiler.

Finally, Ulf's original implementation contains "assert(value != 0)" [6]. Hence, I conclude that value is not supposed to be 0 and you could remove "if (value)" from your implementation (you might want to assert it though), keeping the if-branch and removing the else-one. Doing that in [4] shows that a conditional jump disappears. Doing that in [5] brings the performance of your code closer to mine (still both worse than the original). However, if you prefer keeping "if (value)", then I think the else-branch should be "return true;" rather than "return p == 0;" [8]. The reason is that the function is meant to tell whether value is multiple of 5^p and 0 is multiple of 5^p regardless of p, since 0 is multiple of any number.

I hope this helps.

[1] ulfjack/ryu#188
[2] ulfjack/ryu#194
[3] https://github.com/libmir/mir-algorithm/blob/master/source/mir/bignum/internal/ryu/generic_128.d#L192
[4] https://godbolt.org/z/ExvW1z
[5] https://quick-bench.com/q/k5U0tRnnKOcCTpoFk203aKzQMd0
[6] https://github.com/ulfjack/ryu/blob/master/ryu/d2s_intrinsics.h#L194
[7] https://github.com/libmir/mir-algorithm/blob/master/source/mir/bignum/internal/ryu/generic_128.d#L187
[8] https://github.com/libmir/mir-algorithm/blob/master/source/mir/bignum/internal/ryu/generic_128.d#L199

@9il
Copy link
Member Author

9il commented Dec 20, 2020

Thank you so much! Applied at c57fc5c

Obviously, [4] and [5] needs to be taken with a pinch of salt since the code is not written in D, it doesn't work on 128-bits unsigned integers and (likely) was compiled with a different compiler.

LDC uses the LLVM backend. We can expect it generates exactly the same code.
For single and double FP numbers D generic_128 implementation uses 64-bit integers for vp, vm, and vr (but the same 128/256-bit tables as the original 128-bit implementation). All you wrote is absolutely correct for D implementation as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants