- Notifications
You must be signed in to change notification settings - Fork 5.2k
Remove BMI2 intrinsics from ASCII and UTF-8 transcoding logic #31904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove BMI2 intrinsics from ASCII and UTF-8 transcoding logic #31904
Conversation
| I'll run against both of my AMD boxes (1800X and 3900X) later today and will share back results. |
| Numbers for the Ryzen 9 3900X Baseline, BMI Enabled Baseline, BMI Disabled This PR |
| I'll run 3900X results tonight and post, as that box my personal one at home. |
| The fuzzing run is now in progress. The fuzzer is running over the combination of this PR with #32036 cherry-picked. For reference, the fuzzer target application is https://github.com/GrabYourPitchforks/utf8fuzz. There are two runs going in parallel: one with full intrinsic enabled, and one with SSE4.1+ disabled. |
| Numbers for the Ryzen 9 3900X Baseline, BMI Enabled Baseline, BMI Disabled This PR |
| Also updated the 1800X numbers above to include Baseline, BMI Disabled |
| Removed no-merge label as fuzzing run has been going all weekend with no problem. |
src/libraries/System.Private.CoreLib/src/System/Text/ASCIIUtility.cs Outdated Show resolved Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf8Utility.Helpers.cs Show resolved Hide resolved
tannergooding left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM.
I don't think we need the assert that we are little endian on x86 and the intrinsic code could likely benefit from comments in a few places for abitrary users reading the code.
Remove BMI2 from ASCII and UTF-16 processing hot paths, as not all processors have optimized implementations of pext/pdep
Remove BMI2 from ASCII and UTF-16 processing hot paths, as not all processors have optimized implementations of pext/pdep
Contributes to #2251. For the most part, BMI2 instructions have been replaced with SIMD instructions where possible. Where the replacement is not easily doable the BMI2 instruction was removed entirely. Performance chart follows.
Most of the performance characteristics are approximately the same. There's a small perf gain in the Alice in Wonderland text (11-0.txt) for transcoding each direction. There's a small perf loss in the CJK sample (25249-0.txt).
Marked as no-merge because I haven't finished fuzzing it yet. It'll take a few days for the run to finish.
Note to reviewers: hiding whitespace changes will make this easier to review.