@kdjdbbfk kdjdbbfk commented Sep 6, 2024

This commit improves MD5 performance on ARM64 by optimizing the ROUND3 macro in the md5block_arm64.s assembly file.

  • Refactored the ROUND3 macro to improve the computation order, introducing a new ROUND3FIRST macro to handle the initial calculation more efficiently.
  • Simplified the XOR operations in the ROUND3 macro to remove redundant instructions and improve instruction-level parallelism on ARM64.

Performance was measured on an ARM64 Linux machine using Go's benchmark tooling. Each benchmark was run 10 times on a single CPU core, and the baseline and new results were compared for statistical significance. The following results were observed:
goos: linux
goarch: arm64
pkg: md5
cpu: HUAWEI,Kunpeng 920

                    │ baseline.txt │               new.txt               │
                    │    sec/op    │    sec/op     vs base               │
Hash8Bytes             163.3n ± 0%    162.8n ± 0%  -0.34% (p=0.000 n=10)
Hash64                 280.1n ± 2%    279.8n ± 0%  -0.09% (p=0.001 n=10)
Hash128                398.4n ± 0%    397.6n ± 0%  -0.21% (p=0.017 n=10)
Hash256                634.6n ± 1%    633.3n ± 0%  -0.21% (p=0.000 n=10)
Hash512                1.106µ ± 0%    1.105µ ± 0%  -0.09% (p=0.000 n=10)
Hash1K                 2.053µ ± 0%    2.052µ ± 0%  -0.05% (p=0.001 n=10)
Hash8K                 15.27µ ± 1%    15.27µ ± 0%  -0.04% (p=0.000 n=10)
Hash1M                 1.942m ± 0%    1.936m ± 0%  -0.31% (p=0.002 n=10)
Hash8M                 15.61m ± 0%    15.62m ± 0%       ~ (p=1.000 n=10)
Hash8BytesUnaligned    162.6n ± 0%    162.6n ± 0%       ~ (p=0.555 n=10)
Hash1KUnaligned        2.068µ ± 0%    2.066µ ± 0%  -0.10% (p=0.000 n=10)
Hash8KUnaligned        15.36µ ± 0%    15.36µ ± 0%       ~ (p=0.168 n=10)
geomean                4.465µ         4.460µ       -0.12%

                    │ baseline.txt │               new.txt               │
                    │     B/s      │     B/s       vs base               │
Hash8Bytes            46.72Mi ± 0%   46.88Mi ± 0%  +0.36% (p=0.000 n=10)
Hash64                217.9Mi ± 2%   218.1Mi ± 0%  +0.09% (p=0.000 n=10)
Hash128               306.4Mi ± 0%   307.0Mi ± 0%  +0.23% (p=0.017 n=10)
Hash256               384.7Mi ± 1%   385.5Mi ± 0%  +0.21% (p=0.000 n=10)
Hash512               441.6Mi ± 0%   441.9Mi ± 0%  +0.07% (p=0.000 n=10)
Hash1K                475.6Mi ± 0%   475.8Mi ± 0%  +0.05% (p=0.000 n=10)
Hash8K                511.5Mi ± 1%   511.7Mi ± 0%  +0.04% (p=0.000 n=10)
Hash1M                515.0Mi ± 0%   516.6Mi ± 0%  +0.32% (p=0.001 n=10)
Hash8M                512.3Mi ± 0%   512.3Mi ± 0%       ~ (p=1.000 n=10)
Hash8BytesUnaligned   46.94Mi ± 0%   46.93Mi ± 0%       ~ (p=0.754 n=10)
Hash1KUnaligned       472.2Mi ± 0%   472.7Mi ± 0%  +0.11% (p=0.000 n=10)
Hash8KUnaligned       508.7Mi ± 0%   508.7Mi ± 0%       ~ (p=0.158 n=10)
geomean               291.9Mi        292.3Mi       +0.12%

When testing with a large file (about 3 GB), the runtime dropped from 8.65 seconds to 7.39 seconds, a reduction of roughly 15% in execution time. This suggests a larger performance gain when handling bigger inputs.

Overall, these optimizations provide modest improvements for small input sizes and more noticeable performance benefits when processing larger files, especially in memory-intensive workloads like file hashing.

@gopherbot
Contributor

This PR (HEAD: 67f8686) has been imported to Gerrit for code review.

Please visit Gerrit at https://go-review.googlesource.com/c/go/+/611299.

Important tips:

  • Don't comment on this PR. All discussion takes place in Gerrit.
  • You need a Gmail or other Google account to log in to Gerrit.
  • To change your code in response to feedback:
    • Push a new commit to the branch used by your GitHub PR.
    • A new "patch set" will then appear in Gerrit.
    • Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
    • Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
    • Multiple commits in the PR will be squashed by GerritBot.
  • The title and description of the GitHub PR are used to construct the final commit message.
    • Edit these as needed via the GitHub web interface (not via Gerrit or git).
    • You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
  • See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.
@gopherbot

Message from Gopher Robot:

Patch Set 1:

(1 comment)


Please don’t reply on this GitHub thread. Visit golang.org/cl/611299.
After addressing review feedback, remember to publish your drafts!

@gopherbot

This PR (HEAD: 85ec85f) has been imported to Gerrit for code review.

@kdjdbbfk kdjdbbfk changed the title crypto/md5: Improve ARM64 MD5 performance by optimizing ROUND3 function crypto/md5: improve ARM64 MD5 performance by optimizing ROUND3 function Sep 6, 2024
@gopherbot

Message from 赵静玉:

Patch Set 1:

(1 comment)


@gopherbot

This PR (HEAD: 3149567) has been imported to Gerrit for code review.

@gopherbot

Message from qiu laidongfeng2:

Patch Set 3: Commit-Queue+1


@gopherbot

Message from Go LUCI:

Patch Set 3:

Dry run: CV is trying the patch.

Bot data: {"action":"start","triggered_at":"2024-11-01T13:17:37Z","revision":"1f01c61828f4e615469e12f05e9228a2fdf18049"}


@gopherbot

Message from qiu laidongfeng2:

Patch Set 3: -Commit-Queue


@gopherbot

Message from Go LUCI:

Patch Set 3:

This CL has failed the run. Reason:

Tryjob golang/try/x_tools-gotip-linux-amd64 has failed with summary (view all results):


Build or test failure, click here for results.

To reproduce, try gomote repro 8732467483322381745.


@gopherbot

Message from Go LUCI:

Patch Set 3: LUCI-TryBot-Result-1


@gopherbot

Message from Dmitri Shuralyov:

Patch Set 3:

(1 comment)


@gopherbot

Message from Meng Zhuo:

Patch Set 3: Code-Review+1

(2 comments)


@gopherbot

Message from 赵静玉:

Patch Set 3:

(1 comment)


@gopherbot

Message from 赵静玉:

Patch Set 3:

(1 comment)


@gopherbot

Message from Meng Zhuo:

Patch Set 3:

(1 comment)


@gopherbot

This PR (HEAD: ca90f5a) has been imported to Gerrit for code review.

@gopherbot

Message from Gopher Robot:

Patch Set 4:

Congratulations on opening your first change. Thank you for your contribution!

Next steps:
A maintainer will review your change and provide feedback. See
https://go.dev/doc/contribute#review for more info and tips to get your
patch through code review.

Most changes in the Go project go through a few rounds of revision. This can be
surprising to people new to the project. The careful, iterative review process
is our way of helping mentor contributors and ensuring that their contributions
have a lasting impact.

During May-July and Nov-Jan the Go project is in a code freeze, during which
little code gets reviewed or merged. If a reviewer responds with a comment like
R=go1.11 or adds a tag like "wait-release", it means that this CL will be
reviewed as part of the next development cycle. See https://go.dev/s/release
for more details.



@gopherbot

This PR (HEAD: 0cf4001) has been imported to Gerrit for code review.

@gopherbot

Message from 赵静玉:

Patch Set 5:

(1 comment)


@gopherbot

Message from 赵静玉:

Patch Set 6: Code-Review+1


@gopherbot

Message from 赵静玉:

Patch Set 6:

(1 comment)


@gopherbot

Message from Meng Zhuo:

Patch Set 6: Code-Review+1 Commit-Queue+1


@gopherbot

Message from Go LUCI:

Patch Set 6:

Dry run: CV is trying the patch.

Bot data: {"action":"start","triggered_at":"2024-11-07T09:21:10Z","revision":"f4e646ac50a24c1637962bc188511e08c2d57b0a"}


@gopherbot

Message from Meng Zhuo:

Patch Set 6: -Commit-Queue


@gopherbot

Message from Go LUCI:

Patch Set 6:

This CL has passed the run


@gopherbot

Message from Go LUCI:

Patch Set 6: LUCI-TryBot-Result+1


@gopherbot

Message from Meng Zhuo:

Patch Set 6: -Code-Review


@gopherbot

Message from Russ Cox:

Patch Set 6:

(2 comments)


@gopherbot

Message from Russ Cox:

Patch Set 6: Hold+1

(1 comment)

