Skip to content

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Mar 24, 2025

#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2025

// CHECK-LABEL: @array_of_tuple_le
#[no_mangle]
pub fn array_of_tuple_le(a: &[(i16, u16); 2], b: &[(i16, u16); 2]) -> bool {
Copy link
Member Author

@scottmcm scottmcm Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nightly, note the setl+setg from the cmp methods:

array_of_tuple_le:  movzwl (%rdi), %eax  movzwl (%rsi), %ecx  cmpw %cx, %ax  setl %cl  setg %al  jne .LBB0_1  movzwl 2(%rdi), %eax  cmpw 2(%rsi), %ax  seta %al  sbbb $0, %al  testb %al, %al  jne .LBB0_7  jmp .LBB0_4 .LBB0_1:  subb %cl, %al  testb %al, %al  jne .LBB0_7 .LBB0_4:  movzwl 4(%rdi), %eax  movzwl 4(%rsi), %ecx  cmpw %cx, %ax  setl %cl  setg %al  jne .LBB0_5  movzwl 6(%rdi), %eax  cmpw 6(%rsi), %ax  seta %al  sbbb $0, %al  jmp .LBB0_7 .LBB0_5:  subb %cl, %al .LBB0_7:  testb %al, %al  setle %al  retq

With this PR:

array_of_tuple_le: movzwl	(%rcx), %eax movzwl	(%rdx), %r8d cmpw	%r8w, %ax jne.LBB1_1 movzwl2(%rcx), %r8d movzwl2(%rdx), %r9d cmpw	%r9w, %r8w jne.LBB1_2 movzwl4(%rcx), %eax movzwl4(%rdx), %r8d cmpw	%r8w, %ax jne.LBB1_1 movzwl6(%rcx), %r8d movzwl6(%rdx), %r9d movb$1, %al cmpw	%r9w, %r8w jne.LBB1_2 retq .LBB1_1: cmpw	%r8w, %ax setle	%al retq .LBB1_2: cmpw	%r9w, %r8w setb	%al retq

That's so nice -- straight line with no jumps taken if they're equal, otherwise jumps to the check it needs (dedup'd by consistent register choices), saves that, and returns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that even with #133984 landed, we still don't optimize away the [su]cmps in nightly today: https://rust.godbolt.org/z/ss5E7q6PT.

Comment on lines +55 to +61
// This is certainly not the obvious way to implement these methods.
// Unfortunately, using anything that looks at the discriminant means that
// LLVM sees a check for `2` (aka `ControlFlow<bool>::Continue(())`) and
// gets very distracted by that, ending up generating extraneous code.
// This should be changed to something simpler once either LLVM is smarter,
// see <https://github.com/llvm/llvm-project/issues/132678>, or we generate
// niche discriminant checks in a way that doesn't trigger it.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there's a backlink: llvm/llvm-project#132678

@scottmcm
Copy link
Member Author

Oh, I'm glad this passes in LLVM18 too.

As the one with context from #138135,
r? @Mark-Simulacrum

@scottmcm scottmcm marked this pull request as ready for review March 24, 2025 23:54
@scottmcm scottmcm removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 25, 2025
@bors
Copy link
Collaborator

bors commented Mar 26, 2025

☔ The latest upstream changes (presumably #138956) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm scottmcm force-pushed the more-chaining-ord branch from 17ab2d3 to 877f8de Compare March 31, 2025 07:21
@rust-log-analyzer

This comment has been minimized.

@scottmcm scottmcm force-pushed the more-chaining-ord branch from 877f8de to cf991d9 Compare March 31, 2025 07:53
@Mark-Simulacrum
Copy link
Member

@bors r+

@bors
Copy link
Collaborator

bors commented Apr 12, 2025

📌 Commit cf991d9 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2025
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 12, 2025
…-Simulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 12, 2025
…enton Rollup of 9 pull requests Successful merges: - rust-lang#137494 (libstd: init(): dup() subsequent /dev/nulls instead of opening them again) - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too) - rust-lang#138972 (std: Fix build for NuttX targets) - rust-lang#139107 (std: make `cmath` functions safe) - rust-lang#139607 (Add regression test for rust-lang#127424) - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled) - rust-lang#139707 (Fix comment in bootstrap) - rust-lang#139708 (Fix name of field in doc comment) - rust-lang#139709 (bootstrap: fix typo in doc string) r? `@ghost` `@rustbot` modify labels: rollup
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 12, 2025
…-Simulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 13, 2025
…enton Rollup of 8 pull requests Successful merges: - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too) - rust-lang#138972 (std: Fix build for NuttX targets) - rust-lang#139107 (std: make `cmath` functions safe) - rust-lang#139607 (Add regression test for rust-lang#127424) - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled) - rust-lang#139707 (Fix comment in bootstrap) - rust-lang#139708 (Fix name of field in doc comment) - rust-lang#139709 (bootstrap: fix typo in doc string) r? `@ghost` `@rustbot` modify labels: rollup
@ChrisDenton
Copy link
Member

Looks like this failed in rollup #139723 (comment)

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 13, 2025
@scottmcm scottmcm force-pushed the more-chaining-ord branch from cf991d9 to 756670f Compare April 13, 2025 05:10
@scottmcm
Copy link
Member Author

scottmcm commented Apr 13, 2025

Doh, I hard-coded i64 in GEP without realizing it's not a 64-bit-only test. Fixed by just accepting i32|i64 there.

@scottmcm
Copy link
Member Author

@bors r=Mark-Simulacrum rollup=iffy (I think I fixed the codegen test, but who knows if there's something else I missed)

@bors
Copy link
Collaborator

bors commented Apr 13, 2025

📌 Commit 756670f has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 13, 2025
@bors
Copy link
Collaborator

bors commented Apr 13, 2025

⌛ Testing commit 756670f with merge 53d4476...

@bors
Copy link
Collaborator

bors commented Apr 13, 2025

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 53d4476 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 13, 2025
@bors bors merged commit 53d4476 into rust-lang:master Apr 13, 2025
7 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 13, 2025
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 65fa0ab (parent) -> 53d4476 (this PR)

Test differences

Show 2 test diffs

Stage 1

  • [codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J1)

Stage 2

  • [codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J0)

Job group index

Job duration changes

  1. dist-x86_64-apple: 7777.8s -> 9709.6s (24.8%)
  2. dist-riscv64-linux: 4855.1s -> 5246.0s (8.1%)
  3. x86_64-rust-for-linux: 2603.4s -> 2789.6s (7.2%)
  4. dist-various-2: 3351.8s -> 3564.8s (6.4%)
  5. dist-x86_64-freebsd: 5100.2s -> 4813.9s (-5.6%)
  6. dist-loongarch64-linux: 6252.1s -> 6552.8s (4.8%)
  7. x86_64-gnu-llvm-19-1: 5196.3s -> 5438.3s (4.7%)
  8. x86_64-apple-2: 6367.8s -> 6619.8s (4.0%)
  9. aarch64-gnu-debug: 4088.9s -> 4226.1s (3.4%)
  10. dist-x86_64-mingw: 7564.4s -> 7809.2s (3.2%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (53d4476): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 2.8%, secondary 8.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.8% [2.8%, 2.8%] 1
Regressions ❌
(secondary)
8.3% [8.3%, 8.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.8% [2.8%, 2.8%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.1%, 0.1%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.2% [-0.5%, -0.1%] 7
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.2% [-0.5%, 0.1%] 9

Bootstrap: 779.489s -> 780.32s (0.11%)
Artifact size: 365.53 MiB -> 365.51 MiB (-0.01%)

@scottmcm scottmcm deleted the more-chaining-ord branch April 13, 2025 17:25
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Apr 19, 2025
…imulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.

7 participants