Open
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
I-heavy: Issue: Problems and improvements with respect to binary size of generated code.
Description
I tried this code:

    #[unsafe(no_mangle)]
    pub fn foo(f: fn([[[[i32; 16]; 16]; 16]; 16])) {
        let x = [[[[1i32; 16]; 16]; 16]; 16];
        f(x);
    }

I expected this to compile into a loop of some kind. But when I compiled this with -Copt-level=3, I got a huuuuge completely-unrolled loop:
Part of the generated assembly

    .LCPI0_0:
        .long 1
        .long 1
        .long 1
        .long 1
    foo:
        mov r11, rsp
        sub r11, 262144
    .LBB0_1:
        sub rsp, 4096
        mov qword ptr [rsp], 0
        cmp rsp, r11
        jne .LBB0_1
        sub rsp, 8
        mov rax, rdi
        movaps xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps xmmword ptr [rsp], xmm0
        movaps xmmword ptr [rsp + 16], xmm0
        movaps xmmword ptr [rsp + 32], xmm0
        movaps xmmword ptr [rsp + 48], xmm0
        movaps xmmword ptr [rsp + 64], xmm0
        movaps xmmword ptr [rsp + 80], xmm0
        movaps xmmword ptr [rsp + 96], xmm0
        ; A large amount of assembly code omitted for brevity
        movaps xmmword ptr [rsp + 15936], xmm0
        movaps xmmword ptr [rsp + 15952], xmm0
        movaps xmmword ptr [rsp + 15984], xmm0
        movaps xmmword ptr [rsp + 16000], xmm0
        ; A large amount of assembly code omitted for brevity
        movaps xmmword ptr [rsp + 262048], xmm0
        movaps xmmword ptr [rsp + 262064], xmm0
        movaps xmmword ptr [rsp + 262080], xmm0
        movaps xmmword ptr [rsp + 262096], xmm0
        movaps xmmword ptr [rsp + 262112], xmm0
        movaps xmmword ptr [rsp + 262128], xmm0
        movaps xmmword ptr [rsp + 15968], xmm0
        mov rdi, rsp
        call rax
        add rsp, 262152
        ret

I also tried compiling the following code locally with cargo build --release (since godbolt has a compilation time limit), and I got a 153 MB rlib.

    #[unsafe(no_mangle)]
    pub fn foo(f: fn([[[[[[[[[[i32; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16])) {
        let x = [[[[[[[[[[1i32; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16]; 16];
        f(x);
    }

Meta
rustc version on godbolt:

    rustc 1.88.0 (6b00bc388 2025-06-23)
    binary: rustc
    commit-hash: 6b00bc3880198600130e1cf62b8f8a93494488cc
    commit-date: 2025-06-23
    host: x86_64-unknown-linux-gnu
    release: 1.88.0
    LLVM version: 20.1.5
    Internal compiler ID: r1880

rustc version locally:

    rustc 1.88.0 (6b00bc388 2025-06-23)
    binary: rustc
    commit-hash: 6b00bc3880198600130e1cf62b8f8a93494488cc
    commit-date: 2025-06-23
    host: aarch64-apple-darwin
    release: 1.88.0
    LLVM version: 20.1.5

@rustbot labels +I-heavy
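
For anyone cross-checking the numbers in the assembly listing, here is a small sketch of the arithmetic behind them. It is not part of the original report: the names `Arr4`, `XMM_BYTES`, `arr4_bytes`, `unrolled_store_count` and `nested_bytes` are made up for illustration, and it assumes each `movaps xmmword ptr [...]` store writes 16 bytes.

```rust
use std::mem::size_of;

/// The array type from the first example in this issue.
type Arr4 = [[[[i32; 16]; 16]; 16]; 16];

/// Assumed width in bytes of one `movaps xmmword ptr [...]` store.
const XMM_BYTES: usize = 16;

/// Total bytes occupied by the 4-level array.
fn arr4_bytes() -> usize {
    size_of::<Arr4>()
}

/// Number of 16-byte stores needed to fill the array without any loop.
fn unrolled_store_count() -> usize {
    arr4_bytes() / XMM_BYTES
}

/// Bytes for `levels` nested `[_; 16]` layers around an `i32`.
/// Computed in u64 so it does not depend on the target's pointer width.
fn nested_bytes(levels: u32) -> u64 {
    16u64.pow(levels) * size_of::<i32>() as u64
}

fn main() {
    println!("4-level array size: {} bytes", arr4_bytes());
    println!("16-byte stores if fully unrolled: {}", unrolled_store_count());
    println!("offset of the highest store: {}", arr4_bytes() - XMM_BYTES);
    println!("10-level array size: {} bytes", nested_bytes(10));
}
```

The 4-level size lines up with `sub r11, 262144` in the listing, and the highest store offset lines up with `[rsp + 262128]`. The 10-level figure is only the size of the array value itself; it says nothing about what code the compiler actually emitted for the second example.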