Soundness fix: respect `read_scalar` errors in `read_from_const_alloc`. #353

eddyb · 2025-07-29T19:52:15Z

(If you've seen me mention that various work in this area, like #348, or even #350 and #341 (which have already landed), is somewhat of a prerequisite, or at least arose during some bug fix: this is that bug fix. I decided it is far too important to block on any other improvements, so this PR contains the most minimal form I can think of. The benefits of the extra work are mostly diagnostics, but correctness comes first.)

The method read_from_const_alloc (which gains an _at suffix in this PR) is responsible for reading the components of a SPIR-V constant of some type ty, at an offset in some constant alloc (i.e. a miri/CTFE memory allocation, so a mix of plain bytes, symbolic pointers, and uninitialized memory).

The first problem was that it took offset by &mut, and relied on mutating it both for auto-advancing (e.g. between the components of a SPIR-V vector/matrix) and for some ad-hoc checks.

So the first commit in this PR refactors it to:

- strongly rely on passing down a read-only offset (structs/arrays adding field/element sub-offsets to it)
- have only 4 groups of cases: primitive (int/float/pointer leaves), structs, array(-like), and unsupported
- return an overall Size for the constant value that was read, alongside that value
  - for sized types ty, this is guaranteed (and checked) to equal the size of ty
    (i.e. cx.lookup_type(ty).sizeof(cx) == Some(read_size))
  - for unsized types ty, this mimics Rust mem::size_of_val
    (if ty is, or ends in [T], this will fit as many T elements as possible in alloc.size(),
    after offset, so it'll almost always be the whole alloc, minus at most a gap smaller than T)
- replace the separate create_const_alloc function (which used to check the final offset w/ an assert_eq!) with an Option-returning try_read_from_const_alloc that checks the read Size against alloc.size()
  - the main reason to check the size is to avoid truncating some &CONST because of pointer casts
    (e.g. if ARRAY[i] is equivalent to *ARRAY.as_ptr().add(i), and .as_ptr() is a &[T; N] -> *T cast,
    you really don't want that to become *(const { &{ARRAY[0]} } as *const T).add(i) and UB for i > 0)
  - the opportunistic read_from_const_alloc in const_bitcast (the main way &CONST gets a type) already
    fits the conditional nature of try_read_from_const_alloc (and other refactors break w/o such a check)
  - the only non-&CONST use of create_const_alloc was for the initializer of statics, and that can always
    unwrap the try_read_from_const_alloc (initializer alloc is always the size of the static's type)
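The shape described in the list above can be sketched with a toy model. This is only an illustration under stated assumptions: `Ty`, `Value`, `size_of`, `read_at`, `read_elems` and `try_read_whole` are hypothetical stand-ins (not the actual Rust-GPU or rustc types), the allocation is a plain initialized byte slice, and integers are little-endian.

```rust
/// Toy type descriptions (hypothetical); sizes and offsets are plain byte counts.
enum Ty {
    U32,
    /// (field offset, field type) pairs, plus the total struct size.
    Struct(Vec<(u64, Ty)>, u64),
    /// Sized array: element type and length.
    Array(Box<Ty>, u64),
    /// Unsized `[T]`.
    Slice(Box<Ty>),
}

#[derive(Debug, PartialEq)]
enum Value {
    U32(u32),
    Composite(Vec<Value>),
}

/// Size of a sized type, or `None` for unsized ones.
fn size_of(ty: &Ty) -> Option<u64> {
    match ty {
        Ty::U32 => Some(4),
        Ty::Struct(_, size) => Some(*size),
        Ty::Array(elem, len) => Some(size_of(elem)? * *len),
        Ty::Slice(_) => None,
    }
}

/// Reads `len` consecutive elements starting at `offset`.
fn read_elems(alloc: &[u8], offset: u64, elem: &Ty, len: u64) -> (Value, u64) {
    let stride = size_of(elem).expect("elements must be sized");
    let vals: Vec<Value> = (0..len)
        .map(|i| read_at(alloc, offset + i * stride, elem).0)
        .collect();
    (Value::Composite(vals), stride * len)
}

/// Reads `ty` at a read-only `offset`, returning the value and the size it spans.
fn read_at(alloc: &[u8], offset: u64, ty: &Ty) -> (Value, u64) {
    let (value, size) = match ty {
        Ty::U32 => {
            let o = offset as usize;
            let word = u32::from_le_bytes([alloc[o], alloc[o + 1], alloc[o + 2], alloc[o + 3]]);
            (Value::U32(word), 4)
        }
        Ty::Struct(fields, size) => {
            // Fields add their own sub-offsets; `offset` itself is never mutated.
            let vals: Vec<Value> = fields
                .iter()
                .map(|(field_offset, field_ty)| read_at(alloc, offset + field_offset, field_ty).0)
                .collect();
            (Value::Composite(vals), *size)
        }
        Ty::Array(elem, len) => read_elems(alloc, offset, elem, *len),
        Ty::Slice(elem) => {
            // Like `mem::size_of_val`: fit as many whole elements as possible.
            let stride = size_of(elem).expect("slice elements must be sized");
            assert!(stride > 0, "zero-sized elements are not modeled here");
            let len = (alloc.len() as u64 - offset) / stride;
            read_elems(alloc, offset, elem, len)
        }
    };
    // For sized types, the size read must equal the size of the type.
    if let Some(expected) = size_of(ty) {
        assert_eq!(size, expected);
    }
    (value, size)
}

/// Succeeds only if the value read covers the entire allocation.
fn try_read_whole(alloc: &[u8], ty: &Ty) -> Option<Value> {
    if size_of(ty).is_some_and(|s| s > alloc.len() as u64) {
        return None;
    }
    let (value, size) = read_at(alloc, 0, ty);
    (size == alloc.len() as u64).then_some(value)
}
```

In this model, reading a 12-byte allocation as a single `u32` yields `None` from `try_read_whole`, which is the truncation case the size check is meant to reject, while reading it as three `u32` elements succeeds.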

Most of that refactor isn't, strictly speaking, necessary right now (other than making the code less fragile/error-prone), but it's a much cleaner solution than all the workarounds I had previously come up with, downstream of the soundness fix (and e.g. #348 + calling const_bitcast from pointercast in more cases).

The big soundness issue, however, was that read_from_const_alloc, for primitive (int/float/pointer) leaves, would call alloc.read_scalar(offset, ...), and treat Err(_) as "undef value at that location".

But a whole undef value is a very specific case, while the returned AllocError can indicate:

- some bytes of the value are uninitialized
  - only if every single byte of the value is uninitialized, can the value be undef
- some bytes of the value are pointer, and either:
  - a non-pointer (int/float) type is being read
  - a pointer type is being read, but the read range only partially covers the pointer
    (i.e. alloc has a pointer that starts just before/after offset, but not at offset exactly)
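The cases above can be made concrete with a small model. This is a sketch under assumptions, not rustc's `Allocation::read_scalar`: `Byte`, `ReadError`, `Scalar`, the 4-byte `PTR_SIZE`, and this `read_scalar` signature are all hypothetical, and each byte of a symbolic pointer is tagged with a pointer id and its index within that pointer.

```rust
/// Assumed pointer width of the toy target, in bytes.
const PTR_SIZE: usize = 4;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Byte {
    Init(u8),
    Uninit,
    /// One byte of a symbolic pointer: (pointer id, index of this byte in the pointer).
    Ptr(u32, u8),
}

#[derive(Debug, PartialEq)]
enum ReadError {
    PartiallyUninit,
    PointerAsInt,
    PartialPointer,
}

#[derive(Debug, PartialEq)]
enum Scalar {
    Int(u64),
    Pointer(u32),
    Undef,
}

/// Reads one scalar from `bytes` (the exact range being read), little-endian.
fn read_scalar(bytes: &[Byte], want_ptr: bool) -> Result<Scalar, ReadError> {
    assert!(!bytes.is_empty() && bytes.len() <= 8);

    // A whole-value undef is only valid if *every* byte is uninitialized.
    if bytes.iter().all(|b| *b == Byte::Uninit) {
        return Ok(Scalar::Undef);
    }
    if bytes.iter().any(|b| *b == Byte::Uninit) {
        return Err(ReadError::PartiallyUninit);
    }

    let first_ptr = bytes.iter().find_map(|b| match b {
        Byte::Ptr(id, _) => Some(*id),
        _ => None,
    });
    if let Some(id) = first_ptr {
        if !want_ptr {
            return Err(ReadError::PointerAsInt);
        }
        // The read must cover exactly one pointer, starting at its first byte.
        let whole = bytes.len() == PTR_SIZE
            && bytes
                .iter()
                .enumerate()
                .all(|(i, b)| *b == Byte::Ptr(id, i as u8));
        return if whole {
            Ok(Scalar::Pointer(id))
        } else {
            Err(ReadError::PartialPointer)
        };
    }

    // Only plain initialized bytes remain.
    let mut value = 0u64;
    for (i, b) in bytes.iter().enumerate() {
        if let Byte::Init(v) = b {
            value |= (*v as u64) << (8 * i);
        }
    }
    Ok(Scalar::Int(value))
}
```

In terms of this model, treating every `Err(_)` as `Scalar::Undef` would collapse all three `ReadError` cases into the one outcome that is only valid for the all-uninitialized case.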

Unsoundness arises from spurious undef (OpUndef in SPIR-V) being misused instead of reporting an error, because it's designed to be ignored by optimizations (or even routine transformations like control-flow structurization), and treated like it can take on any value (i.e. it makes it UB to care about the exact value).

Even worse, Rust-GPU tends to represent constant data as e.g. [u32; N]; if the alloc contains any pointers, reading them as u32 results in Err(AllocError::ReadPointerAsInt(_)), and before this PR those pointers would silently be ignored and turned into uninitialized memory.

So the second commit in this PR actually handles the AllocError, and only uses a plain undef when all bytes are uninitialized, all other cases being errors - with the caveat that doing more work to produce the correct constant may be possible in some cases, but I haven't put too much effort into it.

For now, the one special case is that it does try to turn "whole pointer attempted to be read as a usize" errors into ptr->int const_bitcasts (of the actual pointers) instead. That doesn't do much for debuggability just yet, but future work to improve const_bitcast will help here.
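A minimal sketch of that special case, assuming a toy representation in which the allocation's pointers are stored in a map from start offset to a pointer id (loosely inspired by how a provenance map can be keyed by offset). `PTR_SIZE`, `Const` and `read_int_over_pointers` are hypothetical names, and the 8-byte pointer width is an assumption of the example.

```rust
use std::collections::BTreeMap;

/// Assumed pointer width of the toy target, in bytes.
const PTR_SIZE: u64 = 8;

#[derive(Debug, PartialEq)]
enum Const {
    /// A ptr->int bitcast of the pointer with the given (hypothetical) id.
    PtrToInt { ptr: u32 },
    /// The read touches pointer bytes in a way that is not handled.
    Unsupported,
}

/// Handles an integer read of `size` bytes at `offset`.
/// Returns `None` when no pointer overlaps the range (a plain integer read).
fn read_int_over_pointers(ptrs: &BTreeMap<u64, u32>, offset: u64, size: u64) -> Option<Const> {
    // A pointer starting at `s` covers `s..s + PTR_SIZE`, so it overlaps the
    // read range iff `offset - PTR_SIZE < s < offset + size`.
    let lo = offset.saturating_sub(PTR_SIZE - 1);
    let first = ptrs.range(lo..offset + size).next();
    match first {
        None => None,
        // Whole pointer read as a pointer-sized integer: keep the pointer.
        Some((&start, &ptr)) if start == offset && size == PTR_SIZE => {
            Some(Const::PtrToInt { ptr })
        }
        // Partial overlap or a narrower integer: report it, never guess.
        Some(_) => Some(Const::Unsupported),
    }
}
```

Only the exact "starts at offset, spans the full pointer width" case is converted; every other overlap stays an error in this sketch.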

In theory, OpSpecConstantOp would let us represent e.g. only some bits being OpUndef/some pointer, by mixing constants using bitwise ops (e.g. (undef << 24) | ((ptr as u32) >> 8)), but it's more likely we'll first get more untyped constant data, than ever need this.
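To see where an expression like `(undef << 24) | ((ptr as u32) >> 8)` comes from, here is a worked instance with ordinary integers. It assumes a little-endian layout, a 4-byte pointer at offset 0 followed by one stray byte at offset 4, and a `u32` read at offset 1; the stray byte stands in for the undef part, and both function names are hypothetical.

```rust
/// Little-endian `u32` read at byte offset 1 of `[pointer bytes (4), stray byte (1)]`.
fn read_straddling(ptr_bits: u32, stray: u8) -> u32 {
    let mut mem = [0u8; 5];
    mem[..4].copy_from_slice(&ptr_bits.to_le_bytes());
    mem[4] = stray;
    u32::from_le_bytes([mem[1], mem[2], mem[3], mem[4]])
}

/// The same value, expressed with the bitwise ops from the text.
fn mixed(ptr_bits: u32, stray: u8) -> u32 {
    ((stray as u32) << 24) | (ptr_bits >> 8)
}
```

With `ptr_bits = 0xAABBCCDD` and `stray = 0x11`, both functions give `0x11AABBCC`: the upper three bytes of the pointer shifted down by 8 bits, with the stray byte on top.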


eddyb · 2025-07-29T20:05:24Z

crates/rustc_codegen_spirv/src/codegen_cx/constant.rs

- let value = if offset.bytes() == 0 {
-     base_addr
- } else {
-     self.tcx
-         .dcx()
-         .fatal("Non-zero scalar_to_backend ptr.offset not supported")
-     // let offset = self.constant_bit64(ptr.offset.bytes());
-     // self.gep(base_addr, once(offset))
- };
- if let Primitive::Pointer(_) = layout.primitive() {
-     assert_ty_eq!(self, value.ty, ty);
-     value
- } else {
-     self.tcx
-         .dcx()
-         .fatal("Non-pointer-typed scalar_to_backend Scalar::Ptr not supported");
-     // unsafe { llvm::LLVMConstPtrToInt(llval, llty) }
- }
+ self.const_bitcast(self.const_ptr_byte_offset(base_addr, offset), ty)


This drive-by change is the minimal subset of backporting this (not yet landed) upstream PR:

Move scalar_to_backend to ssa rust-lang/rust#142960

I already have later changes to this code that bring it even closer to that version, but I only did this part for the special-case of "reading a Scalar::Ptr as an integer", because the old assert_ty_eq! would fail (while compiling libcore, IIRC?) even though the whole point is to end in a const_bitcast regardless of what ty is.

LegNeato

More straightforward than the other ones! I agree, a result would be better, but I wouldn't block landing this on it.

LegNeato · 2025-07-30T01:50:43Z

crates/rustc_codegen_spirv/src/codegen_cx/constant.rs

+ /// returning that constant if its size covers the entirety of `alloc`.
+ //
+ // FIXME(eddyb) should this use something like `Result<_, PartialRead>`?
+ pub fn try_read_from_const_alloc(


Yeah, I think this would be better with something like Result<(SpirvValue, Size), ReadConstError>

    enum ReadConstError {
        PartialRead { read: Size, expected: Size },
        Unsupported(&'static str),
        InvalidLayout(String),
        Zombie(String),
    }

and having the other fns return it too, giving something like:

    match read_from_const_alloc_at(...) {
        Ok((val, size)) if size == alloc.size() => Ok(val),
        Ok((_, size)) => Err(ReadConstError::PartialRead {
            read: size,
            expected: alloc.size(),
        }),
        Err(e) => Err(e),
    }

eddyb · 2025-07-30

There are good reasons for the way this works, for the two kinds of "errors":

- individual errors read_from_const_alloc_at turns into "zombies"
  - these are like Rust diagnostics (but deferred in the "zombie" way)
  - eventually all of these will go away anyway (worst case showing up as qptr diagnostics)
  - Result is an anti-pattern for diagnostics, because it stops at the first error
    (which can be fine for leaves, but anything recursive runs into "error buffering" needs)
- None potentially being returned by try_read_from_const_alloc
  - this isn't even an error, maybe I should've named it try_read_whole_const_alloc
  - the size check prevents the opportunistic replacement of &CONST_A w/ &CONST_B
    (if CONST_B is effectively some prefix of CONST_A, i.e. a truncation)
  - if we wanted to e.g. make const_fold_load more flexible, this wouldn't be used
    (instead, read_from_const_alloc_at would be called directly and always succeed)
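The point about Result stopping at the first error can be illustrated with a toy recursive reader. This is not how Rust-GPU's "zombie" mechanism is implemented; `Node`, `read_result` and `read_buffered` are hypothetical names, and a `None` placeholder plus an error list stand in for a deferred diagnostic.

```rust
/// A toy constant tree: leaves either read successfully or fail with a message.
enum Node {
    Leaf(Result<u32, &'static str>),
    Group(Vec<Node>),
}

/// `Result` style: `?` short-circuits, so only the first error ever surfaces.
fn read_result(node: &Node) -> Result<Vec<u32>, &'static str> {
    match node {
        Node::Leaf(Ok(v)) => Ok(vec![*v]),
        Node::Leaf(Err(e)) => Err(*e),
        Node::Group(children) => {
            let mut out = Vec::new();
            for child in children {
                out.extend(read_result(child)?);
            }
            Ok(out)
        }
    }
}

/// Buffered style: each error is recorded, a placeholder takes the value's
/// place, and the recursion keeps going so later errors are reported too.
fn read_buffered(node: &Node, errors: &mut Vec<&'static str>) -> Vec<Option<u32>> {
    match node {
        Node::Leaf(Ok(v)) => vec![Some(*v)],
        Node::Leaf(Err(e)) => {
            errors.push(*e);
            vec![None]
        }
        Node::Group(children) => children
            .iter()
            .flat_map(|child| read_buffered(child, errors))
            .collect(),
    }
}
```

For a tree with two failing leaves, `read_result` reports only the first failure, while `read_buffered` returns a value for every leaf position and collects both messages.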

eddyb added 2 commits July 29, 2025 03:35

Track const sizes in {create,read_from}_const_alloc, instead of mutating offsets.

41d1bcf

Respect read_scalar errors in read_from_const_alloc.

2a2fc59

eddyb requested review from Firestar99, LegNeato and schell as code owners July 29, 2025 19:52

eddyb enabled auto-merge July 29, 2025 19:59

eddyb mentioned this pull request Jul 29, 2025

Make SpirvValue(Kind) representation and operations more orthogonal and robust (lossless). #348

Draft

LegNeato approved these changes Jul 30, 2025

eddyb added this pull request to the merge queue Jul 30, 2025

Merged via the queue into Rust-GPU:main with commit cf59e54 Jul 30, 2025
13 checks passed

eddyb deleted the fix-read_scalar-unsoundness branch July 30, 2025 02:22
