lint ImproperCTypes: refactor linting architecture (part 3) #146275
Looked at the final commit here "add recursion limit"; I think the method should be adjusted, but this can land pretty easily.
I don't think "add architecture for layered reasoning in lints" can land yet, there is a lot added here that is dead code and untested for now. Can it be moved to just before a commit that makes use of it?
Edit: actually, no need to move it anywhere specific yet since I'm still going through the commits at #134697 one-by-one. Just dropping it from this PR is sufficient, so the recursion limit can merge without being blocked.
    let _depth_guard = match self.can_enter_type(ty) {
        Ok(guard) => guard,
        Err(ffi_res) => return ffi_res,
    };
    let tcx = self.cx.tcx;

    // Protect against infinite recursion, for example
    // `struct S(*mut S);`.
    // FIXME: A recursion limit is necessary as well, for irregular
    // recursive types.
    if !self.cache.insert(ty) {
        return FfiSafe;
    }
I don't follow the original implementation here. If you have
If you have `extern "C" fn foo(a: NotFfiSafe, b: NotFfiSafe)`, why doesn't `b` get marked FfiSafe, since the cache insert will fail?
This is because the cache is set up per visitor.
One visitor gets set up for each extern static variable, each argument of an extern non-fnptr function, each return type of an extern non-fnptr function, and each extern fnptr in a non-extern context.
Here specifically, `a` and `b` each get their own visitor.
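A minimal sketch of why a per-visitor cache does not hide the second argument. Everything here is a simplified stand-in, not the lint's real code: `Visitor`, `FfiResult`, `check_args_per_visitor`, and `check_args_shared` are hypothetical names, and types are modelled as plain strings.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq)]
enum FfiResult {
    Safe,
    Unsafe,
}

/// Hypothetical stand-in for the lint visitor: the cache is a field of the visitor.
struct Visitor {
    cache: HashSet<&'static str>,
}

impl Visitor {
    fn new() -> Self {
        Visitor { cache: HashSet::new() }
    }

    fn check(&mut self, ty: &'static str) -> FfiResult {
        // A type already seen by *this* visitor is assumed safe, to break cycles.
        if !self.cache.insert(ty) {
            return FfiResult::Safe;
        }
        if ty == "NotFfiSafe" { FfiResult::Unsafe } else { FfiResult::Safe }
    }
}

/// One fresh visitor per argument, as described in the reply above.
fn check_args_per_visitor(args: &[&'static str]) -> Vec<FfiResult> {
    args.iter().map(|&ty| Visitor::new().check(ty)).collect()
}

/// A single shared visitor: this is the behaviour the question worried about.
fn check_args_shared(args: &[&'static str]) -> Vec<FfiResult> {
    let mut visitor = Visitor::new();
    args.iter().map(|&ty| visitor.check(ty)).collect()
}
```

With two `"NotFfiSafe"` arguments, the per-visitor version reports both as unsafe, while the shared version would wrongly report the second one as safe.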
    // borrows the refcell, outside of ImproperCTypesVisitorDepthGuard::drop()
    let mut limiter_guard = self.recursion_limiter.borrow_mut();
    let (ref mut cache, ref mut depth) = *limiter_guard;
    if (!cache.insert(ty)) || *depth >= 1024 {
Why is the cache still necessary if you have the depth?
The cache lets the visitor stop processing regular recursive types (say, some LinkedListNode struct) after a single iteration rather than after 1024.
The depth limit is really needed for irregular recursive types, like in #130310.
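The difference between the two mechanisms can be sketched on a toy model. This is an illustration under assumptions, not the PR's implementation: the `Ty` enum, `field`, `visit`, and `count_visited` are hypothetical, and the cache here is never cleared, unlike a real "types-in-check" set.

```rust
use std::collections::HashSet;

/// Toy type model (hypothetical, for illustration only).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Ty {
    Unit,
    /// Regular recursion: `struct Node(*mut Node);`
    Node,
    /// Irregular recursion: `A<T>` has a field of type `*const A<A<T>>`.
    A(Box<Ty>),
}

impl Ty {
    /// The pointee type of the single field we recurse into, if any.
    fn field(&self) -> Option<Ty> {
        match self {
            Ty::Unit => None,
            Ty::Node => Some(Ty::Node),
            // `A<T>` leads to `A<A<T>>`: a new, distinct type at every step.
            Ty::A(inner) => Some(Ty::A(Box::new(Ty::A(inner.clone())))),
        }
    }
}

/// Returns how many types were actually visited before stopping.
fn visit(ty: &Ty, cache: &mut HashSet<Ty>, depth: usize, limit: usize) -> usize {
    // Same shape as the check in the diff: stop on a cache hit or at the limit.
    if !cache.insert(ty.clone()) || depth >= limit {
        return 0;
    }
    match ty.field() {
        Some(inner) => 1 + visit(&inner, cache, depth + 1, limit),
        None => 1,
    }
}

/// Convenience wrapper that starts from an empty cache at depth 0.
fn count_visited(ty: &Ty, limit: usize) -> usize {
    visit(ty, &mut HashSet::new(), 0, limit)
}
```

In this model `Ty::Node` is stopped by the cache after one visit whatever the limit is, whereas `Ty::A(..)` never hits the cache and is only stopped by the depth limit.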
    struct ImproperCTypesVisitor<'a, 'tcx> {
        cx: &'a LateContext<'tcx>,
        /// To prevent problems with recursive types,
        /// add a types-in-check cache.
        cache: FxHashSet<Ty<'tcx>>,
        /// add a types-in-check cache and a depth counter.
        recursion_limiter: RefCell<(FxHashSet<Ty<'tcx>>, usize)>,
It's probably cleaner to continue passing this around as &mut and avoiding the RefCell if possible, which will make it easier to store more in the visitor in the future if needed. Is it possible to just pass a depth argument to the needed functions? This also avoids the guard complexity, and makes it easy to avoid resetting the depth if you need to construct a new ImproperCTypesVisitor.
Alternatively you can make a type RecursionCount = Rc<()> and store it in the visitor and clone it for each guard, which allows Rc::strong_count(that_rc) to tell you the recursion count. But that's not quite as clean.
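Both suggestions can be sketched outside the compiler. The names below (`Visitor`, `descend`, `descend_with_arg`) are hypothetical; only `RecursionCount = Rc<()>` and `Rc::strong_count` come from the comment above. The sketch assumes the visitor itself holds one handle, so the depth is the strong count minus one.

```rust
use std::rc::Rc;

/// Alias suggested in the review comment.
type RecursionCount = Rc<()>;

/// Hypothetical visitor that stores the counter handle.
struct Visitor {
    recursion: RecursionCount,
}

impl Visitor {
    fn new() -> Self {
        Visitor { recursion: Rc::new(()) }
    }

    /// Number of live guards: strong count minus the visitor's own handle.
    fn depth(&self) -> usize {
        Rc::strong_count(&self.recursion) - 1
    }

    /// Recurses `levels` more times and returns the deepest depth observed.
    fn descend(&self, levels: usize) -> usize {
        // The clone acts as the guard: it is dropped when this call returns.
        let _guard = Rc::clone(&self.recursion);
        let here = self.depth();
        if levels == 0 { here } else { here.max(self.descend(levels - 1)) }
    }
}

/// The simpler alternative: thread the depth through as a plain argument.
fn descend_with_arg(levels: usize, depth: usize) -> usize {
    let here = depth + 1;
    if levels == 0 { here } else { descend_with_arg(levels - 1, here) }
}
```

Both variants report the same depth for the same number of nested calls; the argument-passing version needs no guard type and no shared state, which is the trade-off the comment points at.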
I guess? I thought it would be annoying, performance-wise, to add 8 bytes to each method's stack. which... I end up half-doing anyway. huh.
 Sure, I'll add a "depth" argument.
For context, this suggestion matches how we seem to do this sort of thing in Miri and Clippy.
> I thought it would be annoying, performance-wise, to add 8 bytes to each method's stack
Accounting for this would be an arch-specific extreme micro-optimization, no real need to think about that kind of thing :)
    // borrows the refcell, outside of ImproperCTypesVisitorDepthGuard::drop()
    let mut limiter_guard = self.recursion_limiter.borrow_mut();
    let (ref mut cache, ref mut depth) = *limiter_guard;
    if (!cache.insert(ty)) || *depth >= 1024 {
For the limit we may as well use the global recursion limit here; I think that's available via cx.tcx.recursion_limit(). That should give you a Limit, and you can use value_within_limit to check the depth against it:
rust/compiler/rustc_session/src/session.rs
Lines 65 to 87 in f4b2f68
    /// New-type wrapper around `usize` for representing limits. Ensures that comparisons against
    /// limits are consistent throughout the compiler.
    #[derive(Clone, Copy, Debug, HashStable_Generic)]
    pub struct Limit(pub usize);
    impl Limit {
        /// Create a new limit from a `usize`.
        pub fn new(value: usize) -> Self {
            Limit(value)
        }
        /// Create a new unlimited limit.
        pub fn unlimited() -> Self {
            Limit(usize::MAX)
        }
        /// Check that `value` is within the limit. Ensures that the same comparisons are used
        /// throughout the compiler, as mismatches can cause ICEs, see #72540.
        #[inline]
        pub fn value_within_limit(&self, value: usize) -> bool {
            value <= self.0
        }
    }
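A standalone sketch of how the depth check might look with this type. It redeclares a trimmed copy of `Limit` locally, because the real one lives inside the compiler, and `can_descend` is a hypothetical helper, not an existing function.

```rust
/// Local, trimmed copy of the quoted `Limit` newtype, for illustration only.
#[derive(Clone, Copy, Debug)]
pub struct Limit(pub usize);

impl Limit {
    /// Same comparison as in the quoted snippet: the limit itself is allowed.
    pub fn value_within_limit(&self, value: usize) -> bool {
        value <= self.0
    }
}

/// Hypothetical helper: may the visitor enter a type at this depth?
fn can_descend(limit: Limit, depth: usize) -> bool {
    limit.value_within_limit(depth)
}
```

Note that `value_within_limit` is inclusive, so a depth equal to the limit is still accepted, whereas the hard-coded `*depth >= 1024` check in the diff stops at exactly 1024.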
☔ The latest upstream changes (presumably #146271) made this pull request unmergeable. Please resolve the merge conflicts.
971f0ab to 1cc9c7e
I'm not sure I addressed everything you asked for in the reviews, but I figured I should push what I have so far, in both branches. (I also know there are two commits that should be squashed together in the other PR, but I'll need to resolve a bunch of conflicts in the rest of the commit chain before doing this.)
1cc9c7e to cf704cf
cf704cf to c74c323
This PR changes a file inside
    //@ check-pass

    //! this test checks that irregular recursive types do not cause stack overflow in ImproperCTypes
    //! Issue: https://github.com/rust-lang/rust/issues/94223
    use std::marker::PhantomData;

    #[repr(C)]
    struct A<T> {
        a: *const A<A<T>>, // without a recursion limit, checking this ends up creating checks for
                           // infinitely deep types the likes of `A<A<A<A<A<A<...>>>>>>`
        p: PhantomData<T>,
    }

    extern "C" {
        fn f(a: *const A<()>);
    }

    fn main() {}
It should also raise the lint here right? If so, a global #![deny(improper_ctypes)] with a local #[expect(improper_ctypes)] would be good to add.
Also I think the mustpass- test name prefix is only for tests named after issues (which we're trying to get rid of)
No, there shouldn't be an error here.
When the depth limit is reached, the type visitor assumes it managed to visit everything there was to visit. Therefore, if no error has been found by that point, it must mean that the recursive pile of types is FFI-safe, somehow.
For the test names, this is something I fixed way down the commit chain, but yeah, I can at least give better names for the lints I added myself.
Ah, you're right. I guess maybe it would just be good to deny(improper_ctypes, improper_ctypes_definitions) to verify that.
    //@ check-pass

    #![recursion_limit = "5"]
    #![allow(unused)]
    #![deny(improper_ctypes)]

    #[repr(C)]
    struct F1(*const ());
    #[repr(C)]
    struct F2(*const ());
    #[repr(C)]
    struct F3(*const ());
    #[repr(C)]
    struct F4(*const ());
    #[repr(C)]
    struct F5(*const ());
    #[repr(C)]
    struct F6(*const ());

    #[repr(C)]
    struct B {
        f1: F1,
        f2: F2,
        f3: F3,
        f4: F4,
        f5: F5,
        f6: F6,
    }

    extern "C" fn foo(_: B) {}

    fn main() {}
What exactly is this expected to test? There doesn't seem to be any recursion here.
Maybe this would be good to run twice with different limits:
    //@ revisions: limit5 limit10
    #![deny(improper_ctypes)]
    #![cfg_attr(limit10, recursion_limit = "10")]
    #![cfg_attr(limit5, recursion_limit = "5")]
Then add something that passes with the limit of 10 but needs #[cfg_attr(limit5, expect(improper_ctypes))] to pass with a limit of 5.
not sure why git is saying I wrote this test, it comes from 9050b33.
 let me check what it was about...
ah got it.
 it's linked to pull #130758, so the test is here to check that the next attempt to add a recursion limit wouldn't make the same mistakes as the previous one
Another internal change that shouldn't impact rustc users. The goal is to break apart the gigantic visit_type function into more manageable and easily-editable bits that focus on specific parts of FFI safety.
c74c323 to 2ec7eab
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
Another user-transparent change, unifying outer-type information and the existing VisitorState flags.
Simple change to stop irregular recursive types from causing infinitely-deep recursion in type checking.
2ec7eab to 85b72da
This is the third PR in an effort to split #134697 (refactor plus overhaul of the ImproperCTypes family of lints) into individually-mergeable parts.
Contains:
Fixes: #130310
Superset of: #146271 and its superset #146273