Skip to content

Conversation

@hvitved
Copy link
Contributor

@hvitved hvitved commented Oct 6, 2022

As discussed with @asgerf offline, we have decided to model flow through initialize constructors in the following way:

  • When we see a call to C.new(x1, ..., xn), pretend that it is actually a call to C#initialize(x1, ..., xn), but only if C does not define it's own self.new method (in which case we dispatch to that instead).
  • Introduce a new return kind, NewReturnKind, and
    • let all calls to new accept only NewReturnKind (and not NormalReturnKind),
    • let user-defined self.new methods have return kind NewReturnKind instead of NormalReturnKind, and
    • let initialize methods have NewReturnKind for all post-update accesses to self (and still have NormalReturnKind for normal returns, in case someone were to invoke initialize directly).

This means that if we have

class C def initialize(x) @field = x end end C.new(y)

then y will flow into the parameter x of initalize, and from there to (post-update) self [field]. And since (post-update) self has NewReturnKind, we will get flow back out to C.new(y) [field].

@github-actions github-actions bot added the Ruby label Oct 6, 2022
@hvitved hvitved force-pushed the ruby/initialize branch 2 times, most recently from 72c1ebf to f25495e Compare October 7, 2022 08:42
Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We might want to add a case for Class.allocate() too. The allocate method creates a new object, but does not call initialize on it.

@hvitved hvitved force-pushed the ruby/initialize branch 4 times, most recently from 722f0fb to 373ed9f Compare October 10, 2022 12:11
@hvitved hvitved marked this pull request as ready for review December 14, 2022 18:39
@hvitved hvitved requested a review from a team as a code owner December 14, 2022 18:39
asgerf
asgerf previously approved these changes Dec 15, 2022
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some inline suggestions, which are probably best left for future PRs.

Looking at the evaluation results, we lose some call edges and a few sinks, but on the whole we gain much more than we lose. So I'd say it's still better to merge now than to leave the PR hanging.

pragma[nomagic]
private predicate hasUserDefinedSelf(Module m) {
// cannot use `lookupSingletonMethod` due to negative recursion
singletonMethodOnModule(_, "new", m.getSuperClass*()) // not `getAnAncestor` because singleton methods cannot be included
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming this predicate to hasUserDefinedNew.

Could it be implemented in terms of isUserDefinedNew defined earlier in this file?


Also, we seem to lose some call edges due to C.new not being seen as an instance of C when C has an override of self.new, but where the implementation of self.new always returns self.allocate.

I think a reasonably simple fix would be to redefine this predicate to look for overriden new that doesn't obviously return self.allocate:

Suggested change
singletonMethodOnModule(_, "new", m.getSuperClass*()) // not `getAnAncestor` because singleton methods cannot be included
exists(DataFlow::MethodNode method |
// not `getAnAncestor` because singleton methods cannot be included
singletonMethodOnModule(method.asCallableAstNode(), "new", m.getSuperClass*()) and
not method.getSelfParameter().getAMethodCall("allocate").flowsTo(method.getAReturningNode())
)

But I'm OK with postponing this fix so as not to delay the PR any further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot; thanks for checking. Let's fix follow-up.

exists(RelevantCall call0, Callable res |
call0 = call.asCall() and
res = result.asCallable() and
result = viableSourceCallable(call) and // make sure to not include e.g. private methods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This result = viableSourceCallable(call) clause seems to have been lost in this change - is that deliberate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is replaced by the res = getTargetInstance(call0, name) calls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants