- Notifications
You must be signed in to change notification settings - Fork 5.2k
Keep precise argument sizes between import and morph. #43130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9d5efd5 to c986f65 Compare 0a0669b to 115e44c Compare GenTreePutArgSmall. src/coreclr/src/jit/importer.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we did not see a warning about an assignment inside if.
src/coreclr/src/jit/morph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callSig == nullptr means mostly a helper call, all helper calls that I have checked are working with i4 and i8 types but I expect I will have to add more PUTARG_TYPE there once I run more tests on arm64 OSX and find bugs with missed type information.
| PTAL @CarolEidt @AndyAyersMS @dotnet/jit-contrib |
| I'm puzzled why we need to do things this way -- doesn't the callee sig tell us what size the callee expects? |
We get this information from callee sig, but it is available only during importation and then we lose it. In debug we save it in if we decide to keep it in Release it will increase the size of all large nodes. |
| Seems like we could deal with that easily enough, we have various extensions already hanging off of calls. For instance we could refactor Is there other benefit to having these extra IR nodes beyond being able to handle this ABI quirk? |
These extra nodes are only for the ABI quirk, I do not see benefits to other places from it. I was thinking to postpone casts creations but it was a long shot (if we pass 1 byte as 1 byte we don't need
|
CarolEidt left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments and suggestions
src/coreclr/src/jit/gentree.h Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why you added this here - that is, I can see why we wouldn't expect to encounter these under a COMMA, but I'm not sure why this special assert is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have several similar functions like gtEffectiveVal, gtRetExprVal and gtSkipPutArgType. When we call each of them we often don't expect nodes that can be skipped by two others to be on top.
For example, when we call gtEffectiveVal we don't expect something like GT_RET_EXPR(GT_COMMA) or GT_PUTARG_TYPE(GT_COMMA), similar for gtRetExprVal it should not see GT_PUTARG_TYPE(GT_PUTARG_TYPE).
If we do it means the caller did not check for these nodes and we don't skip what we hoped to skip, it doesn't cause correctness issues in my understanding but introduces asm regressions when we can't parse something like we did before the change.
src/coreclr/src/jit/morph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be !=? Otherwise, the loop below will never execute, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a temporary condition, the idea here is to match the signature with the call arguments as we have them in morph and check that we still have the correct types. However, it is tricky to do because we could have some nodes added to gtCallArgs between importation and call to this function (also this function can be called several times).
So we need to skip all additional nodes that were added and I have problems with identifying them. I am planning to fix it once I pass all the other tests.
5e87bfb to 6b0b884 Compare 0fcee3c to ada8c7f Compare | /azp run runtime-coreclr outerloop |
| Azure Pipelines successfully started running 1 pipeline(s). |
| /azp run runtime-coreclr outerloop |
| Azure Pipelines successfully started running 1 pipeline(s). |
sandreenko left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR was updated and is passing the tests now.
Thanks to @AndyAyersMS for helping me offline with the previous test failures.
The changes include #44790 that I hope to merge first.
This is the last big JIT change for arm64 apple, with it we will have precise arguments sizes from importer to codegen. The next changes will be on VM side for interop stubs.
src/coreclr/src/jit/importer.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main change is that we save PUTARG_TYPE and RET_EXPR in inlCurArgInfo->argNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32 is passed as int16 to Inline1 where it is passed to CallAMethod1, before the last changes the saved node was LCL_VAR int and we added another PUTARG_TYPE on top of an existing one, now it is fixed.
src/coreclr/src/jit/flowgraph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new condition allows us to always the argument as a temp with the correct type, we need it because for a correct direct substitution we need to know if the parent is a call or not but we don't have this information.
We create the node when we see an IL opcode like ldlocal and don't have parent information, later, when the node is used, we can't distinguish it from a regular LCL_VAR.
The change causes a small regression on all platforms:
windows x64 crossgen: 5134 (0.02% of base) windows x64 pmi: 14911 (0.03% of base) windows arm64 crossgen: 5688 (0.01% of base) linux x64 crossgen: 9036 (0.02% of base) , etc. all lower than 0.04%.
What happens there is:
- during importation we create a
LCL_VAR tempForOurArg, save a pointer to it; - in
fgInsertInlineeBlockswe iterate over all such pointers and do in-place replacement if there was only one use of the lclVar, but we don't know the user, so we can't say if it is a call that need aPUTARGor not.
I have not found a simple way to avoid it or compensate for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment in the code here explaining why argHasPutArg disqualifies this arg from direct substitution.
AndyAyersMS left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left you some questions about this change.
src/coreclr/src/jit/flowgraph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this change for correctness? If not perhaps you should PR it separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto for the other two below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not need this for correctness. I am planning to merge them in #44790 (comment), if we decide not to merge it I will need to change this code to use argNode->gtSkipPutArgType()->gtRetExprVal().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just approved #44790.
src/coreclr/src/jit/flowgraph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment in the code here explaining why argHasPutArg disqualifies this arg from direct substitution.
src/coreclr/src/jit/flowgraph.cpp Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See note above; I don't believe you can set these to zero like this and pass jit stress.
5b28654 to 05d3c2e Compare | /azp run runtime-coreclr outerloop |
| Azure Pipelines successfully started running 1 pipeline(s). |
AndyAyersMS left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good to merge once you've updated for #44790 and perhaps added a comment indicating why PUTARG_TYPE can't be directly substituted.
64db1b2 to 133ef0d Compare Create `GT_PUTARG_TYPE` when signature type does not match node type. Check in morph that this information has survived inlining and other phases between.
133ef0d to 8c7ecdb Compare | I have checked that the diffs are still as expected before the merge, PR and jitstress are clear, |
| Thanks @AndyAyersMS and @CarolEidt for your help with this change. |
On arm64 OSX we need to know precise argument sizes. I have added support for that between morph and lowering in #42503, this PR supports in between importation and morph.
When do we need it:
without the change, morph was seeing the incorrect type of the nodes but for other platforms it was not important (because all arguments were passed in TARGET_POINTER_SIZE slots).
I have stopped on the solution that adds a special GT_OPER to keep this information between these phases and a debug code in morph to check that it has survived phases between them (mostly inlining). Thanks @CarolEidt for the idea.
I want to enable this code for all platforms because:
I was also considering other solutions, the main 2:
fgArgInfo;impImplicitIorI4Castto insert cast not only between i4 and i8 but between all sizes, butNo asm diffs (SPMI x64/x86).Update: ci shows related failures during the check phase, I will fix them soon, but I think it should not block review.