- Notifications
You must be signed in to change notification settings - Fork 1.9k
C++: Improve alias analysis for indirections #1736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| This produces the expected results on our tests, but I still need to test it (including performance) on real snapshots. |
| This is going to be very hard to review. Is there any way it can be split up into multiple commits that are reasonable in size and make sense on their own? |
| @jbj As long as the individual commits don't have to actually work in isolation, I can definitely split it up.
I'll have it split in a couple hours. |
Make `bitsToBytesAndBits` omit the leftover bits if zero.
Adds `InlineExpectationsTest.qll`, which makes it easy to create a QL test where the expectations are provided as comments in the source code being tested.
a710609 to 352e750 Compare | OK, I've curated all of the changes into separate commits. Many of those commits are pretty trivial. I've saved |
This uses the autoformatter from internal PR 34050. Some files that will change in github#1736 have been spared. ./build -j4 target/jars/qlformat find ql/cpp/ql -name "*.ql" -print0 | xargs -0 target/jars/qlformat --input find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll') buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
Some files that will change in github#1736 have been spared. ./build -j4 target/jars/qlformat find ql/cpp/ql -name "*.ql" -print0 | xargs -0 target/jars/qlformat --input find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll') buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
| /** | ||
| * A memory allocation that can be tracked by the AliasedSSA alias analysis. | ||
| * For now, we track all variables accessed within the function, including both local variables | ||
| * and global variables. In the future, we will track indirect parameters as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about modeled allocators or default new?
| private Overlap getVariableMemoryLocationOverlap(VariableMemoryLocation def, VariableMemoryLocation use) { | ||
| overlappingIRVariableMemoryLocations(def, use) and | ||
| result = Interval::getOverlap(def.getStartBitOffset(), def.getEndBitOffset(), use.getStartBitOffset(), use.getEndBitOffset()) | ||
| private Overlap getIndirectMemoryLocationOverlap(IndirectMemoryLocation def, IndirectMemoryLocation use) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was written carefully to avoid a quadratic blowup on https://lgtm.com/projects/g/ALaDyn/piccante/latest/files/src/sobol.cpp - can you make sure it's still performant?
| */ | ||
| private string getBaseAddressString(ValueNumber baseAddress) { | ||
| // If it's the value of a parameter, just use the parameter name. | ||
| result = baseAddress.getAnInstruction().(InitializeParameterInstruction).getParameter().toString() or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also special-case VariableAddressInstruction here?
| I recently asked @dbartol to remind me what this PR does.
|
This PR improves our alias analysis and SSA construction for access to memory via indirection, including fields of the indirection. Before, we could get precise flow through fields as long as we knew exactly which variable and which field was being accessed. Now, we can often tell that two accesses touch the same memory location, even if we don't know exactly which memory location that is.
Alias Analysis Changes
AliasAnalysis.qllis now parameterized base on aConfiguration, which defines a type namedAllocation.Allocationrepresents a contiguous region of memory that is disjoint from all other instances ofAllocation. Two memory accesses that belong to different allocations can never overlap. When building unaliased SSA,Allocationrepresents an automatic (stack) variable, since that is all we care to model. When building aliased SSA,Allocationrepresents aValueNumberfor aVariableAddressInstruction, but in the future will also includeInitializeParameterInstructions and possibly calls tomalloc-like functions.The entry points into
AliasAnalysis.qllare now:allocationEscapes(Configuration::Allocation allocation): Holds if the specified allocation's address escapes.addressOperandBaseAndConstantOffset(AddressOperand addrOperand, Instruction base, int bitOffset): Holds if the specifiedAddressOperandpoints to an address equal tobaseplus a constant offsetbitOffset. Two accesses with the same base address overlap if and only if their offsets overlap.getAddressOperandAllocation(AddressOperand addrOperand): Gets the allocation into whichaddrOperandpoints, if known.AliasedSSA.qllhas been changed to track the allocation and base address for every indirect memory access. The overlap relationship now uses the following rules to determine overlap:*pand*palways exactly overlap, andp->xnever overlapsp->y, even if we don't know exactly whatppoints to.SimpleSSA.qllhas been changed to track everything in terms ofAllocations, but since anAllocationinSimpleSSAis just anIRAutomaticVariable, nothing really changed here.Fixed handling of
PointerOffsetInstructionin alias analysis to only propagate the pointer operand.Test Changes
Added the first SSA sanity check, for operands with more than one associated
MemoryLocation, and tests that run this check on our existing IR test sources.Added a simple test for SSA for indirect parameter accesses.
Moved the
points_to.qltest to its own directory, and made it use a newInlineExpectationsTestclass. This class lets you define the results of your test in terms of (tag, value) pairs, each associated with a location. The base class looks for matching comments of the form$tag=valueon the appropriate line, and reports an output row for each missing or unexpected result. The ".expected" file should always be an empty file. This allows us to modify the test sources, including inserting lines, without getting a bunch of diffs just due to line numbers changing.Miscellaneous
Made
duplicateOperandhave the output of a problem query, including a link to the containing function.Made
IRVariablenon-abstract, so that it is now possible to extendIRVariableto restrict rather than extend.Fixed a bug where
PrintSSAwould print some properties on the wrong instruction.Refactored the integer constant code a bit. Bit offsets are now printed in the format
bytes:bits, where:bitsis omitted ifbitsis zero. Thus, 21 bits is printed as2:5, while 24 bits is printed as just3.