Skip to content

Conversation

loganek
Copy link
Contributor

@loganek loganek commented May 21, 2024

This pass wraps malloc/calloc/free/realloc around trace_{func_name} functions so we can keep track on all the allocations. I understand this pass won't work for some of the languages, but opening a PR in case someone finds it useful and worth enough including in wasm-opt.
Also, I don't have a ton of experience writing compiler passes, so any suggestion is very appreciated.

Related issue: #6548

(local.get $1)
)
)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option rather than instrumenting around each function call is to add new wrapper functions. That would be better for code size (existing calls remain the same size, they just point to the wrappers, and the wrappers are just a fixed increase in size). The inliner can inline them if the user then optimizes (and the inliner thinks it is worth it), which could make it just as fast.

But if size/speed are not important here then the current approach may be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally only intend to use it for debugging and not in the production code (at at least so far) so neither speed nor size is critical. I think the pass can be updated once there are different usecases.

@loganek loganek force-pushed the loganek/instrument-allocations branch from b00c3a4 to 007b403 Compare June 4, 2024 22:39
@loganek loganek force-pushed the loganek/instrument-allocations branch from 007b403 to 6666dd6 Compare June 7, 2024 17:23
@loganek loganek force-pushed the loganek/instrument-allocations branch from 6666dd6 to 919df07 Compare June 10, 2024 20:55
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an explanation of the commandline format somewhere? If it's short enough in the --help that would be good, but I think it might not be, so the .cpp file perhaps.

However, a second question: Why does the commandline format have the types in there? It seems like we can detect the types from the function name? But I may have forgotten something here.

@loganek
Copy link
Contributor Author

loganek commented Jun 12, 2024

Is there an explanation of the commandline format somewhere? If it's short enough in the --help that would be good, but I think it might not be, so the .cpp file perhaps.

With the latest update where I removed the need for specifying the types, I think it's simple enough; there was already a help string defined in the TraceCalls.cpp in the getArgument() - it's not printed on --help, but it's displayed when the user tries to incorrectly use the pass. Do you think this should also be displayed when user calls --help? I'm not sure I saw it for other passes (e.g. extract-function-index).

However, a second question: Why does the commandline format have the types in there? It seems like we can detect the types from the function name? But I may have forgotten something here.

That's an excelent point; so initially the tracer was for pre-defined functions (malloc,calloc, realloc, free) but I changed that after the suggestion in the comment #6619 (comment) to make the pass more generic; I also blindly followed the recommendation to include types as parameters but it actually is not needed, and we can infer the type from the function being traced. I updated the PR now, thanks.

@loganek loganek force-pushed the loganek/instrument-allocations branch from c48e9d5 to d79bb54 Compare June 13, 2024 08:59
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better!

@loganek loganek force-pushed the loganek/instrument-allocations branch from d79bb54 to 290409f Compare June 14, 2024 09:38
@loganek loganek force-pushed the loganek/instrument-allocations branch from 290409f to a69c304 Compare June 14, 2024 18:38
@loganek loganek force-pushed the loganek/instrument-allocations branch from a69c304 to 79e6d62 Compare June 18, 2024 06:45
@kripken
Copy link
Member

kripken commented Jun 18, 2024

Looks like there are some CI errors that need to be resolved before landing (edit: i see there was a force-push since, let's see if that fixes things edit edit: still some errors it seems)

@loganek loganek force-pushed the loganek/instrument-allocations branch 4 times, most recently from 13cb175 to 8120a53 Compare June 21, 2024 12:33
@loganek loganek force-pushed the loganek/instrument-allocations branch from 8120a53 to 892ebf9 Compare June 21, 2024 12:44
@loganek
Copy link
Contributor Author

loganek commented Jun 21, 2024

@kripken the build is fixed now. Tests were failing because I iterated through unordered_map to create imports which is non-deterministic; I switched to map which doesn't make significant difference in this specific case, but I'm curious how do you normally deal with this type of nondeterminism when you write LIT tests?

@kripken
Copy link
Member

kripken commented Jun 21, 2024

@loganek Iterating on a map here makes sense, yeah, as the keys are Names. In general we sometimes fix nondeterminism that way, if there is a natural key to use. If there aren't, say if the map is from Expression*, then we need something else (since pointers change in every execution), and then we have to find another way. One mechanism is InsertOrderedSet which uses the insertion order to be deterministic (assuming that order in fact is).

@kripken kripken merged commit a27d952 into WebAssembly:main Jun 21, 2024
@gkdn gkdn mentioned this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants