[ENH] Optimize workflow.run performance #3260
Merged
Summary
A few months back, I submitted pull request #3184 to improve the performance of using `connect` when creating large workflows. Specifically, I had discovered that the use of the `inputs` or `outputs` properties of workflows can create a performance bottleneck if there are many child nodes or nested workflows.

Recently, I noticed that the same bottleneck can cause a delay between calling `workflow.run()` and the start of the actual execution, meaning when nodes and interfaces start to run.

Running cProfile suggests that the delay occurs in `_create_flat_graph`. Note that the profile does not include the full workflow execution, but was cancelled immediately when the first node started to run.

As far as I can tell, before execution starts, nested workflows are merged into one overall workflow using `_create_flat_graph`. To resolve the final connections between nodes in this merged workflow, `_create_flat_graph` calls `_get_parameter_node` for each input from or output to a nested workflow, and then modifies the connection information accordingly.

nipype/nipype/pipeline/engine/workflows.py
Lines 975 to 979 in e9217c2
As a result, for each connection to/from a nested workflow, `_get_parameter_node` constructs the entire `inputs` or `outputs` data structure of the nested workflow, and then uses it to resolve the correct connection information. Just as for #3184, constructing this entire data structure over and over again for each connection can reduce performance.

List of changes proposed in this PR (pull-request)
Instead of generating the full `inputs` or `outputs` data structure, I propose that the `_get_parameter_node` function should traverse the individual workflow graphs until it either finds the target node or establishes that it does not exist.

I have created a quick implementation that leads to a significant speedup. This implementation is a slightly modified copy of the code from #3184.
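
To illustrate the idea only, here is a minimal sketch of resolving a dotted parameter name by walking down nested workflows one level at a time. It is not the code in this PR and does not use nipype: `ToyNode`, `ToyWorkflow`, and `find_parameter_node` are hypothetical stand-ins, and the `workflow.node.field` naming scheme is an assumption made for the example.

```python
class ToyNode:
    """Hypothetical stand-in for a leaf node (not a nipype class)."""

    def __init__(self, name):
        self.name = name


class ToyWorkflow:
    """Hypothetical stand-in for a workflow that may contain nested workflows."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def find_parameter_node(workflow, parameter):
    """Resolve 'sub.subsub.node.field' to its ToyNode, or return None.

    Only the workflows named in the dotted path are visited, so no
    complete inputs/outputs structure is ever built.
    """
    *path, _field = parameter.split(".")
    if not path:
        return None  # no node name given, only a field
    *workflow_names, node_name = path

    current = workflow
    for wf_name in workflow_names:
        # Descend into the nested workflow with the matching name.
        current = next(
            (c for c in current.children
             if isinstance(c, ToyWorkflow) and c.name == wf_name),
            None,
        )
        if current is None:
            return None  # the path does not exist

    # Look for the target node among the direct children of the last workflow.
    return next(
        (c for c in current.children
         if isinstance(c, ToyNode) and c.name == node_name),
        None,
    )


# Usage: one nested workflow inside an outer workflow.
smooth = ToyNode("smooth")
inner = ToyWorkflow("inner", [smooth])
outer = ToyWorkflow("outer", [ToyNode("load"), inner])

found = find_parameter_node(outer, "inner.smooth.fwhm")
missing = find_parameter_node(outer, "inner.absent.fwhm")
```

In this sketch the cost of a lookup depends on the depth of the path and the number of direct children at each level, rather than on the total number of parameters in the nested workflow.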

I hope that this code will be useful for the nipype community.
Acknowledgment