Skip to content

Conversation

@hppritcha
Copy link
Member

With sessions initialization model (section 11.3 of MPI 4 standard) MPI may be initialized and finalized any number of times.

This patch refactors code that was assuming a one shot init/finalize sequence for initializing Open MPI and its MCA param space

The underlying problem with the replaced code was that when an app calls MPI_Session_finalize and there are no more sessions active, the MCA param space is destroyed. So if one does not build Open MPI to use dynamically load components, and a component is using static variables in a way that assumes the MCA param space is always preserved if a static variable is set to some value, then things break if a subsequent MPI_Session_init is invoked.

Related to #12869

With sessions initialization model (section 11.3 of MPI 4 standard) MPI may be initialized and finalized any number of times. This patch refactors code that was assuming a one shot init/finalize sequence for initializing Open MPI and its MCA param space The underlying problem with the replaced code was that when an app calls MPI_Session_finalize and there are no more sessions active, the MCA param space is destroyed. So if one does not build Open MPI to use dynamically load components, and a component is using static variables in a way that assumes the MCA param space is always preserved if a static variable is set to some value, then things break if a subsequent MPI_Session_init is invoked. Related to open-mpi#12869 Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
@hppritcha hppritcha force-pushed the fix_common_ofi_for_sessions branch from a9a334d to 2f4ba1b Compare October 18, 2024 20:15
@hppritcha hppritcha requested review from a team, cpshereda, edgargabriel and jsquyres and removed request for a team, cpshereda and jsquyres October 18, 2024 20:20

if (0 > accelerator_rank_index) {

param = mca_base_var_find("opal", "opal_common", "ofi", "accelerator_rank");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, there is a case that could be made that this should not be an ofi parameter, but an mca parameter of the accelerator framework (for a later PR)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point. I was just keeping things the way they were. I've no objection to moving it in the end to accel framework.

@hppritcha hppritcha merged commit c393881 into open-mpi:main Oct 23, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants