Skip to content

Conversation

@hjelmn
Copy link
Member

@hjelmn hjelmn commented Feb 9, 2016

This commit temporarily disables the dynamic add_procs support on
big-endian and 32-bit systems. There is an issue with the sentinel
used to indicate when proc needs to be generated that causes issues on
these systems. This will be removed once the underlying issue has been
resolved.

Fixes #1348

Signed-off-by: Nathan Hjelm hjelmn@lanl.gov

This commit temporarily disables the dynamic add_procs support on big-endian and 32-bit systems. There is an issue with the sentinel used to indicate when proc needs to be generated that causes issues on these systems. This will be removed once the underlying issue has been resolved. Fixes open-mpi#1348 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
@hjelmn
Copy link
Member Author

hjelmn commented Feb 9, 2016

@ggouaillardet, @bosilca At today's call we decided this was the best way forward to not delay 2.0.0. This gives us time to work on the correct long-term solution.

@hjelmn hjelmn added the bug label Feb 9, 2016
@hjelmn hjelmn added this to the v2.0.0 milestone Feb 9, 2016
@hjelmn hjelmn self-assigned this Feb 9, 2016
@ggouaillardet
Copy link
Contributor

@hjelmn this does not disable cutoff, it only disables it by default (e.g. it is possible to re-enable it)
for example, on a 32 bits arch

$ OMPI_MCA_mpi_add_procs_cutoff=10 ~/local/ompi-master-m32/bin/ompi_info --all | grep add_procs_cutoff MCA mpi: parameter "mpi_add_procs_cutoff" (current value: "10", data source: environment, level: 3 user/all, type: unsigned_int) 

and running with
mpirun --mca mpi_add_procs_cutoff 0 ...
will cause a SIGSEGV

here is a patch that makes the value read-only (and cause mpirun to fail if it is forced)

diff --git a/ompi/runtime/ompi_mpi_params.c b/ompi/runtime/ompi_mpi_params.c index c20562c..df55a8c 100644 --- a/ompi/runtime/ompi_mpi_params.c +++ b/ompi/runtime/ompi_mpi_params.c @@ -276,7 +276,13 @@ int ompi_mpi_register_params(void) "Maximum world size for pre-allocating resources for all " "remote processes. Increasing this limit may improve " "communication performance at the cost of memory usage", - MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL, 0, 0, OPAL_INFO_LVL_3, + MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL, 0, +#if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN + MCA_BASE_VAR_FLAG_DEFAULT_ONLY, +#else + 0, +#endif + OPAL_INFO_LVL_3, /* NTH: temporarily disable add_procs on big-endian and 32-bit systems * until we have a better solution for handling proc sentinels. */ #if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN

a softer option is to force cutoff disabled on 32 bits / big endian archs

diff --git a/ompi/runtime/ompi_mpi_params.c b/ompi/runtime/ompi_mpi_params.c index c20562c..7383f6f 100644 --- a/ompi/runtime/ompi_mpi_params.c +++ b/ompi/runtime/ompi_mpi_params.c @@ -68,7 +68,7 @@ bool ompi_mpi_preconnect_mpi = false; /* NTH: temporarily disable add_procs on big-endian and 32-bit systems * until we have a better solution for handling proc sentinels. */ #if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN -#define OMPI_ADD_PROCS_CUTOFF_DEFAULT UINT_MAX +#define OMPI_ADD_PROCS_CUTOFF_DEFAULT UINT32_MAX #else #define OMPI_ADD_PROCS_CUTOFF_DEFAULT 0 #endif @@ -285,6 +285,9 @@ int ompi_mpi_register_params(void) MCA_BASE_VAR_SCOPE_LOCAL, #endif &ompi_add_procs_cutoff); +#if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN + ompi_add_procs_cutoff = UINT32_MAX; +#endif ompi_mpi_dynamics_enabled = true; (void) mca_base_var_register("ompi", "mpi", NULL, "dynamics_enabled",
@hjelmn
Copy link
Member Author

hjelmn commented Feb 10, 2016

Ah, yes. Forgot about the default only setting. Will update my PR.

-Nathan

On Feb 9, 2016, at 5:25 PM, Gilles Gouaillardet notifications@github.com wrote:

@hjelmn this does not disable cutoff, it only disables it by default (e.g. it is possible to re-enable it)
for example, on a 32 bits arch

$ OMPI_MCA_mpi_add_procs_cutoff=10 ~/local/ompi-master-m32/bin/ompi_info --all | grep add_procs_cutoff
MCA mpi: parameter "mpi_add_procs_cutoff" (current value: "10", data source: environment, level: 3 user/all, type: unsigned_int)

and running with
mpirun --mca mpi_add_procs_cutoff 0 ...
will cause a SIGSEGV

here is a patch that makes the value read-only (and cause mpirun to fail if it is forced)

diff --git a/ompi/runtime/ompi_mpi_params.c b/ompi/runtime/ompi_mpi_params.c

index c20562c..df55a8c 100644

--- a/ompi/runtime/ompi_mpi_params.c
+++ b/ompi/runtime/ompi_mpi_params.c
@@ -276,7 +276,13 @@
int ompi_mpi_register_params(void)
"Maximum world size for pre-allocating resources for all "
"remote processes. Increasing this limit may improve "
"communication performance at the cost of memory usage",

  •  MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL, 0, 0, OPAL_INFO_LVL_3, 
  •  MCA_BASE_VAR_TYPE_UNSIGNED_INT, NULL, 0, 

    +#if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN

  •  MCA_BASE_VAR_FLAG_DEFAULT_ONLY, 

    +#else

  •  0, 

    +#endif

  •  OPAL_INFO_LVL_3, 

    /* NTH: temporarily disable add_procs on big-endian and 32-bit systems

    • until we have a better solution for handling proc sentinels. */
      #if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN

a softer option is to force cutoff disabled on 32 bits / big endian archs

diff --git a/ompi/runtime/ompi_mpi_params.c b/ompi/runtime/ompi_mpi_params.c

index c20562c..7383f6f 100644

--- a/ompi/runtime/ompi_mpi_params.c
+++ b/ompi/runtime/ompi_mpi_params.c
@@ -68,7 +68,7 @@
bool ompi_mpi_preconnect_mpi = false;
/* NTH: temporarily disable add_procs on big-endian and 32-bit systems

  • until we have a better solution for handling proc sentinels. */
    #if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN

-#define OMPI_ADD_PROCS_CUTOFF_DEFAULT UINT_MAX
+#define OMPI_ADD_PROCS_CUTOFF_DEFAULT UINT32_MAX

#else
#define OMPI_ADD_PROCS_CUTOFF_DEFAULT 0
#endif

@@ -285,6 +285,9 @@
int ompi_mpi_register_params(void)
MCA_BASE_VAR_SCOPE_LOCAL,
#endif
&ompi_add_procs_cutoff);

+#if SIZEOF_VOID_P == 4 || BYTE_ORDER == BIG_ENDIAN

  • ompi_add_procs_cutoff = UINT32_MAX;
    +#endif

ompi_mpi_dynamics_enabled = true;
(void) mca_base_var_register("ompi", "mpi", NULL, "dynamics_enabled",


Reply to this email directly or view it on GitHub.

@jsquyres
Copy link
Member

@hjelmn @bosilca @ggouaillardet @rhc54 please confirm that this PR should be closed, and the solution from #1345 should be used for v2.0.0 instead (per this comment).

@bosilca
Copy link
Member

bosilca commented Feb 11, 2016

👍

@jsquyres
Copy link
Member

@bosilca Heh -- are you confirming my suggestion (that the solution to #1345 should be used), or that this one should be used?

@hjelmn
Copy link
Member Author

hjelmn commented Feb 11, 2016

I would say this should be closed and @ggouaillardet #1345 PR'd to 2.0.0. It allows us to keep dynamic add_procs on the affected platforms.

@jsquyres
Copy link
Member

Ok, I'll close this one. Can you guys open a PR for v2.0.0?

@jsquyres jsquyres closed this Feb 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment