

mentOS31

What
When a non-blocking or persistent request fails, MPI_Wait() aborts with SIGSEGV because .req_mpi_object.comm is NULL.
MPI_Wait() calls ompi_errhandler_request_invoke(), which accesses .req_mpi_object.comm->error_handler and triggers the SIGSEGV.

How
Initialize .req_mpi_object.comm in COLL_UCC_GET_REQ() and COLL_UCC_GET_REQ_PERSISTENT() in coll_ucc_common.h.
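
For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the idea. It does not reproduce the Open MPI or UCC sources: every `sketch_*` name and type below is a hypothetical stand-in, and the NULL guard exists only so the sketch can report the bad state instead of crashing the way the real dereference did.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a communicator that carries an error handler. */
typedef struct sketch_comm {
    int (*error_handler)(int error_code);
} sketch_comm_t;

/* Hypothetical stand-in for a request; `comm` plays the role of
 * .req_mpi_object.comm in the description above. */
typedef struct sketch_request {
    sketch_comm_t *comm;
    int status_error;
} sketch_request_t;

/* Illustrative error handler that just returns the error code. */
static int sketch_return_error(int error_code)
{
    return error_code;
}

/* Mimics the error path taken when a request has failed. The code path
 * described above dereferences comm->error_handler without a check; this
 * sketch returns -1 instead, so the uninitialized state is observable. */
static int sketch_invoke_errhandler(const sketch_request_t *req)
{
    if (NULL == req->comm) {
        return -1; /* this is where the reported SIGSEGV would occur */
    }
    return req->comm->error_handler(req->status_error);
}

/* Sketch of the fix: attach the communicator at the moment the request
 * is obtained, so the error path always has a valid handler to call. */
static void sketch_get_req(sketch_request_t *req, sketch_comm_t *comm)
{
    assert(comm != NULL);
    req->comm = comm;
    req->status_error = 0;
}
```

A request that never went through `sketch_get_req()` reaches the error path with `comm == NULL`. Once the communicator is attached at acquisition time, the same path dispatches to the handler normally.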

devreal previously approved these changes Oct 15, 2025
bosilca previously approved these changes Oct 15, 2025
@bosilca
Member

bosilca commented Oct 15, 2025

This requirement (i.e., req->req_mpi_object being properly set) is well documented in DEVEL.FT-REQUIREMENTS.md.

@mentOS31 mentOS31 dismissed stale reviews from bosilca and devreal via 109ab06 October 16, 2025 05:02
@bosilca
Member

bosilca commented Oct 16, 2025

Please squash.

Signed-off-by: Kento Hasegawa <hasegawa.kento@fujitsu.com>
@mentOS31
Author

Thank you for your comments and reviews.
I have squashed the commits.

@bosilca bosilca merged commit ad363d3 into open-mpi:main Oct 17, 2025
16 checks passed
@bosilca
Member

bosilca commented Oct 17, 2025

Would it be possible to create the corresponding PR for 5.x?

@mentOS31
Author

Yes, we will create the corresponding PR for the 5.x version.

