- Notifications
You must be signed in to change notification settings - Fork 792
[SYCL][Docs] Add sycl_ext_oneapi_inter_process_communication #20018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
[SYCL][Docs] Add sycl_ext_oneapi_inter_process_communication #20018
Conversation
cae8ee3
to 14415fe
Compare 14415fe
to 89f495c
Compare 89f495c
to 5f443ab
Compare 5f443ab
to b7d2052
Compare b7d2052
to 07f4355
Compare 07f4355
to 7364b75
Compare This commit adds a new extension for inter-process communicable SYCL object handles. As part of the initial version of this extension, only inter-process communicable memory is exposed. Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
7364b75
to eec1fe5
Compare @intel/dpcpp-tools-reviewers & @intel/llvm-reviewers-runtime - This should be ready for review! |
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Show resolved Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation LGTM overall
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No flags from tools perspective (just one nit).
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
* An exception with the `errc::feature_not_supported` error code if device | ||
`dev` does not have `aspect::ext_oneapi_ipc_memory`. | ||
* An exception with the `errc::invalid` error code if the handle data | ||
`handle_data` has an unexpected number of bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the number of bytes is correct, but the content is garbage? Can we broaden this exception like this:
- An exception with the
errc::invalid
error code if the handle datahandle_data
does not represent a valid IPC handle for USM memory on this host system.
That would also cover the case when the handle had the wrong number of bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am reluctant to promise an error code if handle data is corrupt. The byte size is something we can check at a SYCL/UR level, but the validity of the data is up to UMF to decide and the API documentation doesn't specify which error is returned if the data is off. Maybe a precondition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should push this requirement down to the UMF and down to Level Zero if necessary. They should be able to define an error code for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open an issue against UMF, but I think it might be tricky. What happens if the data looks correct to UMF, but using it causes UB or invalid access in the implementation? I suppose that's something they can look into though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UMF shouldn't look at the handle content at all. It should just pass it to the underlying driver API, and that API should return an error code if the handle content is invalid. This assumes that each driver is able to diagnose an error in such a case, but that seems quite reasonable to me. For example, cudaIpcOpenMemHandle
is defined to return cudaErrorInvalidResourceHandle
in this case. I don't see an error code listed for Level Zero, but that seems like a bug. They should be able to return an error code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be some wrapper information in the UMF IPC handle, but I don't know the exact details. Maybe @vinser52 has some insight into this?
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
sycl/doc/extensions/experimental/sycl_ext_oneapi_inter_process_communication.asciidoc Outdated Show resolved Hide resolved
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
[_Note:_ The `open` function can be called multiple times on the same handle | ||
within the same process. The number of calls to the `close` function must be | ||
equal to the number of calls to the `open` function to free the memory pointer. | ||
_{endnote}_] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking at the Level Zero documentation, I saw that it says this:
Multiple calls to this function with the same IPC handle will return unique pointers
Therefore, I think we should change this note to say:
[Note: The
open
function can be called multiple times on the same handle
within the same process. Each call toopen
may return a unique pointer value
even for the same handle, therefore each call toopen
must have a matching call
toclose
.
{endnote}]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vinser52 - Does UMF let the calls go through to Level Zero each time, or will repeat calls to open
return the same UMF handle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it matter what UMF currently does? Even if UMF "squashes" the repeated zeMemOpenIpcHandle
calls now, it might not in the future. The proposed wording gives us the freedom to implement it either way.
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
This commit adds a new extension for inter-process communicable SYCL object handles. As part of the initial version of this extension, only inter-process communicable memory is exposed.