[ReadyForReview] Auto3DSeg DataAnalyzer OOM and other minor issue (Project-MONAI#5278)
Fixes Project-MONAI#5277.

### Updated results

In my local test environment, I have the following results:

- After the fix, GPU memory usage changes by less than 5 MB before/after DataAnalyzer runs. Previously, many cached PyTorch tensors and CuPy variables were not released for training and took up to several GB of GPU memory.
- DataAnalyzer can also process larger images now that the leaks are fixed (a 3D image of size 512x512x512 passed on a 12 GB RTX 3080 Ti).

### Description

Auto3DSeg DataAnalyzer occupied a large chunk of memory and was unable to release it during training. The likely reasons are:

- Training is done through a subprocess call, and PyTorch in the subprocess is unable to find the memory pool allocated by the main process.
- GPU memory leakage (DataAnalyzer math operations use torch functions and CuPy).

In addition, the test functions need improvements, and AutoRunner needs to expose the API call to change the device of DataAnalyzer.

### Types of changes

<!--- Put an `x` in all the boxes that apply, and remove the not applicable items -->
- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Integration tests passed locally by running `./runtests.sh -f -u --net --coverage`.
- [ ] Quick tests passed locally by running `./runtests.sh --quick --unittests --disttests`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated, tested `make html` command in the `docs/` folder.

Signed-off-by: Mingxin Zheng <18563433+mingxin-zheng@users.noreply.github.com>
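
### Illustrative sketch

For readers unfamiliar with the caching behavior described above, the following is a minimal sketch of how cached GPU memory held by PyTorch and CuPy can be handed back to the driver before a training subprocess starts. It is not necessarily how this change implements the fix; the helper name `release_gpu_caches` is hypothetical, and both libraries are treated as optional so the snippet also runs on machines without them.

```python
import gc


def release_gpu_caches():
    """Best-effort release of cached GPU memory (hypothetical helper).

    Returns a list naming the caches that were cleared.
    """
    cleared = []

    # Drop unreachable Python objects first; tensors that are still
    # referenced cannot be freed by the allocators below.
    gc.collect()

    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        # Return unused blocks from PyTorch's caching allocator to the driver,
        # so another process can allocate them.
        torch.cuda.empty_cache()
        cleared.append("torch")

    try:
        import cupy
    except ImportError:
        cupy = None
    if cupy is not None:
        try:
            # CuPy keeps its own device and pinned-host memory pools.
            cupy.get_default_memory_pool().free_all_blocks()
            cupy.get_default_pinned_memory_pool().free_all_blocks()
            cleared.append("cupy")
        except Exception:
            # Assumption: without a usable CUDA device there is nothing to free.
            pass

    return cleared


if __name__ == "__main__":
    print("cleared caches:", release_gpu_caches())
```

Calling such a helper after the analysis step and before launching training is one way to make the memory visible to a subprocess, since a child process cannot reuse blocks cached by the parent's allocator.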