Conversation

@will-cromar
Collaborator

Implement a private API for getting TPU memory allocation. Tweak the old API because it's ugly.

Example:

```
>>> import torch_xla
>>> import torch_xla.core.xla_model as xm
>>> torch_xla._XLAC._xla_memory_info(str(xm.xla_device()))
{'bytes_used': 214528, 'bytes_limit': 34088157184}
>>> import jax
>>> jax.devices()[0].memory_stats()
{'num_allocs': 2, 'bytes_in_use': 214528, 'peak_bytes_in_use': 214528, 'largest_alloc_size': 201728, 'bytes_limit': 34088157184, 'bytes_reserved': 0, 'peak_bytes_reserved': 0, 'bytes_reservable_limit': 34088157184, 'largest_free_block_bytes': 34087942656}
```

Creating a nice public API that rounds to a human-readable unit is left as an exercise to the reader.
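For illustration, one possible shape for such a wrapper (a hypothetical sketch, not part of this PR; the `_format_bytes` helper and `memory_info_pretty` name are made up here):

```python
import torch_xla
import torch_xla.core.xla_model as xm

def _format_bytes(n):
    # Walk up binary units until the value drops below 1024.
    for unit in ('B', 'KiB', 'MiB', 'GiB', 'TiB'):
        if n < 1024:
            return f'{n:.2f} {unit}'
        n /= 1024
    return f'{n:.2f} PiB'

def memory_info_pretty(device=None):
    # Hypothetical wrapper over the private pybind added in this PR.
    device = device if device is not None else xm.xla_device()
    info = torch_xla._XLAC._xla_memory_info(str(device))
    return {k: _format_bytes(v) for k, v in info.items()}
```

With the example values above, this would render `bytes_used` as `'209.50 KiB'` and `bytes_limit` as roughly `'31.75 GiB'`.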

@will-cromar added the usability (Bugs/features related to improving the usability of PyTorch/XLA) and tpuci labels on May 20, 2024
@will-cromar will-cromar requested a review from JackCaoG May 20, 2024 23:03
@will-cromar
Collaborator Author

Weirdly, this function is unimplemented for CPU, so I put the test in the TPU CI script instead.
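For reference, a minimal sketch of what such a TPU-only check could look like (the test body is illustrative, not the actual CI script; the field names come from the example output above):

```python
import torch_xla
import torch_xla.core.xla_model as xm

def test_memory_info():
    # Query the private pybind for the current device; on CPU this
    # call is unimplemented, so this test only runs in the TPU CI job.
    info = torch_xla._XLAC._xla_memory_info(str(xm.xla_device()))
    assert 'bytes_used' in info and 'bytes_limit' in info
    assert 0 <= info['bytes_used'] <= info['bytes_limit']
```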

@JackCaoG
Collaborator

xm.get_memory_info also works, right? Since it calls that pybind.

@will-cromar
Collaborator Author

> xm.get_memory_info also works, right? Since it calls that pybind.

Good catch. That didn't show up when I searched for usages in VS Code initially because I had a setting wrong. I'll update that docstring and give the return value a type.
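For illustration, typing the return value could look something like this `TypedDict` sketch (field names mirror the example output above; the actual change lives in the PR diff):

```python
from typing import TypedDict

import torch_xla

class MemoryInfo(TypedDict):
    bytes_used: int
    bytes_limit: int

def get_memory_info(device) -> MemoryInfo:
    """Return memory usage for the given device via the private pybind."""
    return torch_xla._XLAC._xla_memory_info(str(device))
```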

@will-cromar will-cromar merged commit 5409cd5 into master May 21, 2024
qihqi pushed a commit that referenced this pull request May 29, 2024
