Conversation

@will-cromar
Collaborator

Implement a private API for getting TPU memory allocation. Tweak the old API because it's ugly.

Example:

```
>>> import torch_xla
>>> import torch_xla.core.xla_model as xm
>>> torch_xla._XLAC._xla_memory_info(str(xm.xla_device()))
{'bytes_used': 214528, 'bytes_limit': 34088157184}
>>> import jax
>>> jax.devices()[0].memory_stats()
{'num_allocs': 2, 'bytes_in_use': 214528, 'peak_bytes_in_use': 214528, 'largest_alloc_size': 201728, 'bytes_limit': 34088157184, 'bytes_reserved': 0, 'peak_bytes_reserved': 0, 'bytes_reservable_limit': 34088157184, 'largest_free_block_bytes': 34087942656}
```

Creating a nice public API that rounds to a human-readable unit is left as an exercise to the reader.
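For illustration, one possible shape for such a wrapper (a hypothetical sketch, not part of this PR; the `_format_bytes` helper and `memory_info_pretty` name are made up here):

```python
import torch_xla
import torch_xla.core.xla_model as xm

def _format_bytes(n):
    # Walk up binary units until the value drops below 1024.
    for unit in ('B', 'KiB', 'MiB', 'GiB', 'TiB'):
        if n < 1024:
            return f'{n:.2f} {unit}'
        n /= 1024
    return f'{n:.2f} PiB'

def memory_info_pretty(device=None):
    # Hypothetical wrapper over the private pybind added in this PR.
    device = device if device is not None else xm.xla_device()
    info = torch_xla._XLAC._xla_memory_info(str(device))
    return {k: _format_bytes(v) for k, v in info.items()}
```

With the example values above, this would render `bytes_used` as `'209.50 KiB'` and `bytes_limit` as roughly `'31.75 GiB'`.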

@will-cromar added the usability (Bugs/features related to improving the usability of PyTorch/XLA) and tpuci labels on May 20, 2024
@will-cromar will-cromar requested a review from JackCaoG May 20, 2024 23:03
@will-cromar
Collaborator Author

Weirdly, this function is unimplemented for CPU, so I put the test in the TPU CI script instead.
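For reference, a minimal sketch of what such a TPU-only check could look like (the test body is illustrative, not the actual CI script; the field names come from the example output above):

```python
import torch_xla
import torch_xla.core.xla_model as xm

def test_memory_info():
    # Query the private pybind for the current device; on CPU this
    # call is unimplemented, so this test only runs in the TPU CI job.
    info = torch_xla._XLAC._xla_memory_info(str(xm.xla_device()))
    assert 'bytes_used' in info and 'bytes_limit' in info
    assert 0 <= info['bytes_used'] <= info['bytes_limit']
```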

@JackCaoG
Collaborator

xm.get_memory_info also works, right? Since it calls that pybind.

@will-cromar
Collaborator Author

> xm.get_memory_info also works, right? Since it calls that pybind.

Good catch. That didn't show up when I searched for usages in VS Code initially because I had a setting wrong. I'll update that docstring and give the return value a type.
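For illustration, typing the return value could look something like this `TypedDict` sketch (field names mirror the example output above; the actual change lives in the PR diff):

```python
from typing import TypedDict

import torch_xla

class MemoryInfo(TypedDict):
    bytes_used: int
    bytes_limit: int

def get_memory_info(device) -> MemoryInfo:
    """Return memory usage for the given device via the private pybind."""
    return torch_xla._XLAC._xla_memory_info(str(device))
```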

@will-cromar will-cromar merged commit 5409cd5 into master May 21, 2024
qihqi pushed a commit that referenced this pull request May 29, 2024
