PyTorch utilizes the CPU instead of the GPU

When I try to use CUDA, whether for training a neural network or just for a simple calculation, PyTorch utilizes the CPU instead of the GPU:

Python 3.8.3 (default, Jun 25 2020, 23:21:14)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
>>> device
device(type='cuda', index=0)
>>> tensor = torch.rand(1, 1, 10).to(device)
>>> tensor
tensor([[[0.1126, 0.1737, 0.9678, 0.8833, 0.6923, 0.2118, 0.9874, 0.9397,
          0.4831, 0.4274]]], device='cuda:0')
>>> tensor_two = tensor + tensor
>>> tensor_two
tensor([[[0.2252, 0.3474, 1.9356, 1.7666, 1.3847, 0.4236, 1.9747, 1.8794,
          0.9661, 0.8549]]], device='cuda:0')
>>> while True:
...     tensor_two = tensor + tensor
...
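As a sanity check, a heavier workload along these lines (a rough sketch, not part of the original session; matrix size and loop count are arbitrary) should drive nvidia-smi utilization well above 0% when CUDA is actually doing the work, since the tiny 1x1x10 additions above are dominated by Python and kernel-launch overhead:

# Heavier GPU workload sketch (assumed sizes, not from this thread); large
# matmuls keep the GPU busy long enough to show up in nvidia-smi.
import time
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
a = torch.rand(4096, 4096, device=device)
b = torch.rand(4096, 4096, device=device)

start = time.time()
for _ in range(200):
    c = a @ b                      # large matrix multiplications on the GPU
torch.cuda.synchronize()           # wait for queued kernels before timing
print(f"device={c.device}, elapsed={time.time() - start:.2f}s")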

nvidia-smi output under Windows:

Sun Jun 28 10:46:36 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.41       Driver Version: 455.41       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    WDDM | 00000000:01:00.0  On |                  N/A |
| 44%   39C    P2    33W / 151W |   1394MiB /  8192MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A          4      C   Insufficient Permissions          N/A   |
|    0   N/A  N/A       1268    C+G   ...lPanel\SystemSettings.exe      N/A   |
|    0   N/A  N/A       1468    C+G   C:\Windows\System32\dwm.exe       N/A   |
|    0   N/A  N/A       3784    C+G   ...bbwe\Microsoft.Photos.exe      N/A   |
|    0   N/A  N/A       8844    C+G   ...5n1h2txyewy\SearchApp.exe      N/A   |
|    0   N/A  N/A       8892    C+G   ...artMenuExperienceHost.exe      N/A   |
|    0   N/A  N/A      10380    C+G   ...ropbox\Client\Dropbox.exe      N/A   |
|    0   N/A  N/A      14504    C+G   ...y\ShellExperienceHost.exe      N/A   |
|    0   N/A  N/A      15984    C+G   ...8bbwe\WindowsTerminal.exe      N/A   |
|    0   N/A  N/A      19380    C+G   ...2txyewy\TextInputHost.exe      N/A   |
|    0   N/A  N/A      28380    C+G   ...b3d8bbwe\WinStore.App.exe      N/A   |
+-----------------------------------------------------------------------------+

Task manager:

WSL version:
Linux version 4.19.121-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Fri Jun 19 21:06:10 UTC 2020

Ubuntu 20.04 LTS

DxDiag.txt (114.6 KB)

The same code on Ubuntu PC:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   46C    P2    63W / 250W |    663MiB / 11178MiB |     19%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   39C    P8    11W / 250W |     12MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:04:00.0 Off |                  N/A |
|  0%   43C    P8    12W / 250W |     12MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:05:00.0 Off |                  N/A |
|  0%   43C    P8    13W / 250W |     12MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       942      G   /usr/lib/xorg/Xorg                            25MiB |
|    0      1239      G   /usr/bin/gnome-shell                          57MiB |
|    0      5591      C   ...vlad/.pyenv/versions/pytorch/bin/python   567MiB |
+-----------------------------------------------------------------------------+

Same thing here. I use a Surface Book 2, Linux kernel 4.19.121, Ubuntu and Miniconda.
While the GPU is detected by PyTorch, it is not used during training.
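One thing worth ruling out (a generic placement check sketched below, with a hypothetical model, not the actual training code from this thread): a model or batch that never gets a .to(device) call will silently keep the whole training loop on the CPU even though CUDA is available.

# Generic device-placement check (hypothetical model and batch, assumption only).
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 2).to(device)          # model parameters moved to the GPU
print(next(model.parameters()).device)       # should print cuda:0

batch = torch.rand(32, 10).to(device)        # each input batch must be moved too
print(batch.device)                          # should also print cuda:0

out = model(batch)                           # runs on the GPU only if both
print(out.device)                            # the model and the input live there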


The TensorFlow Docker container doesn’t utilize the GPU either.

Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 39s 642us/sample - loss: 0.4938 - accuracy: 0.8272
Epoch 2/10
60000/60000 [==============================] - 27s 447us/sample - loss: 0.3753 - accuracy: 0.8636
Epoch 3/10
60000/60000 [==============================] - 26s 436us/sample - loss: 0.3360 - accuracy: 0.8761
Epoch 4/10
60000/60000 [==============================] - 27s 458us/sample - loss: 0.3117 - accuracy: 0.8863
Epoch 5/10
60000/60000 [==============================] - 33s 547us/sample - loss: 0.2945 - accuracy: 0.8915
Epoch 6/10
60000/60000 [==============================] - 34s 571us/sample - loss: 0.2815 - accuracy: 0.8951
Epoch 7/10
60000/60000 [==============================] - 37s 615us/sample - loss: 0.2700 - accuracy: 0.8998
Epoch 8/10
60000/60000 [==============================] - 38s 626us/sample - loss: 0.2591 - accuracy: 0.9039
Epoch 9/10
60000/60000 [==============================] - 39s 643us/sample - loss: 0.2498 - accuracy: 0.9059
Epoch 10/10
60000/60000 [==============================] - 36s 595us/sample - loss: 0.2404 - accuracy: 0.9105
[W 18:12:36.300 NotebookApp] Notebook tensorflow-tutorials/classification.ipynb is not trusted
[I 18:12:38.682 NotebookApp] Kernel started: 9eb86799-5559-4d34-ba36-7edd178525c4
2020-07-09 18:12:53.102083: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-07-09 18:12:53.103647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-07-09 18:13:51.119733: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-09 18:13:51.220018: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.220439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.683GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2020-07-09 18:13:51.220576: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-09 18:13:51.220709: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-09 18:13:51.222465: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-09 18:13:51.223247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-09 18:13:51.225612: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-09 18:13:51.228846: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-09 18:13:51.229013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-09 18:13:51.230293: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.231493: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.231930: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-09 18:13:51.240228: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3410010000 Hz
2020-07-09 18:13:51.241999: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56b3460 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-09 18:13:51.242058: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-09 18:13:51.689421: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.689939: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56465a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-09 18:13:51.689973: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1070, Compute Capability 6.1
2020-07-09 18:13:51.691056: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.691691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.683GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2020-07-09 18:13:51.691838: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-09 18:13:51.691882: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-09 18:13:51.691952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-09 18:13:51.691996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-09 18:13:51.692096: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-09 18:13:51.692196: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-09 18:13:51.692287: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-09 18:13:51.695324: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.697101: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:51.697587: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-09 18:13:51.697738: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-09 18:13:52.656161: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-09 18:13:52.656231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-07-09 18:13:52.656282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-07-09 18:13:52.657540: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:52.657991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1324] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2020-07-09 18:13:52.659207: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node Your kernel may have been built without NUMA support.
2020-07-09 18:13:52.660162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6835 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-07-09 18:14:10.154011: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 6.67G (7167590400 bytes) from device: CUDA_ERROR_UNKNOWN: unknown error
2020-07-09 18:14:25.532223: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 6.01G (6450831360 bytes) from device: CUDA_ERROR_UNKNOWN: unknown error
[I 18:14:38.780 NotebookApp] Saving file at /tensorflow-tutorials/classification.ipynb
2020-07-09 18:14:39.436606: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 5.41G (5805748224 bytes) from device: CUDA_ERROR_UNKNOWN: unknown error
2020-07-09 18:15:30.640365: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10

TensorFlow Docker containers should work. Would you mind describing the exact container you tried to run (the actual docker command line) and how you determined it wasn’t utilizing the GPU?

I ran the commands from the WSL user guide.
Docker command line:
docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
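Inside that container, a quick check along these lines (a sketch assuming the TF 2.x API shipped in the image) shows whether TensorFlow even enumerates the GPU, independent of what Task Manager reports:

# Run inside the container; only confirms that TensorFlow *sees* the GPU,
# not that training actually uses it.
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices("GPU"))   # expect a non-empty list with GPU:0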

CPU/GPU utilization was monitored with Task Manager and nvidia-smi under Windows.
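Another way to watch utilization is to poll NVML directly (a sketch using the pynvml package, installed via nvidia-ml-py3; this is an assumption on my part, and NVML may not be exposed inside early WSL2 CUDA preview builds, so it may need to run on the Windows or native-Linux side):

# Polls GPU utilization and memory use once per second via NVML.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    for _ in range(30):
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu={util.gpu}%  mem={mem.used / 2**20:.0f}MiB")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()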


I just ran the above docker command and checked my GPU usage while running the samples; it stayed at 0% throughout:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.12       Driver Version: 465.12       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3080    WDDM | 00000000:40:00.0 Off |                  N/A |
|  0%   60C    P2   111W / 340W |   9764MiB / 10240MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I’m hoping to be able to use PyTorch with CUDA on Windows in the future.

Off-topic: when I run the Python commands @vvodan posted in the first post I get the same results: PyTorch can see the GPU and enables CUDA without actually using it, even though I’m not running Python in a Docker container (just Anaconda). Does this mean I have access to the GPU without running Docker? If not, does that mean I have to use a prebuilt Docker image to eventually speed up ML?
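A rough way to test that, sketched below with arbitrary sizes: time the same large matrix multiplication on the CPU and on cuda in the Anaconda environment; a large gap between the two timings means the GPU really is doing the work without Docker.

# CPU-vs-GPU timing sanity check (sizes and iteration count are arbitrary).
import time
import torch

def time_matmul(device, n=4096, iters=10):
    a = torch.rand(n, n, device=device)
    b = torch.rand(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()   # make sure setup work has finished
    start = time.time()
    for _ in range(iters):
        c = a @ b
    if device.type == "cuda":
        torch.cuda.synchronize()   # wait for queued GPU kernels before stopping the clock
    return time.time() - start

print("cpu :", time_matmul(torch.device("cpu")))
if torch.cuda.is_available():
    print("cuda:", time_matmul(torch.device("cuda:0")))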

–Chris