Hi everybody,
To be honest, I'm a newbie. I'm trying to install all of the NVIDIA packages that TensorFlow requires, including CUDA and cuDNN.
My CUDA version is 13.0, my cuDNN version is 9.13.1, and my TensorFlow version is 2.20.0.
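For reference, here is a small check I plan to run to see which CUDA and cuDNN versions my TensorFlow wheel was actually built against, so I can compare them with what I installed (I'm assuming tf.sysconfig.get_build_info() is the right way to read this; that's just what I found in the docs):
```
import tensorflow as tf

# Report the CUDA/cuDNN versions this TensorFlow build was compiled against,
# to compare with the system-wide CUDA 13.0 / cuDNN 9.13.1 install.
build = tf.sysconfig.get_build_info()
print("is_cuda_build:", build.get("is_cuda_build"))
print("built against CUDA:", build.get("cuda_version"))
print("built against cuDNN:", build.get("cudnn_version"))
```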
I have tried running my script many times, but it always fails with output like this:
```
python3 bitcoin.py
2025-11-03 14:10:21.838224: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-11-03 14:10:21.882288: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-11-03 14:10:22.887128: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2.20.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1762179023.723233 106736 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/layers/reshaping/flatten.py:37: UserWarning: Do not pass an input_shape/input_dim argument to a layer. When using Sequential models, prefer using an Input(shape) object as the first layer in the model instead.
super().__init__(**kwargs)
W0000 00:00:1762179024.002837 106736 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1762179024.156613 106736 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13270 MB memory: -> device: 0, name: NVIDIA GeForce RTX 5060 Ti, pci bus id: 0000:01:00.0, compute capability: 12.0
2025-11-03 14:10:24.577763: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'
2025-11-03 14:10:24.577810: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-11-03 14:10:24.577846: W tensorflow/core/framework/op_kernel.cc:1842] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-11-03 14:10:24.577881: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
Traceback (most recent call last):
File "/home/crush_dpl/cs50p/week4/bitcoin/bitcoin.py", line 14, in <module>
tf.keras.layers.Dropout(0.2),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/layers/regularization/dropout.py", line 53, in __init__
self.seed_generator = backend.random.SeedGenerator(seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/random/seed_generator.py", line 87, in __init__
self.state = self.backend.Variable(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/backend/common/variables.py", line 206, in __init__
self._initialize_with_initializer(initializer)
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 52, in _initialize_with_initializer
self._initialize(lambda: initializer(self._shape, dtype=self._dtype))
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 42, in _initialize
self._value = tf.Variable(
^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 52, in <lambda>
self._initialize(lambda: initializer(self._shape, dtype=self._dtype))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/random/seed_generator.py", line 84, in seed_initializer
return self.backend.convert_to_tensor([seed, 0], dtype=dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/crush_dpl/.local/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 152, in convert_to_tensor
return tf.cast(x, dtype)
^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InternalError: {{function_node _wrapped__Cast_device/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Cast] name:
```
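If it helps, I think I can also confirm what compute capability TensorFlow sees for this card with something like the snippet below (I'm going by the docs for tf.config.experimental.get_device_details, so please correct me if that is not the right call):
```
import tensorflow as tf

# Ask TensorFlow what it knows about the first GPU it found; on this machine it
# should report the RTX 5060 Ti and a compute capability of (12, 0).
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print(details.get('device_name'), details.get('compute_capability'))
```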
This is my code, just a simple one:
```
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```
What should I do to fix this? Or should I downgrade my cuDNN and CUDA?
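In the meantime, I assume I can at least keep working on the CPU by hiding the GPU before TensorFlow initializes; this is not a real fix, just a way to confirm that the script itself runs:
```
import os

# Hide every GPU from CUDA *before* importing TensorFlow so it falls back to the CPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # should now print an empty list
```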
Many thanks!