I installed the version of Cuda-8.0 and Tensorflow GPU on ubuntu 16.04. It worked just fine using the GPU. But suddenly he stopped using the GPU. I installed shadoworflow via pip and correctly executed the GPU version when it worked, and used the GPU initially.
The message I get when importing shadoworflow:
>>> import tensorflow as tf I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcudnn.so.5. LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/lib/x86_64-linux-gnu I tensorflow/stream_executor/cuda/cuda_dnn.cc:3517] Unable to load cuDNN DSO I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
So it is clear that he can even find the cuda library from LD_LIBRARY_PATH. But when I get the following output:
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUDA_ERROR_UNKNOWN I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: naman-pc I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: naman-pc I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 375.39.0 I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:363] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.39 Tue Jan 31 20:47:00 PST 2017 GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) """ I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 375.39.0 I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 375.39.0 Device mapping: no known devices. I tensorflow/core/common_runtime/direct_session.cc:257] Device mapping:
Therefore, he cannot find the GPU. nvidia-smi gives the following result:
+-----------------------------------------------------------------------------+ | NVIDIA-SMI 375.39 Driver Version: 375.39 | |-------------------------------+----------------------+----------------------+ | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+======================+======================| | 0 Graphics Device Off | 0000:01:00.0 On | N/A | | 23% 41C P8 11W / 250W | 337MiB / 11169MiB | 0% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |=============================================================================| | 0 1005 G /usr/lib/xorg/Xorg 197MiB | | 0 2032 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 89MiB | | 0 30355 G compiz 37MiB | +-----------------------------------------------------------------------------+
I looked at other links in stackoverflow, but they mostly ask to check LD_LIBRARY_PATH or nvidia-smi. For me both are expected, therefore they cannot understand the problem.
EDIT: I tried installing cudnn 5 and putting it in LD_LIBRARY_PATH, tenorflow reads it successfully, but still the same error when creating a session.
source share