Installing CUDA support for TensorFlow from the Ubuntu 20.04 repository


What is CUDA?

CUDA is Nvidia's parallel computing platform and API. It lets you use your GPU for general-purpose processing (in other words, use your GPU to do some AI stuff).

How do I use it?

I decided to post this after banging my head against the wall while troubleshooting. If you have an Nvidia card, check on the official Nvidia page whether it supports CUDA. If yours is supported, read on.

Versions are important!

At the time of writing, I am using Ubuntu 20.04 (KDE Neon), and I have the Nvidia driver installed from the Ubuntu repository. Since I don't want to add an external PPA, I decided to install CUDA support using the packages available in the repository:

  sudo apt update
  sudo apt install nvidia-cuda-toolkit

Now, this information is really important and I have not seen it in any tutorial online. The current CUDA version is 11, but the one currently available in Ubuntu 20.04 is 10.1.
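
To confirm which release the repository actually gave you, you can read it from the nvcc --version output. Here is a minimal Python sketch. The helper names parse_nvcc_release and installed_cuda_release are mine, and I am assuming the output contains a "release X.Y" fragment, as it does with the 10.1 toolkit.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return the CUDA release (e.g. "10.1") found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run nvcc if it is on PATH and return its release string, otherwise None."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", installed_cuda_release())
```

If this prints None, nvcc is either not on your PATH or its output did not have the format I assumed.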

You'll also need to install cuDNN, but this one we'll have to download from Nvidia.

The thing is that each TensorFlow release requires specific versions of CUDA and cuDNN to work.

Check this table and make sure to download the cuDNN release that matches your CUDA version. Later, using a Python virtual environment, we will install the TensorFlow version that matches both the CUDA and cuDNN versions.
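
To make the matching step concrete, here is a small sketch of the lookup. The TESTED_BUILDS dictionary holds only a few rows that I copied by hand from TensorFlow's tested build configurations, and the function name is hypothetical, so treat the table itself as the authority.

```python
# A few (CUDA, cuDNN) pairs transcribed by hand from TensorFlow's tested
# build configurations. This is not the full table; verify before relying on it.
TESTED_BUILDS = {
    "2.2.0": ("10.1", "7.6"),
    "2.3.0": ("10.1", "7.6"),
    "2.4.0": ("11.0", "8.0"),
}


def tensorflow_versions_for(cuda, cudnn=None):
    """Return the listed TensorFlow versions built against the given CUDA (and cuDNN)."""
    matches = []
    for tf_version, (tf_cuda, tf_cudnn) in TESTED_BUILDS.items():
        if tf_cuda == cuda and (cudnn is None or tf_cudnn == cudnn):
            matches.append(tf_version)
    return sorted(matches)


if __name__ == "__main__":
    # CUDA 10.1 is what the Ubuntu 20.04 repository ships.
    print(tensorflow_versions_for("10.1"))
```

With CUDA 10.1 from the repository, the candidates in this short list are the releases paired with cuDNN 7.6.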

But first, extract the cuDNN archive (I just extracted it in my Downloads folder). Then move the following files to the corresponding system directories and adjust the permissions:

  cd ~/Downloads
  sudo cp cuda/include/cudnn.h /usr/lib/cuda/include/
  sudo cp cuda/lib64/libcudnn* /usr/lib/cuda/lib64/
  sudo chmod a+r /usr/lib/cuda/include/cudnn.h /usr/lib/cuda/lib64/libcudnn*
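
After copying, it is worth double-checking which cuDNN version ended up in place. The sketch below reads the version macros from the header. It assumes a cuDNN 7.x header, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn.h itself (cuDNN 8 moved them to cudnn_version.h). The function name is mine.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Return "major.minor.patch" from cuDNN header text, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#\s*define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    # Path used in this post; adjust it if you copied the header elsewhere.
    header = Path("/usr/lib/cuda/include/cudnn.h")
    if header.exists():
        print("cuDNN version:", parse_cudnn_version(header.read_text(errors="ignore")))
    else:
        print("cudnn.h not found at", header)
```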

We also need to add the directory that holds the CUDA libraries to LD_LIBRARY_PATH so the dynamic linker can find them (open a new terminal afterwards for the change to take effect):

  echo 'export LD_LIBRARY_PATH=/usr/lib/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
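
A quick way to check that the variable took effect in your current shell is to look for the directory that holds libcudnn (/usr/lib/cuda/lib64, where the files were copied above) among its entries. This is only a sketch, and ld_path_contains is a name I made up.

```python
import os


def ld_path_contains(directory, ld_library_path):
    """Return True if `directory` is one of the entries in an LD_LIBRARY_PATH-style string."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in ld_library_path.split(os.pathsep) if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("lib64 on LD_LIBRARY_PATH:", ld_path_contains("/usr/lib/cuda/lib64", current))
```

If it prints False, open a new terminal or run source ~/.bashrc and try again.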

Now we can install the TensorFlow version that matches CUDA 10.1 (remember to use a Python virtual environment!).

  pip3 install tensorflow==2.3.0

Testing

If all went well, nvidia-smi should show your GPU and driver, and nvcc --version should report the installed CUDA release.

Activate your Python environment and launch the Python console. Executing the following lines should not result in any error message, and the second line should return a non-empty list:

  import tensorflow as tf
  tf.config.list_physical_devices("GPU")
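
If you prefer a script over the console, the same check can be wrapped so that it tells apart "TensorFlow is missing" from "TensorFlow works but sees no GPU". This is a minimal sketch, and detect_gpus is a name I picked.

```python
def detect_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = detect_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("TensorFlow imported, but no GPU was found; check the CUDA and cuDNN versions.")
    else:
        print("GPUs visible to TensorFlow:", gpus)
```

An empty list usually means one of the libraries could not be loaded, and TensorFlow's import messages normally say which one.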

That's it.