Nvidia 470 datacenter drivers
Note: Desktop drivers and datacenter drivers are different.
Debian
This hasn't yet been tested. If you have tested it, please open a PR to update this section.
Add apt repos
You can skip this if you already have the repositories enabled.
To add the non-free and contrib repos, edit /etc/apt/sources.list and add non-free contrib to the end of each line, like this:
deb http://deb.debian.org/debian/ bullseye main non-free contrib
deb-src http://deb.debian.org/debian/ bullseye main non-free contrib
Then, run apt update
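If you'd rather not edit the file by hand, the edit above can be sketched as a small filter. This is only a sketch: add_components is a hypothetical helper name, and it leaves alone any line that already mentions non-free, so review the output before replacing the real file.

```shell
# Hypothetical helper: reads sources.list content on stdin, writes the result to stdout.
# Appends "non-free contrib" to deb/deb-src lines that do not already mention non-free.
add_components() {
    sed -E \
        -e '/^deb(-src)?[[:space:]]/!b' \
        -e '/non-free/b' \
        -e 's/[[:space:]]*$/ non-free contrib/'
}
```

For example, run `add_components < /etc/apt/sources.list > /tmp/sources.list.new`, then compare the two files before copying the new one into place.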
Installation
To install the driver:
apt install nvidia-tesla-470-driver
And to install CUDA:
apt install nvidia-cuda-dev nvidia-cuda-toolkit
Links
Fedora
This guide uses the RPM Fusion repositories, and if you install CUDA, it uses Nvidia's repositories as well. It is only compatible with Fedora 35+; I'm not sure about RHEL versions.
Add RPM Fusion repository
You can skip this if you already have the repository installed.
To add the RPM Fusion repository:
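# Add gpg keys
sudo dnf install distribution-gpg-keys
sudo rpmkeys --import /usr/share/distribution-gpg-keys/rpmfusion/RPM-GPG-KEY-rpmfusion-free-fedora-$(rpm -E %fedora)
sudo rpmkeys --import /usr/share/distribution-gpg-keys/rpmfusion/RPM-GPG-KEY-rpmfusion-nonfree-fedora-$(rpm -E %fedora)
# Add repos with gpg check (the Nvidia driver packages are in the nonfree repo)
sudo dnf --setopt=localpkg_gpgcheck=1 install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm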
Install Driver
First, update everything, and reboot if you're not on the latest kernel.
dnf update -y
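One way to check whether a reboot is needed is to compare the running kernel with the most recently installed one. This is a minimal sketch, assuming the kernel-core package name and that the last-installed kernel is also the newest; both helper names are hypothetical.

```shell
# Hypothetical helper: succeeds if the running kernel differs from the newest installed one.
# $1: running kernel release (e.g. from `uname -r`)
# $2: newest installed kernel release
kernel_reboot_needed() {
    [ "$1" != "$2" ]
}

# Hypothetical helper: reads `rpm -q kernel-core --last` output on stdin and prints
# the release of the first (most recently installed) entry, without the package name.
newest_kernel_from_rpm() {
    head -n1 | awk '{print $1}' | sed 's/^kernel-core-//'
}
```

Usage: `kernel_reboot_needed "$(uname -r)" "$(rpm -q kernel-core --last | newest_kernel_from_rpm)" && echo "reboot first"`.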
Then, install the driver:
dnf install akmod-nvidia-470xx
Do not reboot yet.
Before rebooting, use top or ps to make sure there is no akmods, cc*, kthreadd, or gcc* process running (* is either nothing or a number), or anything using tons of CPU that you don't expect.
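The check above could be scripted roughly like this. It is a sketch, not a guarantee that the module build has finished: build_in_progress is a hypothetical helper, and it deliberately leaves out kthreadd, since that kernel thread is always present on Linux and only matters if it is busy.

```shell
# Hypothetical helper: reads process names (one per line) on stdin.
# Succeeds if any name is akmods, cc, or gcc, optionally followed by digits.
build_in_progress() {
    grep -Eq '^(akmods|g?cc[0-9]*)$'
}
```

Usage: `while ps -eo comm= | build_in_progress; do sleep 10; done` waits until none of those processes are left. It's still worth a quick look at top for anything else using a lot of CPU.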
Note: nvidia-smi and other tools are not included with the driver. For those, you need to install CUDA.
Install CUDA
Install packages needed for CUDA with:
export FEDORA_VERSION=$(rpm -E %fedora) # Nvidia's repo doesn't support Fedora 38 yet, so change this to 37 if you're on Fedora 38
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora${FEDORA_VERSION}/x86_64/cuda-fedora${FEDORA_VERSION}.repo
dnf clean all
dnf module disable nvidia-driver
dnf -y install cuda
Note: Don't re-enable the nvidia-driver module.
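Picking the repo version by hand can be avoided with a small helper that caps the Fedora version. This is a sketch under assumptions: cuda_repo_url and MAX_CUDA_FEDORA are hypothetical names, 37 comes from the Fedora 38 comment in the commands above, and the URL layout assumes that both the directory and the .repo filename carry the Fedora version.

```shell
# Assumption: newest Fedora release that Nvidia's CUDA repo supports; adjust as needed.
MAX_CUDA_FEDORA=37

# Hypothetical helper: prints the CUDA repo URL for Fedora version $1,
# falling back to MAX_CUDA_FEDORA when $1 is newer than that.
cuda_repo_url() {
    v=$1
    if [ "$v" -gt "$MAX_CUDA_FEDORA" ]; then
        v=$MAX_CUDA_FEDORA
    fi
    echo "https://developer.download.nvidia.com/compute/cuda/repos/fedora${v}/x86_64/cuda-fedora${v}.repo"
}
```

Usage: `dnf config-manager --add-repo "$(cuda_repo_url "$(rpm -E %fedora)")"`.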
Problems
Suspend Issues
I had issues with my K80 not working after the machine was suspended. For example, torch.cuda.is_available() would report an error and return False instead of True.
To fix this, install xorg-x11-drv-nvidia-470xx-power:
dnf install xorg-x11-drv-nvidia-470xx-power
CUDA is a higher version than the driver
Sometimes the driver in the CUDA repo, and therefore the driver version that CUDA depends on, is newer than the installed driver. To fix this, run:
dnf module enable nvidia-driver -y && dnf download cuda-drivers && dnf module disable nvidia-driver -y
rpm -Uvh cuda-drivers*.rpm --nodeps
dnf update
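To tell whether you are in this situation, you can compare the two version strings. This is a minimal sketch that relies on GNU sort -V; version_lt is a hypothetical helper, and where you read the two versions from is up to you.

```shell
# Hypothetical helper: succeeds if version $1 is strictly older than version $2.
# Relies on GNU sort -V for version-aware ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}
```

Usage: `version_lt "$(modinfo -F version nvidia)" "$wanted" && echo "driver is older than CUDA expects"`, where `wanted` is a variable you set yourself to the cuda-drivers version shown by `dnf info cuda-drivers`.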
More stuff
Why not install xorg-x11-drv-nvidia-470xx?
- That's the display driver, not the data center driver. It has the same version number, but it is not the same driver.