Concepts
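- NVIDIA driver: the GPU driver; once it is installed, the nvidia-smi command is available.
- CUDA: a parallel computing platform and programming model that makes general-purpose computing on GPUs simple and elegant. It can be viewed both as a programming language and as an API.
- Running a CUDA application requires at least one CUDA-capable GPU and a driver compatible with the CUDA Toolkit.
- CUDA Toolkit (NVIDIA): the complete toolkit installer. It offers the NVIDIA driver and the CUDA development tools as installable options, including the compiler, IDE, and debugger for CUDA programs, plus the CUDA libraries and their header files.
- CUDA Toolkit (PyTorch): a partial toolkit that mainly contains the dynamic libraries needed at run time by CUDA features. It does not install the driver and can only run already-compiled CUDA programs.
- NVCC: the CUDA compiler; it is just one part of the CUDA Toolkit.
- cuDNN: a GPU-accelerated library that NVIDIA designed for the basic operations of deep neural networks. It provides highly tuned implementations of standard routines such as the forward and backward passes of convolution, pooling, normalization, and activation layers.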
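The split between the driver and the full toolkit can be checked from a shell: nvidia-smi comes with the driver, nvcc only with the full CUDA Toolkit. The following is a minimal sketch, not part of the original notes; `check_tool` is a hypothetical helper name.

```shell
#!/bin/sh
# check_tool NAME (hypothetical helper): report whether NAME is on PATH.
# Prints "NAME: found at PATH" or "NAME: missing".
check_tool() {
    if path=$(command -v "$1" 2>/dev/null); then
        printf '%s: found at %s\n' "$1" "$path"
    else
        printf '%s: missing\n' "$1"
    fi
}

# nvidia-smi ships with the driver; nvcc ships with the full CUDA Toolkit.
check_tool nvidia-smi
check_tool nvcc
```

A machine that reports nvidia-smi as found but nvcc as missing has the driver without the full toolkit, or the toolkit's bin directory is not on PATH yet.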
CUDA Environment Installation
Base Server Environment Setup
Basic Software
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install net-tools -y
sudo apt-get install tree -y
sudo apt-get install vim -y
sudo apt-get install gcc -y
sudo apt-get install g++ -y
SSH and FTP (openssh-server already provides SFTP; vsftpd adds an FTP server)
sudo apt-get install openssh-server -y
sudo apt-get install vsftpd
sudo vim /etc/vsftpd.conf
'''
local_enable=YES
write_enable=YES
'''
sudo /etc/init.d/vsftpd restart
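The two vsftpd.conf settings above can also be applied without opening an editor. This is a minimal sketch under the assumption that GNU sed is available (as on Ubuntu); `set_conf_key` is a hypothetical helper, not a vsftpd tool.

```shell
#!/bin/sh
# set_conf_key FILE KEY VALUE (hypothetical helper)
# Rewrite every existing KEY= line in FILE, commented out or not,
# to KEY=VALUE; append KEY=VALUE if no such line exists.
set_conf_key() {
    file=$1 key=$2 value=$3
    if grep -Eq "^#?${key}=" "$file"; then
        sed -i -E "s|^#?${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (the real file needs root, so run these in a root shell):
#   set_conf_key /etc/vsftpd.conf local_enable YES
#   set_conf_key /etc/vsftpd.conf write_enable YES
```

Restart vsftpd afterwards so the new settings take effect.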
Anaconda
wget https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh
chmod +x Anaconda3-5.3.1-Linux-x86_64.sh
bash Anaconda3-5.3.1-Linux-x86_64.sh
vim ~/.bashrc
'''
export PATH="/home/<username>/anaconda3/bin:$PATH"
'''
source ~/.bashrc
conda config --set ssl_verify false
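Editing ~/.bashrc by hand works once, but a setup script that is re-run would add the same export again. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper, and the `$HOME/anaconda3` location in the usage comment assumes the installer's default target directory.

```shell
#!/bin/sh
# append_once FILE LINE (hypothetical helper)
# Append LINE to FILE unless an identical line is already present,
# so re-running the setup does not duplicate the entry.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example (assumes Anaconda was installed to its default location):
#   append_once "$HOME/.bashrc" 'export PATH="$HOME/anaconda3/bin:$PATH"'
```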
Replace apt Sources
sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
sudo rm -rf /etc/apt/sources.list
sudo vim /etc/apt/sources.list
'''
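# The codename must match the installed release (check with lsb_release -cs): trusty is Ubuntu 14.04, bionic is Ubuntu 18.04.
deb http://mirrors.aliyun.com/ubuntu/ trusty main restricted universe multiverse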
deb http://mirrors.aliyun.com/ubuntu/ trusty-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ trusty-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ trusty-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ trusty-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ trusty main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ trusty-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ trusty-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ trusty-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ trusty-backports main restricted universe multiverse
'''
sudo apt-get update
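Because the ten source lines differ only in the package type and the suite suffix, they can be generated for whatever release is actually installed. This is a sketch, not part of the original notes; `gen_sources` is a hypothetical helper and the mirror URL defaults to the one used above.

```shell
#!/bin/sh
# gen_sources CODENAME [MIRROR] (hypothetical helper)
# Print deb and deb-src lines for the main, -security, -updates,
# -proposed and -backports suites of the given release codename.
gen_sources() {
    codename=$1
    mirror=${2:-http://mirrors.aliyun.com/ubuntu/}
    for kind in deb deb-src; do
        for suite in "" -security -updates -proposed -backports; do
            printf '%s %s %s%s main restricted universe multiverse\n' \
                "$kind" "$mirror" "$codename" "$suite"
        done
    done
}

# Example: write the list for the running release (needs root):
#   gen_sources "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list
```

Called with `trusty`, the function reproduces the ten lines listed above; called with the output of `lsb_release -cs`, it avoids pointing an 18.04 system at 14.04 packages.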
Git
sudo apt-get install git -y
ssh-keygen
cat /home/<username>/.ssh/id_rsa.pub
NVIDIA Driver Installation
- Add the NVIDIA driver repository
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
- Choose a driver version and install it
sudo ubuntu-drivers devices
This lists the available driver versions, for example:
driver : nvidia-410 - third-party free
driver : nvidia-415 - third-party free
driver : nvidia-418 - third-party free
driver : nvidia-384 - distro non-free
driver : nvidia-430 - third-party free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
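Install the recommended version automatically:
sudo ubuntu-drivers autoinstall
Or install a specific version instead, for example 410 (run only one of the two commands):
sudo apt install nvidia-410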
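The recommended driver can also be picked out of the `ubuntu-drivers devices` listing by a script. A minimal sketch, assuming the line layout shown in the sample output above; `recommended_driver` is a hypothetical helper.

```shell
#!/bin/sh
# recommended_driver (hypothetical helper)
# Read `ubuntu-drivers devices` output on stdin and print the package
# name from the first "driver" line whose last word is "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3; exit }'
}

# Example:
#   sudo apt install "$(ubuntu-drivers devices | recommended_driver)"
```

For the sample listing above this selects nvidia-430.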
- Reboot, then check the GPU with any of the following commands
nvidia-smi
watch -n 1 nvidia-smi
nvidia-smi -l 1
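Avoid polling nvidia-smi with watch. watch starts a new nvidia-smi process (a new PID) every interval and closes it after each read, which can interfere with CUDA operations such as cudaMalloc. Use nvidia-smi -l x or nvidia-smi --loop=xxx instead; it stays a single process with one PID for as long as it runs.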
CUDA Toolkit Installation
Use one of the two methods below, not both.
Method 1: local deb installer
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda-repo-ubuntu1804-11-6-local_11.6.2-510.47.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-6-local_11.6.2-510.47.03-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-6-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
Method 2: network repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda
sudo reboot
nvidia-smi
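Since a CUDA application needs a driver compatible with the toolkit, it can help to compare the installed driver version against a threshold after the reboot. The sketch below relies on `sort -V` from GNU coreutils; `version_ge` is a hypothetical helper, and using 510.47.03 (the driver version in the local installer's file name) as the threshold is an assumption, not an official minimum.

```shell
#!/bin/sh
# version_ge A B (hypothetical helper)
# Succeed when dotted version A is greater than or equal to B.
# Version-sorts the two values and checks that B comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (threshold taken from the installer file name, an assumption):
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_ge "$installed" 510.47.03 && echo "driver is new enough"
```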
vim ~/.bashrc
'''
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
'''
source ~/.bashrc
nvcc -V
Official download page: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04&target_type=deb_network
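The `nvcc -V` check can be turned into a value a script can compare, for example to confirm that the toolkit on PATH is the release that was just installed. A minimal sketch, assuming the usual "release X.Y," wording in the nvcc banner; `nvcc_release` is a hypothetical helper.

```shell
#!/bin/sh
# nvcc_release (hypothetical helper)
# Read `nvcc -V` output on stdin and print the release number that
# follows the word "release" (digits and dots up to the next comma).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Example:
#   nvcc -V | nvcc_release
```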
cuDNN Installation
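This URL sits behind the NVIDIA Developer login, so a plain wget may fail; in that case download the file in a browser from the page linked below. The 10.2 in the path appears to name the CUDA 10.2 build, so pick the build that matches the installed CUDA version.
wget https://developer.nvidia.com/compute/cudnn/secure/8.4.0/local_installers/10.2/cudnn-local-repo-ubuntu1804-8.4.0.27_1.0-1_amd64.deb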
sudo dpkg -i cudnn-local-repo-ubuntu1804-8.4.0.27_1.0-1_amd64.deb
https://developer.nvidia.com/rdp/cudnn-download
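Once the cuDNN development package is installed, its version can be read from the cudnn_version.h header. The sketch below is not part of the original notes; `cudnn_version` is a hypothetical helper, and the header path in the usage comment is an assumption that depends on how cuDNN was installed.

```shell
#!/bin/sh
# cudnn_version HEADER (hypothetical helper)
# Print MAJOR.MINOR.PATCH from the CUDNN_MAJOR, CUDNN_MINOR and
# CUDNN_PATCHLEVEL defines in a cudnn_version.h file.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

# Example (header location is an assumption):
#   cudnn_version /usr/include/cudnn_version.h
```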