Deep Learning Environment Setup and Caveats
-
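Only Ubuntu 18.04.1 LTS is recommended. On an LTS release the kernel is updated automatically by default, which can leave the kernel incompatible with the GPU driver: the stock 4.15 kernel works without problems, whereas newer 5.x HWE kernels (such as the one shipped with Ubuntu 18.04.5 LTS) have driver compatibility issues.
UEFI + GPT mode is recommended. DiskGenius can confirm a disk's partition table and boot type.
On Windows, running msinfo32 shows the same information.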
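On an already running Linux system the same check can be done from a terminal. The sketch below relies on the fact that `/sys/firmware/efi` only exists when the system was booted through UEFI. The function name `boot_mode` and its optional argument are illustrative, and the disk name in the last comment is an assumption taken from the partition names used later in these notes.

```shell
#!/usr/bin/env bash
# Report the firmware boot mode of the running Linux system.
# The optional first argument (an alternative sysfs root) is a hypothetical
# hook that makes the function easy to try out against a scratch directory.
boot_mode() {
    local sysfs_root="${1:-/sys}"
    if [ -d "${sysfs_root}/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "Legacy BIOS"
    fi
}

boot_mode

# Partition table type of the disk (look for "Partition Table: gpt").
# The device name /dev/nvme0n1 is an assumption:
#   sudo parted -s /dev/nvme0n1 print | grep 'Partition Table'
```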
Disable automatic kernel updates
dpkg --get-selections |grep linux-image
uname -a
sudo apt-get remove linux-image-5.3.0-42-generic linux-image-extra-5.3.0-42-generic
sudo apt-get purge linux-image-5.3.0-42-generic linux-image-extra-5.3.0-42-generic
sudo apt-mark hold linux-image-5.3.0-42-generic
sudo apt-mark hold linux-image-extra-5.3.0-42-generic
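To decide which kernel packages to remove and hold, it helps to list every installed kernel image except the one currently running. The following is a minimal sketch of that filtering step; the function name `other_kernel_images` is hypothetical, and it only prints package names without removing anything.

```shell
#!/usr/bin/env bash
# Print installed linux-image packages other than the one for the running kernel.
# Reads `dpkg --get-selections` output on stdin; $1 is the running kernel release
# as printed by `uname -r`.
other_kernel_images() {
    local running="$1"
    awk -v run="linux-image-${running}" '
        # Keep versioned image packages that are installed and are not the running one.
        $2 == "install" && $1 ~ /^linux-image-[0-9]/ && $1 != run { print $1 }
    '
}

# Typical use on a Debian/Ubuntu system (skipped when dpkg is unavailable).
if command -v dpkg >/dev/null 2>&1; then
    dpkg --get-selections | other_kernel_images "$(uname -r)"
fi
```

Each name it prints is a candidate for the `apt-get purge` and `apt-mark hold` commands above; review the list by hand before removing anything.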
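System installation
- Use Rufus to write the Ubuntu image to a USB drive, then install from it. It is recommended to install Ubuntu on nvme0n1p4 and put the bootloader on nvme0n1p1. Note that the bootloader (EFI) partition is mounted at /boot/efi in Ubuntu by default.
- When writing the image, select GPT and UEFI (non-CSM). Nothing else needs special attention.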
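System migration
Migration is essentially a matter of copying the system files. Because every file on Ubuntu carries its own permissions, a plain copy is not enough; the copy must preserve the permission structure.
Method 1: boot into an Ubuntu live session, then run: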
sudo dd if=/dev/sda4 of=/dev/nvme0n1p4
sudo umount /dev/nvme0n1p4
sudo e2fsck -f /dev/nvme0n1p4
sudo resize2fs /dev/nvme0n1p4
sudo tune2fs -U random /dev/nvme0n1p4
sudo add-apt-repository ppa:yannubuntu/boot-repair
sudo apt-get update
sudo apt-get install -y boot-repair && boot-repair
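Because `dd` copies `/etc/fstab` verbatim and the filesystem is then given a new UUID, the copied fstab may still refer to the old UUID. The sketch below checks whether the root entry of an fstab file matches a given UUID. The function name `fstab_root_uses_uuid` and the `/mnt` mount point are assumptions for illustration, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) if the given fstab file mounts / by the given UUID.
fstab_root_uses_uuid() {
    local fstab="$1" uuid="$2"
    awk -v want="UUID=${uuid}" '
        # Skip comment lines; match the entry whose mount point is "/".
        $1 !~ /^#/ && $2 == "/" && $1 == want { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$fstab"
}

# Example in the live session (mount point and device are assumptions):
#   sudo mount /dev/nvme0n1p4 /mnt
#   fstab_root_uses_uuid /mnt/etc/fstab \
#       "$(sudo blkid -s UUID -o value /dev/nvme0n1p4)" \
#       || echo "update the root entry in /mnt/etc/fstab"
```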
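Method 2:
- Back up with Timeshift, restore onto the new disk, then repair the bootloader.
- Note that the restore target is nvme0n1p4 and the bootloader repair target is nvme0n1p1.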
Environment setup
CUDA core components
sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove 'nvidia-*'
sudo apt update
sudo add-apt-repository ppa:graphics-drivers
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo bash -c 'echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda_learn.list'
sudo apt update
sudo apt install cuda-10-1
sudo apt install libcudnn7
Edit ~/.profile (no sudo needed, since it is your own file) and append the following lines:
vi ~/.profile
if [ -d "/usr/local/cuda-10.1/bin/" ]; then
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
fi
nvcc --version
"""
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168
"""
nvidia-smi
/sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) 2>/dev/null | grep libcudnn
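The `${PATH:+:${PATH}}` form used in the `~/.profile` snippet above appends the old value, preceded by a colon, only when that value is non-empty. This avoids a dangling `:` in `LD_LIBRARY_PATH`, which the dynamic loader would otherwise read as an empty entry. A small sketch of the idiom follows; the helper name `prepend_path` is hypothetical.

```shell
#!/usr/bin/env bash
# Prepend a directory to a colon-separated search path without leaving a
# trailing ":" when the existing value is empty.
prepend_path() {
    local dir="$1" current="$2"
    echo "${dir}${current:+:${current}}"
}

prepend_path /usr/local/cuda-10.1/lib64 ""        # prints /usr/local/cuda-10.1/lib64
prepend_path /usr/local/cuda-10.1/bin "/usr/bin"  # prints /usr/local/cuda-10.1/bin:/usr/bin
```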
Nvidia-Docker 2.0
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo rm -rf /var/lib/docker
sudo apt-get autoclean
sudo apt-get update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
sudo apt update
apt-cache policy docker-ce
sudo apt install docker-ce
sudo systemctl status docker
sudo usermod -aG docker username
su - username
id -nG
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
sudo docker run --runtime=nvidia --rm nvidia/cuda:10.1-base nvidia-smi
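If the final `docker run --runtime=nvidia` test fails, one thing worth checking is whether Docker's daemon configuration registers a runtime named `nvidia`, which the nvidia-docker2 package normally sets up in `/etc/docker/daemon.json`. The sketch below is a rough text match rather than a real JSON parse, and the function name `has_nvidia_runtime` is an assumption for illustration.

```shell
#!/usr/bin/env bash
# Succeed if a Docker daemon.json file contains a key named "nvidia".
# This is a plain-text match, not a JSON parser, so treat it as a rough check.
has_nvidia_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] && grep -q '"nvidia"[[:space:]]*:' "$config"
}

if has_nvidia_runtime; then
    echo "nvidia runtime registered"
else
    echo "nvidia runtime not found in daemon.json"
fi
```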