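R&D requirements
Everyone in the R&D group shares one server, but each person needs root privileges. The notes below record the key steps for taking an Ubuntu machine from initial setup to providing that service.
Mounting the disk
Reference: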
https://www.cnblogs.com/mumuzifeng/p/13963043.html
Installing Docker
Reference:
https://www.runoob.com/docker/ubuntu-docker-install.html
Installing NVIDIA Docker
Reference:
https://zhuanlan.zhihu.com/p/88351963?from_voters_page=true
Migrating Docker
Reference:
https://blog.csdn.net/u011420410/article/details/99845765
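The linked post covers the migration in detail. As a rough sketch of one common approach (not necessarily the one used in the post), Docker's data directory can be relocated by setting `data-root` in /etc/docker/daemon.json. The helper below only prints the JSON fragment; the path /data/docker and the function name are assumptions for illustration. If daemon.json already exists (for example, with an NVIDIA runtime entry), the key should be merged by hand rather than overwriting the file.

```shell
#!/usr/bin/env bash
# Print a daemon.json fragment that points Docker at a new data directory.
# /data/docker is a hypothetical path; substitute the mounted disk.
docker_data_root_json() {
    local new_root="${1:-/data/docker}"
    printf '{\n  "data-root": "%s"\n}\n' "$new_root"
}

# Typical manual sequence (requires root; shown as comments, not executed):
#   systemctl stop docker
#   rsync -a /var/lib/docker/ /data/docker/
#   docker_data_root_json /data/docker > /etc/docker/daemon.json
#   systemctl start docker
docker_data_root_json /data/docker
```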
Testing YOLOv5
Error:
ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
Fix:
apt-get update
apt-get install libglib2.0-dev
apt-get install libsm6
apt-get install git
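To confirm the fix inside the container, one option is to look the library up in the dynamic linker cache. This is a sketch that assumes a glibc-based image where `ldconfig -p` is available; the helper name `has_lib` is hypothetical.

```shell
# Return 0 if a shared library name appears in the linker cache.
has_lib() {
    case "$(ldconfig -p 2>/dev/null)" in
        *"$1"*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_lib libgthread-2.0.so.0; then
    echo "libgthread-2.0.so.0 found"
else
    echo "libgthread-2.0.so.0 still missing"
fi
```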
Error:
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Reference:
https://felaim.blog.csdn.net/article/details/109318772?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-109318772-blog-101209718-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_paycolumn_v3&depth_1-utm_source=distribute.pc_relevant.none-task-blog-109318772-blog-101209718-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_paycolumn_v3&utm_relevant_index=1
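This error usually means /dev/shm inside the container is too small for the PyTorch DataLoader workers (Docker's default is 64 MB). Below is a minimal sketch for checking the size from inside the container; the function names and the 1 GiB threshold are assumptions, not values taken from the linked post. Raising the limit is typically done with `--shm-size` or `--ipc=host` on `docker run`.

```shell
#!/usr/bin/env bash
# Report the total size (in KiB) of the filesystem backing a path.
fs_size_kb() {
    df -Pk "${1:-/dev/shm}" | awk 'NR==2 {print $2}'
}

# Warn when shared memory is below a hypothetical 1 GiB threshold.
check_shm() {
    local min_kb="${1:-1048576}"
    local size_kb
    size_kb="$(fs_size_kb /dev/shm)"
    # An empty result means df could not read /dev/shm.
    [ -n "$size_kb" ] || return 2
    if [ "$size_kb" -lt "$min_kb" ]; then
        echo "/dev/shm is ${size_kb} KiB; consider --shm-size or --ipc=host"
        return 1
    fi
    echo "/dev/shm is ${size_kb} KiB"
}
```

Calling `check_shm` with no arguments inside the container prints the current size and returns non-zero when it is below the threshold.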
Building the image
FROM pytorch/pytorch:1.11.0-cuda11.3-cudnn8-devel
RUN apt update && apt install -y openssh-server
RUN echo "PermitRootLogin yes" >> /etc/ssh/sshd_config
RUN echo "root:123" | chpasswd
RUN apt-get install -y vim
ADD init.sh /etc/profile.d/init.sh
ENTRYPOINT service ssh restart && bash
where init.sh contains:
export PATH=/opt/conda/bin:$PATH
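Files under /etc/profile.d are sourced by every login shell, so a nested login shell would prepend the same directory again. A slightly more defensive variant of init.sh might look like the sketch below; the helper name `prepend_path` is an assumption and not part of the original setup.

```shell
# Prepend a directory to PATH only if it is not already present.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, do nothing
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path /opt/conda/bin
export PATH
```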
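A few details worth noting:
- Both docker exec and ssh enter the container as root, yet echo $PATH prints different values. The likely cause is that docker exec inherits the environment defined by the image, whereas sshd builds a fresh environment for each login session and the login shell then reads /etc/profile and /etc/profile.d.
- ssh cannot be started through systemctl, so it has to be started in ENTRYPOINT. systemctl needs systemd running as PID 1, which is not the case in a container by default.
- Setting ENV PATH=/opt/conda/bin:$PATH in the Dockerfile appeared to have no effect on PATH (at least over ssh), most likely for the same reason as the first point; hence the init.sh workaround.
Adding an account and running its container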
TMPNAME=name1
TMPPORT1=20001
TMPPORT2=20002
docker run -idt --privileged=true --ipc=host --gpus all --name $TMPNAME -v /home/$TMPNAME:/root -p $TMPPORT1:22 -p $TMPPORT2:$TMPPORT2 pytorch/pytorch:1.11.0-cuda11.3-cudnn8-devel-ssh
Login command:
ssh root@host -p 20001
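When several accounts are added, the variables above can be wrapped in a small helper. The sketch below only prints the `docker run` command (a dry run) so it can be reviewed before execution; the function name and the port scheme (two consecutive ports per user, starting from a base port) are assumptions for illustration.

```shell
#!/usr/bin/env bash
IMAGE="pytorch/pytorch:1.11.0-cuda11.3-cudnn8-devel-ssh"

# Print the docker run command for one user.
#   $1: account/container name
#   $2: host port mapped to the container's sshd (port 22)
#   $3: extra port exposed under the same number inside the container
print_run_cmd() {
    local name="$1" ssh_port="$2" extra_port="$3"
    echo "docker run -idt --privileged=true --ipc=host --gpus all" \
         "--name $name -v /home/$name:/root" \
         "-p $ssh_port:22 -p $extra_port:$extra_port $IMAGE"
}

# Example: hypothetical users get ports 20001/20002, 20003/20004, ...
base=20001
for user in name1 name2; do
    print_run_cmd "$user" "$base" "$((base + 1))"
    base=$((base + 2))
done
```

Piping a reviewed line to `bash` (or pasting it) then starts the container for that user.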