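I use Windows a lot, so I wanted to set up a PyTorch environment on it for learning. I had already installed the GPU driver and CUDA matching my laptop's graphics card, and the latest PyTorch (1.9) could use the GPU. I then tried running YOLOX directly and hit quite a few problems, but fortunately I solved them all. Besides recording my own experience, this post shares the pitfalls I ran into, to save interested readers some trial and error.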
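1. First, get the GPU working on Win10. My installation of the related software followed this blog post:
https://blog.csdn.net/qq_44442727/article/details/119923070 Being able to use the GPU is the lowest layer of the environment that deep learning needs. Once that layer works, you can set up the deep learning framework on top of it, and once both are in place the experiments can be run directly. The verification after installation looked like this: [Screenshot: verification after installation]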
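A quick way to repeat this kind of check from Python is sketched below. It is only an illustration: the helper name gpu_status is made up for this post, and it uses nothing beyond torch.__version__, torch.cuda.is_available() and torch.cuda.get_device_name(). If PyTorch is not installed, it reports that instead of crashing.

```python
import importlib.util


def gpu_status():
    """Return a small report on whether PyTorch can see a CUDA GPU.

    gpu_status is a hypothetical helper name, not part of PyTorch or YOLOX.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "device_name": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # Index 0 is the first visible GPU
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in gpu_status().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False while the driver and CUDA are installed, the usual suspect is a CPU-only PyTorch build.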
2. Installing the YOLOX environment
The installation follows the official repository: https://github.com/Megvii-BaseDetection/YOLOX The official commands are:
git clone git@github.com:Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -U pip && pip3 install -r requirements.txt
pip3 install -v -e .
pip3 install cython; pip3 install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
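My installation process:
I downloaded the source as a zip archive instead. I rarely use this Windows machine for running networks, so some libraries and packages were missing and git did not work. pycocotools was also downloaded as an archive. The extracted folders are located as shown: [Screenshot: location of the extracted folders]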
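If you also work from downloaded archives, the extraction step can be scripted. The sketch below is an illustration under assumptions: extract_archive is a made-up helper, and the archive name and destination in the usage comment are placeholders, not the paths I actually used. It extracts a zip file and lists every folder that contains a setup.py, since those are the folders the build commands must be run from.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the folders holding a setup.py.

    extract_archive is a hypothetical helper written for this post.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # A setup.py marks a folder from which the build commands are run
    return sorted(p.parent for p in dest.rglob("setup.py"))


# Usage (placeholder names, adjust to your own download):
# for folder in extract_archive("YOLOX-main.zip", "D:/code/pytorch"):
#     print(folder)
```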
With the archives extracted, the next step is building the environment. My commands:
pip install -r requirements.txt
pip install -v -e .
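pip3 did not work on my Windows system, but plain pip did. With the archives downloaded, the main work is installing the environment, and these two commands mostly install dependencies. Note one pitfall here (at least it was one for me): the second command, the build step. It still runs normally, and the problem can be corrected later. Next comes building yolox and pycocotools: go to the folder containing each project's setup.py and run: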
python setup.py install
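How the build goes depends on your environment. If something is missing, the build fails with all kinds of errors. I spent a lot of time on this step, but it succeeded in the end. If yours builds on the first try, you are lucky. My notes on the build errors are in this post: https://blog.csdn.net/qq_44442727/article/details/119938605 Later I found that the build can also be done with another command: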
python setup.py develop
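This differs somewhat from python setup.py install, although both build the package.
install is mainly for third-party libraries that you do not need to edit or modify; pycocotools clearly qualifies. develop is for code you need to change, such as configs and parameters. It sets up something like a symbolic link to the source folder, so your changes take effect immediately. Put another way, develop is for the main framework, whose configuration you expect to modify, while install is for a library inside that framework that you only call.
So yolox's setup.py should be built with develop, while pycocotools should work with either.
python setup.py develop also needs to be compared with pip install -v -e . because this is where my problem appeared. I first ran pip install -v -e . and then built with python setup.py install. The build passed, but running the code kept failing, probably because yolox had been installed with python setup.py install. After I switched to python setup.py develop, the result came out directly. pip install -e is itself an editable install, so the later python setup.py install was the more likely culprit. Still, python setup.py develop is the command that worked for me, and I suggest it if pip install -v -e . gives you trouble.
After both packages were built with python setup.py install, I ran the example code. The official demo.py cannot run out of the box: you first need to download a pretrained model. I downloaded yolox_s.pth and ran: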
python tools/demo.py image -n yolox-s -c configs/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu
Before I ran python setup.py develop, this command always failed with the following error:
(pytorch) D:\code\pytorch\YOLOX>python tools/demo.py image -n yolox-s -c configs/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [gpu]
Traceback (most recent call last):
File "D:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg\yolox\exp\build.py", line 13, in get_exp_by_file
current_exp = importlib.import_module(os.path.basename(exp_file).split(".")[0])
File "D:\soft\anaconda\envs\pytorch\lib\importlib\__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'yolox_s'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/demo.py", line 306, in <module>
exp = get_exp(args.exp_file, args.name)
File "D:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg\yolox\exp\build.py", line 53, in get_exp
return get_exp_by_name(exp_name)
File "D:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg\yolox\exp\build.py", line 35, in get_exp_by_name
return get_exp_by_file(exp_path)
File "D:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg\yolox\exp\build.py", line 16, in get_exp_by_file
raise ImportError("{} doesn't contains class named 'Exp'".format(exp_file))
ImportError: D:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg\exps\default\yolox_s.py doesn't contains class named 'Exp'
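The path in the last line of the traceback hints at the cause: the experiment file is looked up under exps\default next to wherever the yolox package was imported from, and the egg in site-packages has no such folder. The sketch below reproduces that path arithmetic. It is my reading of the traceback, not a copy of the yolox source: default_exp_path is a made-up helper, and turning "yolox-s" into "yolox_s.py" with a simple replace is a simplifying assumption.

```python
import ntpath  # applies Windows path rules on any host OS


def default_exp_path(package_init_file, exp_name):
    """Rebuild the experiment path the way the traceback suggests it is built.

    default_exp_path is a hypothetical helper, not a yolox function.
    """
    package_dir = ntpath.dirname(package_init_file)  # ...\yolox
    root_dir = ntpath.dirname(package_dir)           # parent of the package
    file_name = exp_name.replace("-", "_") + ".py"   # assumed name mapping
    return ntpath.join(root_dir, "exps", "default", file_name)


EGG = (r"D:\soft\anaconda\envs\pytorch\lib\site-packages"
       r"\yolox-0.1.0-py3.7-win-amd64.egg")

# Installed copy: the lookup lands inside the egg, as in the traceback
print(default_exp_path(EGG + r"\yolox\__init__.py", "yolox-s"))
# Source tree: the lookup lands in the repository's exps\default folder
print(default_exp_path(r"D:\code\pytorch\YOLOX\yolox\__init__.py", "yolox-s"))
```

Under this reading, the fix is to make Python import yolox from the source folder rather than from the copied egg.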
I then uninstalled the built yolox and rebuilt it with python setup.py develop, after which everything worked:
(pytorch) D:\code\pytorch\YOLOX>pip uninstall yolox
Found existing installation: yolox 0.1.0
Uninstalling yolox-0.1.0:
Would remove:
d:\soft\anaconda\envs\pytorch\lib\site-packages\yolox-0.1.0-py3.7-win-amd64.egg
Proceed (y/n)? y
Successfully uninstalled yolox-0.1.0
(pytorch) D:\code\pytorch\YOLOX>python3 setup.py develop
(pytorch) D:\code\pytorch\YOLOX>python setup.py develop
running develop
running egg_info
writing yolox.egg-info\PKG-INFO
writing dependency_links to yolox.egg-info\dependency_links.txt
writing top-level names to yolox.egg-info\top_level.txt
reading manifest file 'yolox.egg-info\SOURCES.txt'
writing manifest file 'yolox.egg-info\SOURCES.txt'
running build_ext
D:\soft\anaconda\envs\pytorch\lib\site-packages\torch\utils\cpp_extension.py:312: UserWarning:
!! WARNING !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (cl 19.00.24210) may be ABI-incompatible with PyTorch!
Please use a compiler that is ABI-compatible with GCC 5.0 and above.
See https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html.
See https://gist.github.com/goldsborough/d466f43e8ffc948ff92de7486c5216d6
for instructions on how to install GCC 5 or higher.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! WARNING !!
warnings.warn(ABI_INCOMPATIBILITY_WARNING.format(compiler))
building 'yolox._C' extension
Emitting ninja build file D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:\soft\anaconda\envs\pytorch\lib\site-packages\torch\lib /LIBPATH:D:\soft\anaconda\envs\pytorch\libs /LIBPATH:D:\soft\anaconda\envs\pytorch\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\LIB\amd64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.20348.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.20348.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit__C D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\code\pytorch\YOLOX\yolox\layers\csrc\vision.obj D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\code\pytorch\YOLOX\yolox\layers\csrc\cocoeval\cocoeval.obj /OUT:build\lib.win-amd64-3.7\yolox\_C.cp37-win_amd64.pyd /IMPLIB:D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\code\pytorch\YOLOX\yolox\layers\csrc\_C.cp37-win_amd64.lib
vision.obj : warning LNK4197: export 'PyInit__C' specified multiple times; using first specification
Creating library D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\code\pytorch\YOLOX\yolox\layers\csrc\_C.cp37-win_amd64.lib and object D:\code\pytorch\YOLOX\build\temp.win-amd64-3.7\Release\code\pytorch\YOLOX\yolox\layers\csrc\_C.cp37-win_amd64.exp
Generating code
Finished generating code
copying build\lib.win-amd64-3.7\yolox\_C.cp37-win_amd64.pyd -> yolox
Creating d:\soft\anaconda\envs\pytorch\lib\site-packages\yolox.egg-link (link to .)
Adding yolox 0.1.0 to easy-install.pth file
Installed d:\code\pytorch\yolox
Processing dependencies for yolox==0.1.0
Finished processing dependencies for yolox==0.1.0
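The log above ends by creating yolox.egg-link in site-packages, which is how a develop build points back to the source folder. The sketch below lists such links so you can confirm where a package really lives. read_egg_links is a made-up helper name, and it relies only on the first line of an .egg-link file holding the project path.

```python
from pathlib import Path


def read_egg_links(site_packages):
    """Map each *.egg-link file in site_packages to the folder it points to.

    read_egg_links is a hypothetical helper written for this post.
    """
    links = {}
    for link_file in sorted(Path(site_packages).glob("*.egg-link")):
        lines = link_file.read_text(encoding="utf-8").splitlines()
        if lines:
            # The first line of an .egg-link file is the project path
            links[link_file.stem] = lines[0].strip()
    return links


# Usage with the site-packages folder from the log above:
# print(read_egg_links(r"d:\soft\anaconda\envs\pytorch\lib\site-packages"))
```

A package that appears here is imported from its source folder; one that was installed with python setup.py install has no entry.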
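After python setup.py develop, yolox built successfully too, but the result differs from the earlier build, as pip list shows. With develop, the yolox entry is mapped to a location (the source folder), so changes are picked up immediately. With install there is no such mapping: the package is a snapshot of the environment at build time, and anything that changes afterwards leads to errors. The successful run looks like this: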
(pytorch) D:\code\pytorch\YOLOX>python tools/demo.py image -n yolox-s -c configs/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device gpu
2021-08-26 21:18:51.244 | INFO | __main__:main:252 - Args: Namespace(camid=0, ckpt='configs/yolox_s.pth', conf=0.25, demo='image', device='gpu', exp_file=None, experiment_name='yolox_s', fp16=False, fuse=False, legacy=False, name='yolox-s', nms=0.45, path='assets/dog.jpg', save_result=True, trt=False, tsize=640)
D:\soft\anaconda\envs\pytorch\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2021-08-26 21:18:52.268 | INFO | __main__:main:262 - Model Summary: Params: 8.97M, Gflops: 26.81
2021-08-26 21:19:06.085 | INFO | __main__:main:273 - loading checkpoint
2021-08-26 21:19:07.861 | INFO | __main__:main:277 - loaded checkpoint done.
2021-08-26 21:19:24.753 | INFO | __main__:inference:162 - Infer time: 16.8251s
2021-08-26 21:19:24.849 | INFO | __main__:image_demo:199 - Saving detection result in ./YOLOX_outputs\yolox_s\vis_res\2021_08_26_21_19_07\dog.jpg
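This is only the test demo. I have not tried training yet: my GPU is modest and so is my memory, so I probably will not train on this machine. If I get the chance, I will try training on a small sample to see how it works. Training also needs a few additional libraries. Overall, I still recommend Ubuntu as the more convenient choice: the environment setup is less likely to go wrong, and problems are easier to fix when they do occur.
About apex: I have not tested whether it works.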