Custom Installation Guide for PyTorch 1.8 and 1.9 Using conda

1. Installing conda

1.1 Copy the Anaconda installer

The Anaconda installer is located in the shared directory:

/public/software/apps/DeepLearning/whl/Anaconda/Anaconda3-2021.05-Linux-x86_64.sh

Run the following command (replace username with your own user name):

cp -rf  /public/software/apps/DeepLearning/whl/Anaconda/Anaconda3-2021.05-Linux-x86_64.sh /public/home/username/
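The copy step can also be written so that no user name has to be edited by hand. The following is a minimal sketch, assuming your home directory ($HOME) is the /public/home/username directory used above; copy_installer is a hypothetical helper name, not part of Anaconda.

```shell
# copy_installer (hypothetical helper): copy an installer file into a
# destination directory, which defaults to the home directory.
copy_installer() {
    src="$1"
    dest_dir="${2:-$HOME}"
    # Fail early with a clear message if the shared file is not readable.
    if [ ! -f "$src" ]; then
        echo "installer not found: $src" >&2
        return 1
    fi
    cp "$src" "$dest_dir"/
}
```

For example: copy_installer /public/software/apps/DeepLearning/whl/Anaconda/Anaconda3-2021.05-Linux-x86_64.sh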

1.2 Create the installation directory and run the installer

mkdir -p ~/anaconda3/
# Use $HOME here: a quoted "~" is not expanded by the shell.
bash ~/Anaconda3-2021.05-Linux-x86_64.sh -b -f -p "$HOME/anaconda3"
rm -f ~/Anaconda3-2021.05-Linux-x86_64.sh

1.3 Initialize the conda environment

~/anaconda3/bin/conda init
source ~/.bashrc
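Before continuing, it may help to confirm that the installation is usable. The sketch below assumes the default prefix ~/anaconda3 from step 1.2; check_conda is a hypothetical helper name.

```shell
# check_conda (hypothetical helper): confirm that a conda executable exists
# under the given prefix and print its version.
check_conda() {
    prefix="${1:-$HOME/anaconda3}"
    if [ ! -x "$prefix/bin/conda" ]; then
        echo "conda not found under $prefix" >&2
        return 1
    fi
    "$prefix/bin/conda" --version
}
```

Calling check_conda with no argument checks ~/anaconda3. A non-zero exit status suggests the installer in step 1.2 did not finish or used a different prefix.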

2. Installing PyTorch 1.9 (PyTorch 1.9 is used as the example)

2.1 Create and activate a Python 3.6 environment

conda create -n pytorch-1.9 python=3.6
conda activate pytorch-1.9
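The remaining steps install packages into whichever environment is active, so a quick check can guard against installing into base by mistake. This sketch relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; in_env is a hypothetical helper name.

```shell
# in_env (hypothetical helper): succeed only if the named conda environment
# is the one currently active in this shell.
in_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}
```

For example: in_env pytorch-1.9 || echo "activate pytorch-1.9 first"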

2.2 Install PyTorch 1.9 (compatible with ROCm 4.0.1 and later)

The wheel packages for PyTorch 1.8 and PyTorch 1.9 are located in the shared directory:

/public/software/apps/DeepLearning/whl/rocm-4.0.1/

Install the PyTorch 1.9 wheel built for ROCm 4.0.1 (using the Tsinghua PyPI mirror):

pip install /public/software/apps/DeepLearning/whl/rocm-4.0.1/torch-1.9.0+rocm4.0.1-cp36-cp36m-linux_x86_64.whl -i https://pypi.tuna.tsinghua.edu.cn/simple/
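After the wheel is installed, a short import check can confirm which build is in use. This is a sketch, not an official verification procedure: check_torch and the PYTHON_BIN override are hypothetical names, and the HIP version is read with getattr in case the attribute is absent in a given build.

```shell
# check_torch (hypothetical helper): import torch with the interpreter named
# by PYTHON_BIN (default: the python of the active environment) and print the
# torch version, the HIP version if the build reports one, and GPU visibility.
check_torch() {
    "${PYTHON_BIN:-python}" -c '
import torch
print("torch", torch.__version__)
print("hip", getattr(torch.version, "hip", None))
print("gpu available", torch.cuda.is_available())
'
}
```

On a login node without a GPU the last line will likely report False even when the installation is fine, so the check is more informative when run inside a job on a compute node.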

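Copy the torchvision package from the shared directory into the site-packages directory of your conda environment (replace username in the destination path with your own user name):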

cp -r /public/software/apps/DeepLearning/whl/rocm-4.0.1/torchvision-0.10-pytorch1.9-rocm-4.0.1-py36/torchvision/ /public/home/username/anaconda3/envs/pytorch-1.9/lib/python3.6/site-packages/
cp -r /public/software/apps/DeepLearning/whl/rocm-4.0.1/torchvision-0.10-pytorch1.9-rocm-4.0.1-py36/torchvision-0.10.0a0+cde7ff0.dist-info/ /public/home/username/anaconda3/envs/pytorch-1.9/lib/python3.6/site-packages/
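The two copy commands can be wrapped so that the destination is passed in rather than typed with a user name. A minimal sketch follows; install_torchvision is a hypothetical helper name, and it assumes the source directory contains torchvision/ and a torchvision-*.dist-info/ directory as shown above.

```shell
# install_torchvision (hypothetical helper): copy the torchvision package and
# its dist-info directory from src_dir into a site-packages directory.
install_torchvision() {
    src_dir="$1"        # holds torchvision/ and torchvision-*.dist-info/
    site_packages="$2"  # destination site-packages directory
    if [ ! -d "$src_dir/torchvision" ] || [ ! -d "$site_packages" ]; then
        echo "missing source or destination directory" >&2
        return 1
    fi
    cp -r "$src_dir/torchvision" "$site_packages"/ &&
    cp -r "$src_dir"/torchvision-*.dist-info "$site_packages"/
}
```

With the pytorch-1.9 environment active, the destination can be obtained from Python itself, for example: install_torchvision /public/software/apps/DeepLearning/whl/rocm-4.0.1/torchvision-0.10-pytorch1.9-rocm-4.0.1-py36 "$(python -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"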

Install the dependencies (the Tsinghua mirror can be used):

pip install numpy pillow -i https://pypi.tuna.tsinghua.edu.cn/simple/

3. Set the MIOpen environment variables in your Slurm script

export MIOPEN_DEBUG_DISABLE_FIND_DB=1
export MIOPEN_DEBUG_CONV_WINOGRAD=0
export MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=0
export HSA_USERPTR_FOR_PAGED_MEM=0
export GLOO_SOCKET_IFNAME=ib0,ib1,ib2,ib3
export MIOPEN_SYSTEM_DB_PATH=/temp/pytorch-miopen-2.8
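To show where these exports sit in a job, the sketch below writes a small batch script. The file name miopen_job.sh, the job name, the node count, and train.py are placeholders chosen for illustration; partition and accelerator options differ between clusters and are left out.

```shell
# Write an example Slurm job script to miopen_job.sh (placeholder name).
cat > miopen_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=pytorch-test
#SBATCH --nodes=1
#SBATCH --output=%j.out

# Make conda usable in the non-interactive batch shell, then activate the environment.
source ~/anaconda3/etc/profile.d/conda.sh
conda activate pytorch-1.9

# MIOpen and communication settings from this section.
export MIOPEN_DEBUG_DISABLE_FIND_DB=1
export MIOPEN_DEBUG_CONV_WINOGRAD=0
export MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=0
export HSA_USERPTR_FOR_PAGED_MEM=0
export GLOO_SOCKET_IFNAME=ib0,ib1,ib2,ib3
export MIOPEN_SYSTEM_DB_PATH=/temp/pytorch-miopen-2.8

# train.py stands in for your own training script.
python train.py
EOF
```

The generated script would then be submitted with sbatch miopen_job.sh, after adding whatever partition and resource options your site requires.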

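4. Add the library path to ~/.bashrc

Open ~/.bashrc in an editor:

vi ~/.bashrc

Add the following line (replace username with your own user name):

export LD_LIBRARY_PATH=/public/home/username/anaconda3/lib:$LD_LIBRARY_PATH

Then reload the file:

source ~/.bashrc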
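If you prefer not to edit the file by hand, the line can be appended from the command line. The sketch below adds a line only when it is not already present, so running it twice does not duplicate the entry; add_line_once is a hypothetical helper name.

```shell
# add_line_once (hypothetical helper): append a line to a file unless an
# identical line is already there.
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example: add_line_once 'export LD_LIBRARY_PATH=$HOME/anaconda3/lib:$LD_LIBRARY_PATH' ~/.bashrc (the single quotes keep the variables unexpanded until ~/.bashrc is read).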

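5. If MIOpen errors occur

In your home directory, run ls -a to find the hidden directories .cache and .config, then go into each of them and delete the MIOpen files.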
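It is usually safer to list the candidates before deleting anything. The sketch below prints entries in .cache and .config whose names contain "miopen" in any letter case; list_miopen_cache is a hypothetical helper name, and the name pattern is an assumption about how these files are named on your system.

```shell
# list_miopen_cache (hypothetical helper): print MIOpen-related entries found
# directly inside .cache and .config under the given base directory.
list_miopen_cache() {
    base="${1:-$HOME}"
    for dir in "$base/.cache" "$base/.config"; do
        if [ -d "$dir" ]; then
            find "$dir" -maxdepth 1 -iname '*miopen*'
        fi
    done
}
```

Run list_miopen_cache, review the printed paths, and then remove the ones you want to clear with rm -rf.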
