Deep Learning with ROCm on a "3A" (All-AMD) Platform
Part 0. Platform Overview
    0. Software platform
        Operating system: Ubuntu 20.04
        Linux kernel: 5.8.4-050804-generic
        ROCm (used in place of NVCC/CUDA): rocm-dev 3.7.0 (rocm-dev/Ubuntu 16.04 3.7.0-20 amd64)
        Docker: docker-ce/focal,now 5:19.03.12~3-0~ubuntu-focal amd64
    1. Hardware platform
        CPU: Ryzen 7 3700X
        GPU: Radeon VII
        Motherboard: ROG STRIX B450-I
        Memory: 2 x 16 GB, 3200 MHz
    2. About ROCm
        ROCm is AMD's open compute platform for its GPUs. It provides an API close to CUDA's (HIP) along with tools for converting CUDA code. It currently supports PyTorch and TensorFlow.
Part 1. Platform Setup
    0. Installing ROCm
        (1) Update to the latest kernel
        Go to https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.8.4/ and download the following kernel packages:
linux-headers-5.8.4-050804-generic_5.8.4-050804.202008260637_amd64.deb
linux-headers-5.8.4-050804_5.8.4-050804.202008260637_all.deb
linux-image-unsigned-5.8.4-050804-generic_5.8.4-050804.202008260637_amd64.deb
linux-modules-5.8.4-050804-generic_5.8.4-050804.202008260637_amd64.deb
        Change to the directory containing the downloaded files, run sudo apt install ./linux-*-5.8.4*.deb, then reboot into the new kernel.
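        After rebooting, it may be worth confirming that the running kernel is the one just installed. The following is a minimal sketch, not part of the original procedure: the helper name kernel_matches is hypothetical, and the expected release string is the one listed under the software platform above.

```shell
# Hypothetical helper: succeeds when the kernel release given as the second
# argument (default: the output of `uname -r`) equals the expected release.
kernel_matches() {
    expected="$1"
    actual="${2:-$(uname -r)}"
    [ "$actual" = "$expected" ]
}

if kernel_matches "5.8.4-050804-generic"; then
    echo "kernel OK: $(uname -r)"
else
    echo "unexpected kernel: $(uname -r) (has the machine been rebooted?)"
fi
```

        If the second message appears, the old kernel is probably still running or a different entry was chosen in the boot menu.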
        (2) Install ROCm
        Add the package repository:
wget -q -O - http://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
        Refresh the package index and install the packages:
sudo apt update
sudo apt install rocm-dev
echo 'SUBSYSTEM=="kfd", KERNEL=="kfd", TAG+="uaccess", GROUP="video"' | sudo tee /etc/udev/rules.d/70-kfd.rules
        Configure the environment and permissions:
sudo usermod -a -G video $LOGNAME
sudo usermod -a -G render $LOGNAME
echo 'ADD_EXTRA_GROUPS=1' | sudo tee -a /etc/adduser.conf
echo 'EXTRA_GROUPS="video render"' | sudo tee -a /etc/adduser.conf
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin' | sudo tee -a /etc/profile.d/rocm.sh
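        Group changes only take effect after logging out and back in (or rebooting). Once that is done, the installation can be sanity-checked roughly as follows. This is a sketch under assumptions: the helper in_group is hypothetical, and rocminfo is assumed to have been installed under /opt/rocm/bin along with rocm-dev, so the script runs it only if it is present.

```shell
# Hypothetical helper: succeeds when the space-separated group list in $2
# contains the group named in $1.
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check that the current user belongs to the groups added above.
my_groups="$(id -nG)"
for g in video render; do
    if in_group "$g" "$my_groups"; then
        echo "group $g: ok"
    else
        echo "group $g: missing (log out and back in, or reboot)"
    fi
done

# The device nodes later passed to Docker should exist once the driver is loaded.
for dev in /dev/kfd /dev/dri; do
    if [ -e "$dev" ]; then
        echo "$dev: present"
    else
        echo "$dev: missing"
    fi
done

# Run rocminfo only if it is installed at the assumed location.
if [ -x /opt/rocm/bin/rocminfo ]; then
    /opt/rocm/bin/rocminfo
else
    echo "rocminfo not found under /opt/rocm/bin"
fi
```

        On a working setup, rocminfo should list the Radeon VII as one of the agents.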
        (3) Install Docker
        To speed up installation, this tutorial uses the Alibaba Cloud (Aliyun) mirror:
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt-get install docker-ce
        Pull the PyTorch and TensorFlow images:
sudo docker pull rocm/pytorch:latest
sudo docker pull rocm/tensorflow:latest
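Part 2. Running PyTorch and TensorFlow
        Start an interactive container from either image. The commands below mount $HOME at /data inside the container and pass the GPU devices /dev/kfd and /dev/dri through: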
sudo docker run -it -v $HOME:/data --privileged --rm --device=/dev/kfd --device=/dev/dri --group-add video rocm/pytorch:latest
sudo docker run -it -v $HOME:/data --privileged --rm --device=/dev/kfd --device=/dev/dri --group-add video rocm/tensorflow:latest
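        Once inside a container, a quick check that the framework can see the GPU might look like the sketch below. The function names check_pytorch and check_tensorflow are hypothetical, and the sketch assumes python3 is on the PATH inside both images. ROCm builds of PyTorch expose the AMD GPU through the torch.cuda namespace, so torch.cuda.is_available() is the relevant call even on a Radeon card.

```shell
# Hypothetical helper, to be run inside the rocm/pytorch container:
# prints whether PyTorch can see a GPU.
check_pytorch() {
    python3 -c 'import torch; print("GPU available:", torch.cuda.is_available())'
}

# Hypothetical helper, to be run inside the rocm/tensorflow container:
# prints the list of GPU devices TensorFlow has detected.
check_tensorflow() {
    python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
}
```

        Paste the matching function into the container's shell and call it, for example check_pytorch in the PyTorch container. An empty device list or a False result usually points to missing device passthrough or group permissions.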