Machine learning environment build: WSL2+Ubuntu+CUDA+cuDNN

I recently bought a new laptop, and the first thing I did with it was set up a machine learning environment. I ran into a few problems along the way, so I have written up the steps and fixes for anyone who might need them.

Installing WSL2 (Windows Subsystem for Linux 2)

The main reasons for updating from WSL 1 to WSL 2 are:

Improved file system performance.
Full system call compatibility.

WSL 2 uses virtualization technology to run a Linux kernel inside a lightweight utility virtual machine (VM). However, WSL 2 is not a traditional VM experience.

I chose WSL2 here. Enabling WSL2 support is relatively simple and well covered by tutorials on the web, so I will only list the steps:

Settings → Privacy and security → Developer mode → On
Turn Windows features on or off → enable Windows Subsystem for Linux and Virtual Machine Platform

Alternatively, enable both features from the command line: open PowerShell as administrator (Start menu > PowerShell > right-click > Run as administrator), then enter the following commands.

dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart

Reboot your computer
Go to the Microsoft Store app, search for “Linux”, choose a Linux distribution you like and install it (I installed Ubuntu 20.04)

When you first open the installed Ubuntu 20.04, you will most likely get this error: WslRegisterDistribution failed with error: 0x800701bc

The cause is that the Linux kernel was not updated when WSL was upgraded from WSL1 to WSL2. Solution: download and install the latest "WSL2 Linux Kernel Update Package for x64 Computers".
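
Once Ubuntu starts, it can be handy to confirm from inside the distribution that it really runs under WSL2. The sketch below is only a heuristic of my own: it assumes that stock WSL2 kernels carry a "WSL2" tag in their release string and that WSL1 reports a release ending in "Microsoft". The function name is made up for this example, and a custom kernel could defeat the check.

```python
import platform


def wsl_version_from_release(release: str) -> int:
    """Guess the WSL version from a kernel release string.

    Returns 2 for WSL2, 1 for WSL1, and 0 when the string does not look like WSL.
    This is a heuristic based on typical release strings, not an official API.
    """
    lowered = release.lower()
    if "microsoft" not in lowered:
        return 0  # not a WSL kernel, e.g. native Linux
    if "wsl2" in lowered:
        return 2  # e.g. "5.10.16.3-microsoft-standard-WSL2"
    return 1      # e.g. "4.4.0-19041-Microsoft"


if __name__ == "__main__":
    release = platform.uname().release
    print(release, "-> WSL version guess:", wsl_version_from_release(release))
```

From the Windows side, wsl -l -v lists each distribution together with the WSL version it uses.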

Configuring the Ubuntu environment

Configuring Ubuntu's environment is mainly about modifying the apt software sources.

Modify the software sources

sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
sudo nano /etc/apt/sources.list
sudo apt update
sudo apt upgrade
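
If you would rather not edit sources.list by hand in nano, the rewrite can be sketched as a small text transformation. This is an illustration under assumptions: the mirror URL below is a placeholder, the function name is invented, and I assume the mirror serves the security pocket under the same path as the main archive.

```python
import re

# Hypothetical mirror URL, used only for illustration; substitute one you trust.
EXAMPLE_MIRROR = "https://mirror.example.com/ubuntu/"

# Matches active "deb" / "deb-src" entries that point at the default Ubuntu hosts.
_ENTRY = re.compile(
    r"^(deb(?:-src)?\s+)https?://[^\s/]*(?:archive|security)\.ubuntu\.com/ubuntu/?",
    re.MULTILINE,
)


def rewrite_sources(text: str, mirror: str = EXAMPLE_MIRROR) -> str:
    """Point every active apt entry at `mirror`; commented lines stay untouched."""
    return _ENTRY.sub(lambda m: m.group(1) + mirror, text)


if __name__ == "__main__":
    sample = (
        "# deb-src http://archive.ubuntu.com/ubuntu/ focal main\n"
        "deb http://archive.ubuntu.com/ubuntu/ focal main restricted\n"
    )
    print(rewrite_sources(sample))
```

The function only returns the new text. Writing it back to /etc/apt/sources.list still needs root, and the backup made with sudo cp above is what you restore if the mirror turns out to be wrong.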

Install NVIDIA Windows driver

Go to the official NVIDIA website and download the Windows driver for your GPU.

Install Anaconda and complete the basic configuration

In Ubuntu, run the following to install Anaconda.

# Get the latest download link from https://www.anaconda.com/products/distribution#linux
wget https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh
bash ./Anaconda3-2021.11-Linux-x86_64.sh

After installation, run source ~/.bashrc so the shell picks up the changes the installer made, then configure the package sources for pip and conda.
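
For the pip side, the per-user configuration on Linux lives in ~/.config/pip/pip.conf, and the default index is the index-url key of its [global] section. Here is a minimal sketch that renders such a file. The index URL is a placeholder and the function name is invented for this example.

```python
import configparser
import io

# Hypothetical index URL, for illustration only.
EXAMPLE_INDEX = "https://pypi.example.com/simple"


def render_pip_conf(index_url: str = EXAMPLE_INDEX) -> str:
    """Return pip.conf text that sets the default package index."""
    config = configparser.ConfigParser(interpolation=None)
    config["global"] = {"index-url": index_url}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the text instead of writing it, so nothing on disk changes by accident.
    print(render_pip_conf())
```

Running pip config set global.index-url with the URL writes the same setting without any script.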

Install CUDA Toolkit

I did not check the CUDA version at first and installed 11.2, then found that the PyTorch build I wanted targets CUDA 11.3, so I installed 11.3 instead.

First find the corresponding version on NVIDIA's official website: https://developer.nvidia.com/cuda-toolkit-archive

Two options are provided on the official website, one for Ubuntu and one for WSL-Ubuntu, but the latter does not allow you to select the Ubuntu version.

Example installation commands for both are as follows.

# ubuntu
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

# wsl
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

Apart from the repository path, the only difference between the two is the pin file. I downloaded both pin files and found that their contents are identical. Since developer.download.nvidia.com was very slow for me, I switched to the developer.download.nvidia.cn mirror and changed the commands to:

wget https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update

Do not run sudo apt-get -y install cuda at this point, because it installs the latest CUDA version by default. List the installable versions with apt list -a cuda and pick 11.3:

sudo apt-get install cuda-11-3 -y
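
The versioned package name follows a simple pattern: the dot in the CUDA version becomes a hyphen after the cuda- prefix. A tiny sketch of that mapping, with a function name invented for this example, looks like this:

```python
import re


def cuda_package(version: str) -> str:
    """Map a CUDA version such as '11.3' to the versioned apt package name."""
    match = re.fullmatch(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"expected MAJOR.MINOR, got {version!r}")
    major, minor = match.groups()
    return f"cuda-{major}-{minor}"


if __name__ == "__main__":
    # Build the install command as an argument list; it is printed, not executed.
    command = ["sudo", "apt-get", "install", "-y", cuda_package("11.3")]
    print(" ".join(command))
```

Always confirm that the resulting name actually appears in the apt list output before installing, since not every version is published in every repository.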

Verify that CUDA is successfully installed.

cd /usr/local/cuda-11.3/samples/4_Finance/BlackScholes
sudo make
./BlackScholes

Or check that the GPU and driver are visible from inside WSL with the following command.

nvidia-smi
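
Keep in mind that nvidia-smi reports the driver and the highest CUDA version that driver supports, not the toolkit version you installed. To check the toolkit itself, nvcc --version prints a line containing "release 11.3" for this install. The sketch below parses that line. The helper names are my own, and the default nvcc path assumes the toolkit sits under /usr/local/cuda-11.3 as in the sample build above.

```python
import os
import re
import subprocess
from typing import Optional


def cuda_release(nvcc_output: str) -> Optional[str]:
    """Extract the MAJOR.MINOR release from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release(nvcc: str = "/usr/local/cuda-11.3/bin/nvcc") -> Optional[str]:
    """Run nvcc if it exists at the assumed path and return its release, else None."""
    if not os.path.exists(nvcc):
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True, check=False)
    return cuda_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", installed_cuda_release())
```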

Install cuDNN

Find the matching installation files at https://developer.nvidia.com/rdp/cudnn-archive. You need to register and log in to download them.

The process is a little tedious but not difficult: download the files in Windows, then move them into the Ubuntu system. In WSL2:

The Linux file system is mapped to \\wsl$\Ubuntu-20.04\
Windows disks are mounted under /mnt and can be accessed directly
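
The /mnt mapping is mechanical: the drive letter becomes a lowercase directory under /mnt and the backslashes become forward slashes. Here is a small sketch of that translation. The function name and the user name in the example path are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Translate a Windows drive path to its /mnt/<drive> location inside WSL."""
    win = PureWindowsPath(path)
    drive = win.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {path!r}")
    # parts[0] is the anchor ("C:\\"); the rest are directory and file names.
    return str(PurePosixPath("/mnt", drive[0].lower(), *win.parts[1:]))


if __name__ == "__main__":
    # "me" is a placeholder user name.
    print(windows_to_wsl(r"C:\Users\me\Downloads\libcudnn8_8.2.1.32-1+cuda11.3_amd64.deb"))
```

Inside WSL, the wslpath utility performs the same conversion, so the script is mostly useful for seeing how the mapping works.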

Once the files are in place, install them with the following commands. The runtime package goes first because the -dev package depends on it.

sudo dpkg -i libcudnn8_8.2.1.32-1+cuda11.3_amd64.deb
sudo dpkg -i libcudnn8-dev_8.2.1.32-1+cuda11.3_amd64.deb

During installation, the following error is reported.

/sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link

Solution: write the following to the /etc/wsl.conf file, then restart WSL (run wsl --shutdown from Windows) so the change takes effect.

[automount]
ldconfig = false
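
If /etc/wsl.conf already holds other settings, the new key should be merged in rather than overwriting the file. Below is a minimal sketch using Python's configparser. The function name is invented, and note one limitation: configparser drops comments when it rewrites the text.

```python
import configparser
import io


def with_ldconfig_disabled(existing: str) -> str:
    """Return wsl.conf text with [automount] ldconfig = false, keeping other settings."""
    config = configparser.ConfigParser(interpolation=None)
    config.read_string(existing)
    if not config.has_section("automount"):
        config.add_section("automount")
    config.set("automount", "ldconfig", "false")
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Start from an empty file; the result is printed, not written to /etc/wsl.conf.
    print(with_ldconfig_disabled(""))
```

Writing the result back to /etc/wsl.conf needs root, so the simplest route is still to paste the two lines with sudo nano.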

Install and configure TensorFlow and PyTorch in Jupyter

Start jupyter lab with the following command: jupyter lab --no-browser

Install PyTorch first, using the command given on the official site.

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

Test that the installation succeeded:

import torch
from torch.backends import cudnn
# Check whether CUDA is available
print(torch.cuda.is_available())  # True means PyTorch can use CUDA
# Check whether cuDNN is available
print(cudnn.is_available())  # True means PyTorch can use cuDNN
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())

Install TensorFlow: pip install tensorflow

Test TensorFlow:

import tensorflow as tf
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))

The GPU is detected, but the following messages are reported.

2.8.0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2022-04-04 16:18:44.091834: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-04-04 16:18:44.119700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-04-04 16:18:44.120152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:922] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

Solution: open the NVIDIA Control Panel and change the graphics processor setting from auto-select to the NVIDIA GPU.

After that, run the test again; the messages no longer appear.
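
It also helps to know how serious such lines are. In the output above, each TensorFlow log line carries a one-letter severity after the timestamp, and the NUMA lines are marked I, which I read as informational rather than an error. The sketch below pulls that letter out of a line. The regular expression is my own reading of the format shown above, not an official specification.

```python
import re
from typing import Optional

# Timestamp, colon, then a one-letter severity, as seen in the log lines above.
_LOG_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) ")

_SEVERITY = {"I": "info", "W": "warning", "E": "error", "F": "fatal"}


def severity_of(line: str) -> Optional[str]:
    """Return the severity name of a TensorFlow log line, or None if it has no prefix."""
    match = _LOG_PREFIX.match(line)
    return _SEVERITY[match.group(1)] if match else None


if __name__ == "__main__":
    sample = (
        "2022-04-04 16:18:44.091834: I tensorflow/stream_executor/cuda/"
        "cuda_gpu_executor.cc:922] could not open file to read NUMA node: "
        "/sys/bus/pci/devices/0000:01:00.0/numa_node"
    )
    print(severity_of(sample))
```

If informational lines are just noise for you, setting the TF_CPP_MIN_LOG_LEVEL environment variable to 1 before importing TensorFlow filters them out.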
