I recently bought a new laptop, and the first thing I had to do was set up a machine learning environment. I made a few mistakes along the way, so I have written the steps down for anyone who might need them.
Installing Windows Subsystem for Linux 2 (WSL 2)
The main reasons for updating from WSL 1 to WSL 2 include:
- Improved file system performance.
- Full system call compatibility.
WSL 2 runs a real Linux kernel inside a lightweight utility virtual machine (VM). However, WSL 2 is not a traditional VM experience.
I chose WSL 2 here. Enabling WSL 2 is relatively simple, and there are already many tutorials on the web, so I will only list the steps:
- Settings → Privacy and security → Developer mode → On
- Turn Windows features on or off → enable Windows Subsystem for Linux and Virtual Machine Platform
- Open PowerShell as administrator (Start menu > PowerShell > right-click > Run as administrator). Then enter the following command.
- Reboot your computer
- Go to the Microsoft Store app, search for “Linux”, choose a Linux distribution you like and install it (I installed Ubuntu 20.04)
When you first open the installed Ubuntu 20.04, you will most likely get this error: WslRegisterDistribution failed with error: 0x800701bc
The cause is that the Linux kernel was not updated when WSL was upgraded from WSL 1 to WSL 2. Solution: download and install the latest package, WSL2 Linux Kernel Update Package for x64 Computers.
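Once the distribution starts, it can be worth confirming that it really runs under WSL 2. The sketch below is a heuristic, not an official check: it assumes WSL 2 kernels carry "microsoft-standard" or "WSL2" in their release string and that WSL 1 reports a release containing "Microsoft" without either. The function name wsl_version is made up for this example.

```python
import platform


def wsl_version(kernel_release: str) -> int:
    """Guess the WSL version from a kernel release string (0 means not WSL)."""
    release = kernel_release.lower()
    if "microsoft" not in release:
        return 0
    # Assumption: WSL 2 releases look like "5.10.16.3-microsoft-standard-WSL2",
    # while WSL 1 releases look like "4.4.0-19041-Microsoft".
    if "microsoft-standard" in release or "wsl2" in release:
        return 2
    return 1


if __name__ == "__main__":
    print("Detected WSL version:", wsl_version(platform.release()))
```

From the Windows side, wsl -l -v lists each installed distribution together with the WSL version it uses.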
Configuring the Ubuntu environment
Configuring Ubuntu’s environment is mainly about modifying the software sources.
Modify the software sources
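As a rough illustration of what modifying the sources means, the sketch below rewrites the default Ubuntu hosts in the text of a sources.list file so they point at another mirror. The host mirror.example.com is a placeholder and the function name switch_mirror is hypothetical; substitute whichever mirror is fastest for you.

```python
import re

# Matches the default Ubuntu archive hosts, with an optional country prefix
# such as "us." in front of them.
_DEFAULT_HOSTS = re.compile(
    r"(https?://)(?:[a-z]{2}\.)?(?:archive|security)\.ubuntu\.com/"
)


def switch_mirror(sources_text: str, mirror_host: str) -> str:
    """Return sources.list text with the default hosts replaced by mirror_host."""
    # Keep the original scheme (group 1) and swap only the host part.
    return _DEFAULT_HOSTS.sub(lambda m: m.group(1) + mirror_host + "/", sources_text)


if __name__ == "__main__":
    sample = "deb http://archive.ubuntu.com/ubuntu/ focal main restricted\n"
    print(switch_mirror(sample, "mirror.example.com"), end="")
```

In practice you would back up /etc/apt/sources.list first, write the rewritten text back with root privileges, and then run sudo apt update.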
Install NVIDIA Windows driver
Go to the official NVIDIA website and download the driver for your product.
Install Anaconda and complete the basic configuration
Go to Ubuntu and do the following to install Anaconda.
After installation, run source ~/.bashrc, then configure the pip source and the Anaconda conda source.
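For the pip part, a minimal sketch of what the configuration could look like is shown below. It only renders the text of a pip configuration file with a custom index URL; the URL used here is the default PyPI index as a stand-in, and render_pip_conf is a made-up helper name. The conda side is not covered.

```python
import configparser
import io


def render_pip_conf(index_url: str) -> str:
    """Build the text of a pip config file that sets a custom package index."""
    # Interpolation is disabled so a "%" in the URL is treated literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Replace the URL with the mirror you actually want to use.
    print(render_pip_conf("https://pypi.org/simple"), end="")
```

On Linux, pip reads a per-user file at ~/.config/pip/pip.conf, so the rendered text can be saved there. Running pip config set global.index-url followed by the URL achieves the same result without editing the file by hand.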
Install CUDA Toolkit
I did not check the CUDA version at first and installed 11.2, but then found that the PyTorch packages available at the time were built for CUDA 11.3 rather than 11.2, so I installed 11.3 instead.
First find the corresponding version on NVIDIA’s official website: https://developer.nvidia.com/cuda-toolkit-archive
Two options are provided on the official website, one for Ubuntu and one for WSL-Ubuntu, but the latter does not allow you to select the Ubuntu version.
The website shows example installation commands for each option.
The only difference between the two sets of commands is the pin file. I downloaded the two pin files separately and found that their contents were identical. Since developer.download.nvidia.com is very slow to download from, I modified the command to work around it.
Do not run sudo apt-get -y install cuda, because this command installs the latest version of CUDA by default. Check the installable versions with apt list -a cuda and select 11.3.
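Picking the right entry out of the apt list output can be sketched as follows. The sample text in the example is invented to show the general shape of the listing ("package/suite version architecture"); the versions your repository offers may differ, and matching_versions is a hypothetical helper.

```python
from typing import List


def matching_versions(apt_list_output: str, package: str, wanted: str) -> List[str]:
    """Return the versions of `package` whose number starts with `wanted`."""
    versions = []
    for line in apt_list_output.splitlines():
        fields = line.split()
        # Data lines look like "cuda/unknown 11.3.1-1 amd64"; header lines
        # such as "Listing... Done" do not start with the package name.
        if len(fields) >= 2 and fields[0].split("/")[0] == package:
            if fields[1].startswith(wanted + "."):
                versions.append(fields[1])
    return versions


if __name__ == "__main__":
    # Made-up sample of `apt list -a cuda` output, for illustration only.
    sample = (
        "Listing... Done\n"
        "cuda/unknown 11.6.0-1 amd64\n"
        "cuda/unknown 11.3.1-1 amd64\n"
        "cuda/unknown 11.3.0-1 amd64\n"
    )
    print(matching_versions(sample, "cuda", "11.3"))
```

A version string found this way can then be passed to apt-get in the usual package=version form.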
Then verify that CUDA was installed successfully.
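One way to run this check from Python is sketched below, assuming nvidia-smi and nvcc are the tools you want to query. It only calls a tool when it is found on PATH, so it is safe to run on a machine where CUDA is missing. The helper name run_if_present is made up for this example.

```python
import shutil
import subprocess
from typing import Optional


def run_if_present(tool: str, *args: str) -> Optional[str]:
    """Run `tool` with `args` and return its stdout, or None if it is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run([path, *args], capture_output=True, text=True, check=False)
    return result.stdout.strip()


if __name__ == "__main__":
    for tool, args in (("nvidia-smi", ()), ("nvcc", ("--version",))):
        output = run_if_present(tool, *args)
        print(f"--- {tool} ---")
        print(output if output is not None else "not found on PATH")
```

If nvcc is reported as missing even though the toolkit is installed, the CUDA bin directory (commonly /usr/local/cuda/bin) is probably not on PATH yet.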
Install cuDNN
Find the corresponding installation file at https://developer.nvidia.com/rdp/cudnn-archive. You need to register and log in to download it.
The process is somewhat tedious but not difficult: download the file in Windows, then move it to the Ubuntu system. In WSL 2:
- The Linux file system is mapped to \\wsl$\Ubuntu-20.04\
- Windows disks are mounted under /mnt and can be accessed directly
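The /mnt mapping described above can be sketched as a small path conversion. The file name in the example is hypothetical, and windows_to_wsl is a made-up helper; inside WSL, the built-in wslpath utility does this conversion for you.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Convert a drive-letter Windows path to its location under /mnt in WSL."""
    win = PureWindowsPath(path)
    drive = win.drive.rstrip(":")
    if len(drive) != 1 or not drive.isalpha():
        raise ValueError(f"expected a path with a drive letter, got {path!r}")
    # parts[0] is the drive anchor (for example "C:\\"); the rest are folders.
    return str(PurePosixPath("/mnt", drive.lower(), *win.parts[1:]))


if __name__ == "__main__":
    # Hypothetical download location of the cuDNN archive.
    print(windows_to_wsl(r"C:\Users\me\Downloads\cudnn.tgz"))
```

With the converted path, the downloaded file can be copied into the Ubuntu home directory with an ordinary cp.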
Once the file is in place, it can be installed with the following commands.
When the last command is executed, the following error is reported.
/sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link
Solution: write the following to the /etc/wsl.conf file.
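The exact wsl.conf content is an assumption on my part: a workaround commonly reported for this ldconfig message is to disable WSL's automatic ldconfig handling with ldconfig = false under [automount]. The sketch below only renders that text; render_wsl_conf is a made-up helper name, and you should check the setting against your own setup before writing it to /etc/wsl.conf.

```python
import configparser
import io


def render_wsl_conf() -> str:
    """Render wsl.conf text for the assumed ldconfig workaround."""
    parser = configparser.ConfigParser(interpolation=None)
    # Assumption: "ldconfig = false" under [automount] is the commonly
    # reported fix for the libcuda.so.1 "is not a symbolic link" message.
    parser["automount"] = {"ldconfig": "false"}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_wsl_conf(), end="")
```

Changes to /etc/wsl.conf only take effect after WSL restarts, for example by running wsl --shutdown from Windows and reopening Ubuntu.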
Install and configure TensorFlow and PyTorch in Jupyter
Start jupyter lab with the following command: jupyter lab --no-browser
Install PyTorch first, using the install command from the official website.
Then test that the installation succeeded.
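A minimal check that can be pasted into a Jupyter cell is sketched below. It guards the import so it also runs where PyTorch is not installed; the function name torch_report and the report keys are my own choices.

```python
import importlib.util


def torch_report() -> dict:
    """Summarize whether PyTorch is installed and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    report = {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the package was built against (None for CPU-only builds).
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    print(torch_report())
```

If cuda_available is False while built_for_cuda shows a version, the package is a CUDA build but cannot reach the GPU, which usually points at the driver or the WSL setup rather than at PyTorch.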
Install TensorFlow: pip install tensorflow
Test TensorFlow.
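A comparable check for TensorFlow is sketched below. It lists the GPUs TensorFlow can see and returns None where the package is not installed; tensorflow_gpus is a made-up helper name.

```python
import importlib.util
from typing import List, Optional


def tensorflow_gpus() -> Optional[List[str]]:
    """Return the names of GPUs visible to TensorFlow, or None if it is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # An empty list means TensorFlow imported fine but found no usable GPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print(tensorflow_gpus())
```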
In my case the test reported an error.
Solution: open the NVIDIA Control Panel and change the setting from auto-select to use the GPU.
After that, run the test again and the warning no longer appears.