Background
Deep learning environment configuration is often a cumbersome task, especially on servers shared by multiple users. Although conda integrates tools like virtualenv to isolate dependency environments, this approach still provides no way to allocate compute resources uniformly. With container technology we can instead create a container for each user and allocate compute resources to that container. There are many container-based deep learning platform products on the market, such as AiMax, which integrates a lot of features, but if you just need to call the GPU inside a container, you can refer to the following steps.
Calling the GPU using the Docker Client
Dependency installation
The docker run --gpus command relies on the NVIDIA Linux driver and the NVIDIA Container Toolkit; if you want to see the full installation documentation, click here.
Installing the NVIDIA driver on a Linux server is very simple: if you have a GUI installed you can install it directly from Ubuntu's "Additional Drivers" application, or you can download the driver from the NVIDIA website.
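On a headless server, a common alternative is to install the driver from Ubuntu's own repositories. A minimal sketch, assuming an Ubuntu system where the ubuntu-drivers tool is available:

```bash
# List the drivers Ubuntu recommends for the detected GPU
$ ubuntu-drivers devices

# Install the recommended driver (a reboot is usually needed afterwards)
$ sudo ubuntu-drivers autoinstall

# Verify that the driver is loaded
$ nvidia-smi
```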
The next step is to install the NVIDIA Container Toolkit. The server needs to meet a few prerequisites (you can verify them with the commands shown right after this list):
- GNU/Linux x86_64 with kernel version > 3.10
- Docker >= 19.03 (note: not Docker Desktop; if you want to use the toolkit on a desktop machine, install Docker Engine instead of Docker Desktop, because the Desktop edition runs on top of a virtual machine)
- NVIDIA GPU with architecture >= Kepler (RTX 20-series cards use the Turing architecture, RTX 30-series cards use the Ampere architecture)
- NVIDIA Linux driver >= 418.81.07
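A quick way to check these prerequisites on the server might look like the following (a sketch; the output format differs slightly between distributions and driver versions):

```bash
# Kernel version (must be newer than 3.10)
$ uname -r

# Docker Engine version (must be 19.03 or newer)
$ docker version --format '{{.Server.Version}}'

# NVIDIA driver version and GPU model (driver must be 418.81.07 or newer)
$ nvidia-smi --query-gpu=driver_version,name --format=csv
```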
Then you can proceed with the official installation of the NVIDIA Container Toolkit on Ubuntu or Debian. If you want to install it on CentOS or another Linux distribution, please refer to the official installation documentation.
Install Docker
```bash
$ curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
```
After the installation completes, please also go through the official post-installation steps (see here). If you run into problems during installation, refer to the official installation documentation.
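One of the most common post-installation steps, for reference, is allowing your user to run docker without sudo; a small sketch, assuming you are comfortable adding the current user to the docker group:

```bash
# Allow the current user to run docker without sudo (log out and back in for the group change to apply)
$ sudo usermod -aG docker $USER

# Verify that Docker works for this user
$ docker run --rm hello-world
```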
Set up Package Repository and GPG Key
```bash
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
  && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
     sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
     sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
Please note: If you want to install NVIDIA Container Toolkit versions prior to 1.6.0, you should use the nvidia-docker repository instead of the libnvidia-container repositories above.
If you encounter problems please refer directly to the Installation Manual.
Installing nvidia-docker2 should automatically pull in libnvidia-container-tools, libnvidia-container1, and the other required dependencies; if it does not, you can install them manually.
Install nvidia-docker2
After completing the previous steps, update the package list and install nvidia-docker2:
```bash
$ sudo apt update \
  && sudo apt install -y nvidia-docker2
```
Restart the Docker daemon:

```bash
$ sudo systemctl restart docker
```
Next you can test if the installation is correct by running a CUDA container.
```bash
$ docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
```
The output displayed in the shell should look similar to the following.
```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
--gpus Usage
Note that if you install nvidia-docker2, it already registers the NVIDIA runtime with Docker at installation time. If you install nvidia-docker instead, please follow the official documentation to register the runtime with Docker.
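For reference, manual registration usually amounts to adding a runtimes entry to /etc/docker/daemon.json and restarting Docker; a sketch of what that file typically looks like (verify it against the official documentation for your toolkit version):

```bash
$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
$ sudo systemctl restart docker
```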
If you have any questions, please refer to the documentation referenced in this section.
GPUs can be assigned on the Docker CLI either with the --gpus option when starting a container, or with the NVIDIA_VISIBLE_DEVICES environment variable. This variable controls which GPUs are accessible within the container.
| Possible values | Description |
| --- | --- |
| 0,1,2 or GPU-fef8089b | Comma-separated GPU index(es) or GPU UUID(s) |
| all | All GPUs are accessible by the container; this is the default value |
| none | No GPU is accessible, but the functions provided by the driver can still be used |
| void or empty or unset | nvidia-container-runtime behaves the same as runc (i.e. neither GPUs nor capabilities are exposed) |
When specifying GPUs with the --gpus option, the device parameter should be used. Its value is wrapped in single quotes, with the device list additionally wrapped in double quotes; for example, to expose GPUs 2 and 3 to the container: --gpus '"device=2,3"'. When using the NVIDIA_VISIBLE_DEVICES variable, you may also need to pass --runtime nvidia, unless nvidia is already configured as the default runtime.
- Set up a container with CUDA support enabled:

```bash
$ docker run --rm --gpus all nvidia/cuda nvidia-smi
```
- Specify nvidia as the runtime and set the variable NVIDIA_VISIBLE_DEVICES:

```bash
$ docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda nvidia-smi
```
- Allocate 2 GPUs to the launched container:

```bash
$ docker run --rm --gpus 2 nvidia/cuda nvidia-smi
```
- Specify the GPUs with indexes 1 and 2 for the container:

```bash
$ docker run --gpus '"device=1,2"' \
    nvidia/cuda nvidia-smi --query-gpu=uuid --format=csv
```

```
uuid
GPU-ad2367dd-a40e-6b86-6fc3-c44a2cc92c7e
GPU-16a23983-e73e-0945-2095-cdeb50696982
```
- You can also use NVIDIA_VISIBLE_DEVICES:

```bash
$ docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=1,2 \
    nvidia/cuda nvidia-smi --query-gpu=uuid --format=csv
```

```
uuid
GPU-ad2367dd-a40e-6b86-6fc3-c44a2cc92c7e
GPU-16a23983-e73e-0945-2095-cdeb50696982
```
- Use nvidia-smi to query the GPU UUID and then assign it to the container:

```bash
$ nvidia-smi -i 3 --query-gpu=uuid --format=csv
```

```
uuid
GPU-18a3e86f-4c0e-cd9f-59c3-55488c4b0c24
```

```bash
$ docker run --gpus device=GPU-18a3e86f-4c0e-cd9f-59c3-55488c4b0c24 \
    nvidia/cuda nvidia-smi
```
For settings that control how the driver's capabilities are used within the container, and for other options, see here.
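As an example of such a setting, the NVIDIA_DRIVER_CAPABILITIES environment variable controls which driver libraries (compute, utility, video, graphics, ...) are mounted into the container; a hedged sketch that exposes only the CUDA compute libraries and utilities such as nvidia-smi:

```bash
# Expose all GPUs, but only the compute and utility driver capabilities
$ docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
    nvidia/cuda nvidia-smi
```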
Use the Docker Go SDK to assign GPUs to containers
NVIDIA/go-nvml provides Go language bindings for the NVIDIA Management Library (NVML) API. It is currently supported only on Linux; see the repository for details.
The following demo code obtains various pieces of information about the GPU. For other functions, please refer to the official documentation of NVML and go-nvml.
```go
package main

import (
    "fmt"
    "log"

    "github.com/NVIDIA/go-nvml/pkg/nvml"
)

func main() {
    ret := nvml.Init()
    if ret != nvml.SUCCESS {
        log.Fatalf("Unable to initialize NVML: %v", nvml.ErrorString(ret))
    }
    defer func() {
        ret := nvml.Shutdown()
        if ret != nvml.SUCCESS {
            log.Fatalf("Unable to shutdown NVML: %v", nvml.ErrorString(ret))
        }
    }()

    count, ret := nvml.DeviceGetCount()
    if ret != nvml.SUCCESS {
        log.Fatalf("Unable to get device count: %v", nvml.ErrorString(ret))
    }

    for i := 0; i < count; i++ {
        device, ret := nvml.DeviceGetHandleByIndex(i)
        if ret != nvml.SUCCESS {
            log.Fatalf("Unable to get device at index %d: %v", i, nvml.ErrorString(ret))
        }

        // Get the UUID of the device
        uuid, ret := device.GetUUID()
        if ret != nvml.SUCCESS {
            log.Fatalf("Unable to get uuid of device at index %d: %v", i, nvml.ErrorString(ret))
        }
        fmt.Printf("GPU UUID: %v\n", uuid)

        name, ret := device.GetName()
        if ret != nvml.SUCCESS {
            log.Fatalf("Unable to get name of device at index %d: %v", i, nvml.ErrorString(ret))
        }
        fmt.Printf("GPU Name: %+v\n", name)

        memoryInfo, _ := device.GetMemoryInfo()
        fmt.Printf("Memory Info: %+v\n", memoryInfo)

        powerUsage, _ := device.GetPowerUsage()
        fmt.Printf("Power Usage: %+v\n", powerUsage)

        powerState, _ := device.GetPowerState()
        fmt.Printf("Power State: %+v\n", powerState)

        managementDefaultLimit, _ := device.GetPowerManagementDefaultLimit()
        fmt.Printf("Power Management Default Limit: %+v\n", managementDefaultLimit)

        version, _ := device.GetInforomImageVersion()
        fmt.Printf("Inforom Image Version: %+v\n", version)

        driverVersion, _ := nvml.SystemGetDriverVersion()
        fmt.Printf("Driver Version: %+v\n", driverVersion)

        cudaDriverVersion, _ := nvml.SystemGetCudaDriverVersion()
        fmt.Printf("CUDA Driver Version: %+v\n", cudaDriverVersion)

        // List the compute processes currently running on this device
        computeRunningProcesses, _ := device.GetComputeRunningProcesses()
        for _, proc := range computeRunningProcesses {
            fmt.Printf("Proc: %+v\n", proc)
        }
    }
    fmt.Println()
}
```
Using the Docker Go SDK to assign GPUs to containers
The first thing you need to use is the ContainerCreate API.
```go
// ContainerCreate creates a new container based on the given configuration.
// It can be associated with a name, but it's not mandatory.
func (cli *Client) ContainerCreate(
    ctx context.Context,
    config *container.Config,
    hostConfig *container.HostConfig,
    networkingConfig *network.NetworkingConfig,
    platform *specs.Platform,
    containerName string) (container.ContainerCreateCreatedBody, error)
```
The API requires a number of structs to specify the configuration. One of them is the Resources field of the container.HostConfig struct, which has type container.Resources; inside it there is a slice of container.DeviceRequest structs, which is what the GPU device driver consumes.
```go
container.HostConfig{
    Resources: container.Resources{
        DeviceRequests: []container.DeviceRequest{
            {
                Driver:       "nvidia",
                Count:        0,
                DeviceIDs:    []string{"0"},
                Capabilities: [][]string{{"gpu"}},
                Options:      nil,
            },
        },
    },
}
```
The following is the definition of the container.DeviceRequest
structure.
```go
// DeviceRequest represents a request for devices from a device driver.
// Used by GPU device drivers.
type DeviceRequest struct {
    Driver       string            // Name of the device driver ("nvidia" here)
    Count        int               // Number of devices requested (-1 = All)
    DeviceIDs    []string          // List of device IDs as recognizable by the device driver, either an index or a UUID
    Capabilities [][]string        // An OR list of AND lists of device capabilities (e.g. "gpu")
    Options      map[string]string // Options to pass onto the device driver
}
```
Note: if you specify the Count field, you cannot specify GPUs by DeviceIDs; the two are mutually exclusive.
Next we try to start a pytorch container using the Docker Go SDK.
First we write a test.py
file and let it run inside the container to check if CUDA is available.
```python
# test.py
import torch

print("cuda.is_available:", torch.cuda.is_available())
```
Here is the experimental code that starts a container named torch_test_1 and runs the command python3 /workspace/test.py, then gets the output from stdout and stderr.
```go
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/api/types/container"
    "github.com/docker/docker/client"
    "github.com/docker/docker/pkg/stdcopy"
)

var (
    defaultHost = "unix:///var/run/docker.sock"
)

func main() {
    ctx := context.Background()
    cli, err := client.NewClientWithOpts(client.WithHost(defaultHost), client.WithAPIVersionNegotiation())
    if err != nil {
        panic(err)
    }

    // Create the container and attach GPU 0 to it via a DeviceRequest
    resp, err := cli.ContainerCreate(ctx,
        &container.Config{
            Image:     "pytorch/pytorch",
            Cmd:       []string{},
            OpenStdin: true,
            Volumes:   map[string]struct{}{},
            Tty:       true,
        }, &container.HostConfig{
            Binds: []string{`/home/joseph/workspace:/workspace`},
            Resources: container.Resources{DeviceRequests: []container.DeviceRequest{{
                Driver:       "nvidia",
                Count:        0,
                DeviceIDs:    []string{"0"}, // Either the GPU index or the GPU UUID can be entered here
                Capabilities: [][]string{{"gpu"}},
                Options:      nil,
            }}},
        }, nil, nil, "torch_test_1")
    if err != nil {
        panic(err)
    }

    if err := cli.ContainerStart(ctx, resp.ID, types.ContainerStartOptions{}); err != nil {
        panic(err)
    }
    fmt.Println(resp.ID)

    // Run test.py inside the container and capture its stdout/stderr
    execConf := types.ExecConfig{
        User:         "",
        Privileged:   false,
        Tty:          false,
        AttachStdin:  false,
        AttachStderr: true,
        AttachStdout: true,
        Detach:       true,
        DetachKeys:   "ctrl-p,q",
        Env:          nil,
        WorkingDir:   "/",
        Cmd:          []string{"python3", "/workspace/test.py"},
    }
    execCreate, err := cli.ContainerExecCreate(ctx, resp.ID, execConf)
    if err != nil {
        panic(err)
    }

    response, err := cli.ContainerExecAttach(ctx, execCreate.ID, types.ExecStartCheck{})
    if err != nil {
        panic(err)
    }
    defer response.Close()

    // Read the output; stdcopy demultiplexes the combined stdout/stderr stream
    _, _ = stdcopy.StdCopy(os.Stdout, os.Stderr, response.Reader)
}
```
As you can see, the program outputs the Container ID of the created container and the output of the executed command.
```bash
$ go build main.go
$ sudo ./main
264535c7086391eab1d74ea48094f149ecda6d25709ac0c6c55c7693c349967b
cuda.is_available: True
```
Next, use docker ps
to check the container status.
```bash
$ docker ps
CONTAINER ID   IMAGE             COMMAND   CREATED         STATUS         PORTS     NAMES
264535c70863   pytorch/pytorch   "bash"    2 minutes ago   Up 2 minutes             torch_test_1
```
Extended Reading: NVIDIA Multi-Instance GPUs
The Multi-Instance GPU (MIG) feature allows GPUs based on the NVIDIA Ampere architecture, such as the NVIDIA A100, to be securely partitioned into up to seven separate GPU instances for CUDA applications, providing separate GPU resources for multiple users to achieve optimal GPU utilization. This feature is particularly useful for workloads that do not fully saturate the compute capacity of the GPU, so users may want to run different workloads in parallel to maximize utilization.
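As a rough sketch of how MIG ties into the container workflow above (the exact commands, profile names such as 3g.20gb, and device identifier formats depend on the driver version and GPU model, so please consult the MIG user guide before running them):

```bash
# Enable MIG mode on GPU 0 (the GPU must be idle; a reset or reboot may be required)
$ sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles the card supports
$ sudo nvidia-smi mig -lgip

# Create a GPU instance from a profile (3g.20gb here) together with its compute instance
$ sudo nvidia-smi mig -cgi 3g.20gb -C

# List the resulting MIG devices and their UUIDs
$ nvidia-smi -L

# Pass one MIG device to a container just like a full GPU, using the UUID reported above
$ docker run --rm --gpus '"device=<MIG device UUID>"' nvidia/cuda nvidia-smi
```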