cgroups (control groups) is a feature provided by the Linux kernel that limits, accounts for, and isolates the system resources (such as CPU, memory, disk I/O, network, etc.) used by a group of processes.
In the previous article we looked at the role that namespaces play in container technology. If namespaces control what the processes in a container can see, cgroups control how many resources those processes can use. Namespaces provide process isolation and cgroups provide resource limiting; together they form the basis for building containers.
In this article we pick up where the Namespace article left off: we create a container, observe how the cgroups on the host change to see how cgroups work, and then learn how to configure cgroups ourselves.
When a cgroup is created
The Linux kernel provides an interface for managing cgroups through a pseudo-file system called cgroupfs. We can list the existing cgroups on the system with the lscgroup command, which actually traverses the files in the /sys/fs/cgroup/ directory.
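To make the traversal concrete, here is a minimal Python sketch of the same idea. It is not how lscgroup itself is implemented; the function name list_cgroups is made up for this illustration, and the sketch simply treats every directory below the root as a cgroup.

```python
import os


def list_cgroups(root="/sys/fs/cgroup"):
    """Return every directory below root as a sorted list of relative paths."""
    found = []
    for dirpath, dirnames, _files in os.walk(root):
        # Do not list or descend into symlinked directories
        # (cgroup v1 mounts often contain alias links).
        dirnames[:] = [
            d for d in dirnames
            if not os.path.islink(os.path.join(dirpath, d))
        ]
        for name in dirnames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


if __name__ == "__main__":
    for path in list_cgroups():
        print(path)
```

Run on a host, this prints one line per cgroup directory; the exact layout depends on whether the host mounts cgroup v1 or v2.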
If you are using a Linux distribution that does not have the lscgroup command, you can download and install it using the command provided by command-not-found.com.
We save the output to a cgroup.a file. Next, start a container in another window, following the steps in the Namespace article.
Go back to the original window and execute the lscgroup command again.
Now compare the two outputs of the lscgroup command.
As the results show, once the mybox container is created, a new cgroup of every type is created on the system specifically for it.
How cgroups control the resources of a container
A cgroup controls how much memory, CPU, network bandwidth, and so on a process or group of processes can use. A cgroup’s tasks list contains the PIDs of the processes it controls; tasks is actually a file in cgroupfs.
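Because tasks is an ordinary text file with one ID per line, reading it takes very little code. The sketch below assumes the cgroup v1 file name used in this article, and the function name read_tasks is made up for the example. Strictly speaking, in cgroup v1 each line of tasks is a thread ID, which equals the PID for a single-threaded process.

```python
import os


def read_tasks(cgroup_dir):
    """Return the IDs listed in a cgroup's `tasks` file as integers."""
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        return [int(line) for line in f if line.strip()]
```

Calling read_tasks with the directory of one of the container's cgroups returns the IDs of the processes that cgroup controls.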
init process
First, on the host, we print information about the processes in the container and find the container’s init process.
Then we print the tasks lists of a few arbitrarily chosen types of cgroups.
The process is straightforward: after the container is created, the container’s init process is added to the cgroups created for that container. We can get a more definite result from /proc/$PID/cgroup.
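Each line of /proc/$PID/cgroup has the form hierarchy-ID:controller-list:cgroup-path, so it is easy to turn into a lookup table. The following is a small sketch; the function name parse_proc_cgroup and the sample text are invented for illustration and are not the output from the host used in this article.

```python
def parse_proc_cgroup(text):
    """Map each controller list to the cgroup path the process belongs to."""
    membership = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split into at most three fields, since the path may contain colons.
        _hierarchy_id, controllers, path = line.split(":", 2)
        membership[controllers] = path
    return membership


# Illustrative sample, not real output.
sample = "4:memory:/mybox\n2:cpu,cpuacct:/mybox\n"
print(parse_proc_cgroup(sample))
```

For a live process, the text would come from reading /proc/<pid>/cgroup. Comparing the dictionaries of two processes shows whether they belong to the same cgroups.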
Other processes in the container
Next we run a new process in the mybox container.
Let's see whether a new cgroup is created for it.
Since a cgroup can control a group of processes, we would expect any new process created in the running container to be added to the cgroups that the init process belongs to.
To verify this, first find the PID of the newly created process.
The PID of the new process is 2576. Next, we print the cgroup information for that process.
The output is identical to that of the PID 2250 process. We can also print the tasks list of one of the cgroups.
Exactly as expected. In fact, writing a process's PID directly to the tasks file is how the process is added to that cgroup. When a container is created, a new cgroup is created for each type of resource, and all processes running in the container are added to these cgroups.
By controlling all processes running in the container, cgroups implements resource limits for the container.
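The tasks-file mechanism mentioned above can be sketched as follows. The function name add_to_cgroup is made up for this example. On a real cgroupfs the write usually requires root, and the kernel treats it as a request to move the process rather than as an overwrite of the file's contents.

```python
import os


def add_to_cgroup(cgroup_dir, pid):
    """Request that `pid` join the cgroup by writing it to the `tasks` file."""
    with open(os.path.join(cgroup_dir, "tasks"), "w") as f:
        f.write(str(pid))
```

This is the Python equivalent of redirecting a PID into the tasks file from a shell.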
How to configure cgroups
Here we take the memory cgroup as an example to see how to configure a cgroup to limit the memory of the mybox container.
There are two ways to configure a cgroup: by directly modifying specific files in cgroupfs, or by using a higher-level tool such as runc or docker.
File system method
By means of cgroupfs, you can view or set the limits of a cgroup by viewing or modifying specific files in that cgroup’s directory.
The maximum available memory can be set by modifying the memory.limit_in_bytes file. We have not yet set any limit for this container, so the current value of the memory limit is a meaninglessly large number. We now write the new value directly to this file.
This sets a new memory limit. Once the new limit is written, the processes in the container cannot use more than 100M of memory in total. If they exceed it, processes in the container are killed or put to sleep according to the OOM policy set in the memory.oom_control file.
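Writing the limit can also be done from a program. The sketch below is one possible way to do it, assuming the cgroup v1 file name from this article and reading 100M as 100 MiB; the helper names and the LIMIT_100M constant are invented for the example. The directory passed in would be the container's memory cgroup directory, which is not hard-coded here.

```python
import os

LIMIT_100M = 100 * 1024 * 1024  # 104857600 bytes, assuming M means MiB


def set_memory_limit(cgroup_dir, limit_bytes):
    """Write a new limit, in bytes, to the cgroup's memory.limit_in_bytes."""
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as f:
        f.write(str(limit_bytes))


def get_memory_limit(cgroup_dir):
    """Read the current limit, in bytes, from memory.limit_in_bytes."""
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path) as f:
        return int(f.read().strip())
```

Reading the value back with get_memory_limit is a quick way to confirm that the kernel accepted the new limit.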
High-level tools approach
Configuring cgroups through the interfaces provided by higher-level tools is friendlier, although behind the scenes these tools also modify cgroupfs as described above.
For runc, the config.json file in the filesystem bundle needs to be modified to configure the cgroup. Setting the memory limit requires modifying the linux.resources field in the JSON object as follows.
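As a rough illustration of that change, the following Python sketch sets the memory limit inside a config dictionary. It follows the field layout of the OCI runtime specification (linux.resources.memory.limit, in bytes) and is not necessarily identical to the snippet originally shown here; the function name with_memory_limit and the sample config are made up for the example.

```python
import json


def with_memory_limit(config, limit_bytes):
    """Return a copy of an OCI config dict with the memory limit set."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round trip
    resources = updated.setdefault("linux", {}).setdefault("resources", {})
    resources.setdefault("memory", {})["limit"] = limit_bytes
    return updated


# Illustrative, heavily trimmed config; a real config.json has many more fields.
config = {"linux": {"namespaces": [{"type": "pid"}]}}
print(json.dumps(with_memory_limit(config, 100 * 1024 * 1024), indent=2))
```

In practice the dictionary would be loaded from the bundle's config.json with json.load and written back with json.dump before starting the container.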
For docker it is even simpler: it is a user-facing wrapper, and the memory limit can be specified with the --memory option when executing the docker run command. This parameter is written to config.json and used by the runtime implementation runc, which in turn modifies cgroupfs.
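A value such as 100m passed to --memory ends up as a plain byte count further down the stack. The sketch below shows that arithmetic in simplified form. It handles only the single-letter b, k, m, and g suffixes with binary multiples and is not Docker's actual parser; the function name memory_to_bytes is invented for the example.

```python
UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def memory_to_bytes(value):
    """Convert a string such as '100m' into a number of bytes."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # no suffix: the value is already in bytes


print(memory_to_bytes("100m"))
```

The result for 100m is the same byte count that would be written to memory.limit_in_bytes by hand.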