The following is a standard explanation of containerization.
A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.
Containers differ from virtual machines, which embed an entire guest operating system and connect it to the host through a hypervisor. Containers instead share the host kernel and achieve more lightweight environment isolation and resource limitation directly through kernel features.
Namespace
In the Linux operating system, namespaces are a mechanism for isolating processes. If a process belongs to a namespace, what it can see is limited to that namespace. Put another way, a process in a namespace cannot see, and so cannot directly affect, processes outside it. The most commonly used namespace types are as follows.
- Mount (mount): isolates the set of mount points, so file systems, devices, etc. can be mounted without affecting other namespaces.
- Process ID (PID): a process has a PID that is unique only within its namespace.
- Network (network): each network namespace has its own network devices, which can be configured with separate network addresses, and its own ports and routing tables.
- User (user): a user namespace has its own user and group IDs. A process running as an unprivileged user on the host may have the root identity inside the user namespace.
- UTS: specifies the host name, domain name, etc.
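To make the namespace types above concrete, here is a minimal sketch that lists the namespaces the current process belongs to. It assumes a Linux system where /proc is mounted; the function name list_namespaces is only illustrative.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: identifier} read from /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        # Not Linux, /proc not mounted, or no such process: nothing to report.
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target looks like "pid:[4026531836]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may be unreadable without sufficient privileges.
            continue
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:10s} {ident}")
```

Two processes that show the same identifier for a given type share that namespace; a different identifier means they are isolated from each other for that resource.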
Create a PID namespace.
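One possible way to do this, sketched in Python, is to call the unshare tool from util-linux. This is an assumption rather than necessarily the command used in this walkthrough; it also assumes the script runs as root, and the helper names are hypothetical.

```python
import shutil
import subprocess


def pid_namespace_command(shell="bash"):
    """Build the argument list for starting `shell` in a new PID namespace."""
    return [
        "unshare",
        "--fork",        # fork so the shell becomes PID 1 in the new namespace
        "--pid",         # create a new PID namespace
        "--mount-proc",  # mount a fresh /proc so ps reads the new namespace
        shell,
    ]


def run_in_pid_namespace(shell="bash"):
    """Run the shell interactively; this normally requires root privileges."""
    if shutil.which("unshare") is None:
        raise RuntimeError("unshare (util-linux) not found on PATH")
    return subprocess.call(pid_namespace_command(shell))
```

Calling run_in_pid_namespace() as root should open a shell in the new namespace; nothing is executed until that function is called.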
Running ps aux in a bash process inside this namespace lists only the processes in this PID namespace, which shows that PID isolation is achieved within a separate PID namespace.
From here we can see that the bash process has PID 1 inside the namespace. At this point, we return to the host terminal to view the same process.
This means that the process inside the container is also an ordinary process on the host, only with a different PID there. This kind of isolation is why namespaces are one of the foundations of containerization technology.
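The two PIDs of one process can also be read from the host side. The sketch below parses the NSpid line of /proc/<pid>/status (available since Linux 4.1), which lists the PID of a process in each nested PID namespace, outermost first. The sample text and its PID values are hypothetical.

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line of a /proc/<pid>/status dump.

    The first value is the PID as seen from the outer namespace,
    the last one is the PID inside the innermost namespace.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def read_nspid(pid):
    """Read and parse NSpid for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_nspid(f.read())


# Hypothetical excerpt for a bash process in a child PID namespace.
sample = "Name:\tbash\nPid:\t25817\nNSpid:\t25817\t1\n"
print(parse_nspid(sample))  # [25817, 1]
```

Here the same bash process is 25817 on the host and 1 inside its namespace, matching the behaviour described above.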
Cgroup
Where a namespace isolates one or more processes, a Cgroup (control group) measures, limits, and monitors the resources used by a group of processes, such as memory, CPU, and I/O. There are many types of Cgroups, such as memory Cgroups and CPU Cgroups, all of which are exposed under the /sys/fs/cgroup/ directory.
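To see which Cgroups a process currently belongs to, the kernel exposes /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:path. The following is a minimal parsing sketch; the sample text, including the m_group path, is hypothetical.

```python
def parse_cgroup_file(text):
    """Parse the contents of /proc/<pid>/cgroup into a list of dicts."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split into at most three fields; the path itself may contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy_id),
            # An empty controller list marks the cgroup v2 unified hierarchy.
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries


# Hypothetical contents of /proc/<pid>/cgroup.
sample = "4:memory:/m_group\n3:cpu,cpuacct:/\n0::/user.slice\n"
for entry in parse_cgroup_file(sample):
    print(entry)
```

On a real system the same function could be fed the text of /proc/self/cgroup.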
Create a Cgroup.
View the memory resource limits. Each file in the Cgroup's directory defines a different type of resource limit.
Check the current maximum memory limit and change it from 9223372036854771712 bytes (effectively unlimited) to 8 KB.
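The large default value is not arbitrary: it is the largest signed 64-bit integer rounded down to a whole number of memory pages. The sketch below checks that arithmetic and shows how a limit could be written. It assumes a 4096-byte page size and the cgroup v1 file name memory.limit_in_bytes; set_memory_limit is a hypothetical helper, and the directory passed to it would be something like /sys/fs/cgroup/memory/m_group.

```python
import os

PAGE_SIZE = 4096  # assumed page size in bytes (typical on x86-64)

# Largest signed 64-bit value, rounded down to a multiple of the page size.
UNLIMITED = (2**63 - 1) // PAGE_SIZE * PAGE_SIZE


def set_memory_limit(group_dir, limit_bytes, filename="memory.limit_in_bytes"):
    """Write a byte limit into a Cgroup's limit file and read it back."""
    path = os.path.join(group_dir, filename)
    with open(path, "w") as f:
        f.write(str(limit_bytes))
    with open(path) as f:
        return int(f.read().strip())


print(UNLIMITED)  # 9223372036854771712
print(8 * 1024)   # 8192 bytes, i.e. two 4096-byte pages
```

An 8 KB limit therefore allows only two pages of memory, which is far less than an ordinary process needs.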
Create a process and associate it with the m_group created earlier.
You can see that the process was killed by the OOM (out-of-memory) killer because it exceeded the 8 KB resource limit. Check the kernel log for the details with the dmesg command.
Raise the resource limit and create a bash process again. This time the bash process is created normally.
Container
Containerization is built on exactly these two core technologies, Namespace and Cgroup. Namespace is used to isolate the environment, and Cgroup is used to restrict the use of resources.
The creation process of a container can be roughly understood as the following steps.
- First, create a new process with clone() and attach it to the specified Namespace.
- Then write the corresponding pid to the specified Cgroup (echo $pid > /sys/fs/cgroup/$type/tasks), so that this pid is bound by that Cgroup's limits.
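The second step above can be sketched in Python as follows. This is only an illustration of the echo command, assuming the cgroup v1 layout with a tasks file; attach_to_cgroup and its root and group parameters are hypothetical, and the clone() step is not shown.

```python
import os


def attach_to_cgroup(pid, controller, root="/sys/fs/cgroup", group=""):
    """Mirror `echo $pid > /sys/fs/cgroup/$type/tasks` (cgroup v1 layout).

    `controller` plays the role of $type; `group` optionally names a
    sub-directory such as "m_group" under that controller.
    """
    tasks_path = os.path.join(root, controller, group, "tasks")
    with open(tasks_path, "w") as f:
        # The kernel moves the process into the Cgroup when its PID is written.
        f.write(f"{pid}\n")
    return tasks_path
```

For example, attach_to_cgroup(os.getpid(), "memory", group="m_group") would place the calling process under the limits of that group, provided the group exists and the caller has permission to write to it.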
Container engine
Container engines, such as the widely used Docker and LXC, encapsulate these containerization technologies and simplify the process of creating and managing containers on a host. Container orchestration systems, such as Kubernetes, simplify the process of running and managing containers at scale, greatly improving O&M efficiency and productivity.