Problem description
When containerizing nginx, there is a common problem: How do I automatically set the number of nginx worker processes?
In the nginx.conf configuration file of the official nginx container image, there is a worker process configuration.
```
worker_processes  1;
```
It configures nginx to start only one worker process, which works well when the nginx container is limited to a single core.
When we want to give nginx a larger configuration, for example 4c or 16c, we need to make sure nginx also starts a corresponding number of worker processes. There are two ways to do this:
- Modify nginx.conf to set worker_processes to the number of cpu cores.
- Modify nginx.conf to set worker_processes to auto.
The first method is feasible, but it means modifying the configuration file and reloading nginx, and nginx.conf must then be mounted into the container at deploy time, which is a heavy mental burden for people who are not familiar with nginx.
The second method runs into problems on Kubernetes. Observing inside the container, we find that the number of worker processes nginx starts does not follow the cpu limit we set on the Pod, but instead matches the number of cpu cores of the node the Pod is scheduled on.
When the host has many cpu cores but the Pod has a small cpu quota, this causes noticeably slow responses, because the excess workers compete for the quota and each worker is allocated fewer time slices.
Cause of the problem
We know that when Kubernetes sets a container's cpu limit to 2, the container is not really “allocated” 2 cpus; its cpu time is merely restricted by cgroups.
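For reference, such a limit is declared in the Pod spec; a minimal, hypothetical manifest (names are illustrative) might look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        cpu: "2"   # enforced via the cgroup cfs quota, not by dedicating 2 cpus
```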
Let’s go to the host where this Pod is located and check the relevant information.
```
# inside this Pod's cgroup directory on the host (path abbreviated)
$ cat cpu.cfs_quota_us
200000
$ cat cpu.cfs_period_us
100000
```
As you can see, the actual number of cpu cores the Pod can use is bounded by cpu.cfs_quota_us / cpu.cfs_period_us.
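The effective cpu count follows directly from those two values; a quick shell sketch using the figures above:

```shell
# derive the effective cpu count from the CFS bandwidth values shown above
quota=200000    # cpu.cfs_quota_us
period=100000   # cpu.cfs_period_us
cores=$(( quota / period ))
echo "$cores"   # prints 2
```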
But with worker_processes auto, nginx queries the number of cpus via sysconf(_SC_NPROCESSORS_ONLN) (the equivalent of getconf _NPROCESSORS_ONLN). Let's look at what that call does with strace.
```
# illustrative, abbreviated strace output on a 64-core host
$ strace getconf _NPROCESSORS_ONLN
...
openat(AT_FDCWD, "/sys/devices/system/cpu/online", O_RDONLY) = 3
read(3, "0-63\n", ...) = 5
...
64
```
As you can see, getconf _NPROCESSORS_ONLN actually obtains the number of cpus by reading the file /sys/devices/system/cpu/online.
By default on Kubernetes, /sys/devices/system/cpu/online is simply the host's file, so it is not surprising that the number of worker processes nginx starts matches the number of host cpus.
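The online file contains a cpu range string, and the cpu count is the size of that range. A small sketch of the simple single-range case (the 0-63 value is a hypothetical host reading; real files may also contain comma-separated lists):

```shell
# parse a range string like the one in /sys/devices/system/cpu/online
online="0-63"              # hypothetical value read on a 64-core host
first=${online%%-*}        # -> 0
last=${online##*-}         # -> 63
host_cpus=$(( last - first + 1 ))
echo "$host_cpus"          # prints 64
```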
Solution
The solution is actually not hard to come up with: just modify /sys/devices/system/cpu/online in the container.
The community’s lxcfs has solved this problem.
lxcfs
LXCFS is a small FUSE filesystem designed to make a Linux container feel more like a virtual machine. It focuses on the key files in procfs.
As you can see, the /sys/devices/system/cpu/online file that we need is also on lxcfs's list of handled files.
Using lxcfs is also relatively simple: just bind-mount /var/lib/lxc/lxcfs/proc/online from the host to /sys/devices/system/cpu/online in the container.
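On Kubernetes this bind mount can be expressed with a hostPath volume; a hypothetical Pod spec fragment (paths follow the lxcfs layout used in this article, names are illustrative):

```yaml
# fragment of a Pod spec; illustrative only
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: lxcfs-cpu-online
      mountPath: /sys/devices/system/cpu/online
  volumes:
  - name: lxcfs-cpu-online
    hostPath:
      path: /var/lib/lxc/lxcfs/proc/online
```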
When we read /sys/devices/system/cpu/online in the container, the read request is handed to the lxcfs daemon, since the kubelet has bind-mounted /var/lib/lxc/lxcfs/proc/online over that file.
The lxcfs code path that serves this read looks roughly like the following (a simplified sketch, not the verbatim source; the function and helper names are paraphrased):

```c
/* Simplified sketch of lxcfs's cpu counting logic:
 * derive the container's cpu count from its CFS quota and period. */
static int max_cpu_count(const char *cg)
{
        long quota  = read_cpu_cfs_param(cg, "quota");  /* cpu.cfs_quota_us,  e.g. 200000 */
        long period = read_cpu_cfs_param(cg, "period"); /* cpu.cfs_period_us, e.g. 100000 */

        if (quota <= 0 || period <= 0)
                return 0; /* no limit configured: fall back to the host cpu count */

        return quota / period;
}
```
Based on the previous information, you can see that the final value returned is 200000/100000 = 2.
Conclusion
Therefore, with lxcfs in place, nginx can safely set worker_processes to auto without worrying about starting too many worker processes.
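The container's nginx.conf can then simply keep the one-line directive:

```nginx
worker_processes  auto;
```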