The problem
Our Kubernetes ingress controller uses the community ingress-nginx, and we recently ran into a "Too many open files" error.
Preliminary analysis
The error message tells us that nginx has too many open files (nginx opens files when caching user requests, besides the connection sockets themselves). New connections are handled by the nginx workers, so let's look at how many files a worker currently has open.
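A quick way to check (the PID below is only an example; find the real one with ps):

```bash
# find the nginx worker processes inside the ingress-nginx container
ps -ef | grep "nginx: worker"

# count the file descriptors currently open in one worker, e.g. PID 12345
ls /proc/12345/fd | wc -l

# see the per-process limits that worker is running under
cat /proc/12345/limits | grep "open files"
```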
In our case each nginx worker was allowed to open at most 1024 files, and the process we inspected had already opened 910; under heavier traffic it would quickly hit the limit. Also note the ulimit value shown there, which we will come back to later.
Linux basics: fs.file-max vs ulimit
fs.file-max
First, Linux has a system-wide limit on how many files can be open at once: fs.file-max.
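It can be read from procfs or via sysctl:

```bash
# system-wide maximum number of open file handles
cat /proc/sys/fs/file-max
# equivalent, via sysctl
sysctl fs.file-max
```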
Note:
- This value depends on the OS and hardware resources, so it differs between systems. For example, a physical server typically reports a much larger value than a virtual machine (on one of our VMs it was only 808539), and the same hardware can report different values under different operating systems.
- It is the maximum number of open files at the system level; it has nothing to do with any particular user or session.
- It can be changed. If you run a database or web server that needs to open a large number of files, the default file-max may be insufficient, and you can raise it with sysctl, as shown below.
You can change it at runtime with sysctl -w, or persist it by adding fs.file-max=500000 to /etc/sysctl.conf and running sysctl -p.
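For example (500000 is just the value used above; pick what your workload needs):

```bash
# change at runtime (lost on reboot)
sysctl -w fs.file-max=500000

# persist the change across reboots (run as root)
echo "fs.file-max=500000" >> /etc/sysctl.conf
sysctl -p
```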
fs.file-nr
fs also has a sibling value, file-nr, which reports how many file handles are currently allocated. In our case it showed 55296 handles in use; as long as this number stays below file-max, the system can still open new files.
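It can be read the same way (the numbers below are illustrative):

```bash
$ cat /proc/sys/fs/file-nr
55296   0   763860
# columns: allocated handles, allocated-but-unused handles, maximum (fs.file-max)
```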
ulimit
What about ulimit? Is it system level too? No, and this is a classic misconception: ulimit limits the resources available to a single process.
On a host you can change the ulimit defaults by editing /etc/security/limits.conf.
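To inspect the limits of the current shell or of a running process:

```bash
# soft limit on open files for the current shell
ulimit -n

# all per-process resource limits for the current shell
ulimit -a

# limits of an already running process, e.g. PID 12345
cat /proc/12345/limits
```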
In-depth analysis
Back to the original question.
Our nginx ingress controller is running in a container, so the number of files that the nginx worker can open depends on the ulimit in the container.
From the earlier output we know the ulimit inside the container allows 65535 open files. nginx runs multiple workers, and the number of workers depends on the number of CPU cores, so ingress-nginx divides the limit among them: with 56 cores, each nginx worker gets 65535/56 ≈ 1170 files.
So where does the observed per-worker maximum of 1024 open files come from?
Let’s look at how it is calculated in ingress-nginx.
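The code is roughly the following (a paraphrased sketch of the version discussed, not the literal source; sysctlFSFileMax and WorkerProcesses are the project's own names, and despite its name sysctlFSFileMax returns the process ulimit in this version):

```go
// Sketch of ingress-nginx's per-worker open-file calculation.
wp, err := strconv.Atoi(cfg.WorkerProcesses) // worker count, defaults to the number of CPU cores
if err != nil {
	wp = 1
}
maxOpenFiles := (sysctlFSFileMax() / wp) - 1024 // reserve 1024 FDs for the worker itself
if maxOpenFiles < 1024 {
	// the computed value is too small; fall back to 1024
	maxOpenFiles = 1024
}
```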
In other words, 1024 descriptors are reserved for the worker itself (after all, it has to open .so files, logs and so on) and the rest are available for serving requests; if the result is less than 1024, it is rounded up to 1024. That is why we saw a per-worker maximum of 1024 earlier.
Solution
Now the problem is clear: the per-worker limit on open files is too low. How do we fix it?
Can ulimit do it?
The first solution that comes to mind is to modify ulimit.
However, since the ingress controller's nginx runs in a container, we need to change the limit inside the container.
A docker container is essentially just a process, so its ulimit can be set. Taking docker run as an example, you can pass the --ulimit flag in the form <type>=<soft>[:<hard>]; the limit we care about, the maximum number of open files, corresponds to the type nofile.
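For example:

```bash
# raise the open-file limit to 65535 (soft:hard) for this container
docker run --rm -it --ulimit nofile=65535:65535 ubuntu bash

# inside the container
ulimit -n   # -> 65535
```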
As mentioned, ulimit is a per-process limit, so what about the child processes of that process? Let's start another bash inside the container and check whether it keeps the same value.
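A quick check inside the container started above:

```bash
ulimit -n   # parent bash started by docker run: 65535
bash        # spawn a child shell
ulimit -n   # child bash: still 65535
```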
As you can see, the ulimit of the child process (the new bash) is the same as that of its parent (the bash started by docker run): child processes inherit the parent's limits.
So if we could pass this parameter through to the ingress-nginx container, we could control the maximum number of open files for the nginx workers.
However, Kubernetes does not support a user-defined ulimit. Issue 3595 was opened by thockin back when Docker first introduced the ulimit setting, but years later it is still not closed, and the community is divided on whether to expose this option.
This method does not work.
Changing the ulimit default for docker daemon
The ulimit of processes in containers is inherited from the docker daemon, so we can set a default ulimit in the daemon's configuration and thereby control the ulimit of every container on the ingress machine (including the ingress container itself) by editing /etc/docker/daemon.json.
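A daemon.json along these lines does it (values are illustrative; the docker daemon must be restarted afterwards):

```json
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": 65535,
      "Hard": 65535
    }
  }
}
```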
This approach works, but it is a machine-level change and feels slightly hacky, so we set it aside for now.
/etc/security/limits.conf
Since ulimit can be changed on the host through /etc/security/limits.conf, can't we do the same inside the container?
Let's build a new image that overwrites the /etc/security/limits.conf of the original image.
The contents of limits.conf are as follows.
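Something along these lines (values are illustrative):

```
# /etc/security/limits.conf
# <domain> <type> <item>  <value>
*           soft   nofile  65535
*           hard   nofile  65535
```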
Build a new image (call it xxx) on top of the original, start it without setting --ulimit, and check the ulimit after it boots.
Unfortunately, docker does not read /etc/security/limits.conf (it is applied by PAM during login sessions, which containers never go through), so this doesn't work either.
Modifying the calculation
Someone actually hit this problem back in 2018 and submitted PR 2050.
The author's idea: since ulimit cannot be set, why not set fs.file-max from inside the container (for example with an init container) and change the way ingress-nginx calculates the limit so that it reads fs.file-max instead?
The value is set in an init container.
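A sketch of such an init container (the image and value are illustrative; it has to be privileged because fs.file-max is not a namespaced sysctl and is really being set on the node):

```yaml
initContainers:
  - name: set-file-max
    image: busybox
    securityContext:
      privileged: true
    command: ["sh", "-c", "sysctl -w fs.file-max=1048576"]
```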
The change to ingress-nginx is also very simple: the calculation itself is untouched, only the implementation of sysctlFSFileMax is changed from reading the ulimit to reading fs.file-max (evidently the original comment was wrong all along, because what the function actually returned was the process-level ulimit, not fs.file-max).
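A sketch of the PR's version of sysctlFSFileMax, reading the system-wide value instead of the process limit (simplified; the actual PR may use a sysctl helper):

```go
import (
	"os"
	"strconv"
	"strings"
)

// sysctlFSFileMax returns the system-wide maximum number of open file
// handles (fs.file-max), which is what the name always suggested.
func sysctlFSFileMax() int {
	data, err := os.ReadFile("/proc/sys/fs/file-max")
	if err != nil {
		return 0
	}
	v, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return 0
	}
	return v
}
```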
Even though our ingress runs in a container, we dedicate a whole physical machine to it, so we don't even need the init container: the host's fs.file-max is already large enough, and the calculation above works fine.
The way is clear!
The change gets reverted
However, this PR is older than the ingress-nginx version I run, yet in my version the ulimit is still used (and of course the comment is still wrong).
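The implementation in that version looks roughly like this (paraphrased sketch; note that the name and comment talk about fs.file-max while the body actually reads the process RLIMIT_NOFILE, i.e. the ulimit):

```go
import "syscall"

// sysctlFSFileMax returns the maximum number of open file descriptors.
// Despite the name, this is the per-process ulimit (RLIMIT_NOFILE),
// not the system-wide fs.file-max.
func sysctlFSFileMax() int {
	var rLimit syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rLimit); err != nil {
		return 0
	}
	return int(rLimit.Cur)
}
```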
What happened?
It turns out that three months later another user filed an issue arguing that PR 2050 only covers a special case: you cannot assume that ingress-nginx is entitled to all the resources of the container host. A new PR then reverted the change.
To be fair, using ulimit is reasonable: nginx does run in a container, and it would not be reasonable to hand all of the host's resources to that container.
However, docker's isolation is imperfect: by default the nginx ingress controller determines the number of workers with runtime.NumCPU(), so even if the nginx container is limited to a 4-core CPU, the worker count it gets is still the host's core count!
But you can’t say that the current calculation is wrong and shouldn’t be divided by the number of CPUs, because ulimit is indeed “process” level, and nginx workers are indeed threads.
So, by this reasoning, everything would be fine if we could just set the ulimit of the nginx container; unfortunately, Kubernetes does not support that.
Enough guessing: a configuration option
As the analysis above shows, the way ingress-nginx computes the per-worker maximum of open files is rather messy, so a user later submitted a new PR that skips the guessing entirely and exposes the value directly as a configuration option.
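In recent ingress-nginx releases this appears as the max-worker-open-files key in the controller ConfigMap (assuming that option name; the value below is illustrative, and 0 keeps the old automatic calculation):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  max-worker-open-files: "65535"
```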
It finally cleared up.