Container GC
Exited containers continue to consume system resources: their data takes up space on the filesystem, and the Docker daemon spends CPU and memory keeping track of them.
Docker itself does not automatically delete exited containers, so the kubelet takes over this responsibility. Kubelet container garbage collection deletes exited containers to save space on nodes and improve performance.
While container GC is good for space and performance, deleting a container also wipes the scene of the failure, which makes debugging and locating errors harder, so deleting all exited containers is not recommended. Container cleanup therefore needs a policy, which mainly tells the kubelet how many exited containers to keep. The kubelet startup parameters related to container GC are:
minimum-container-ttl-duration
: how long after the container ends before it can be recycled, default is one minute
maximum-dead-containers-per-container
: how many dead containers can be kept per container, default is 1, a negative setting means no limit
maximum-dead-containers
: the maximum number of dead containers that can be kept on the node, the default is -1, which means no limit
This means that by default, the kubelet runs container GC every minute, a container can be deleted one minute after it exits, and only one exited instance is kept per container.
The core logic of container GC does roughly the following:
- first find the containers that can be cleaned up among the containers on the node (running containers are never candidates), including those that meet the cleanup criteria and those that are not recognized by the kubelet
- directly delete containers that are not recognized and those whose pod information no longer exists
- delete the remaining containers according to the configured container deletion policy
Image GC
Images mainly take up disk space. Although Docker uses image layering so that multiple images can share storage, a long-running node that pulls many images can still end up using too much storage. If images fill up the disk, applications stop working properly. Docker does not clean up images by default: once pulled, they stay on the local disk until they are deleted manually.
In practice many of these images are no longer used, so leaving them on disk wastes a lot of space and is a real risk; the kubelet therefore also cleans up images periodically.
Unlike container cleanup, image cleanup is driven by disk usage: users configure the percentage of disk usage at which images start being cleaned up. Cleanup removes the images that have gone unused the longest first, and an image's last-used time is updated whenever it is pulled or used by a container.
When starting a kubelet, you can configure these parameters to control the policy for image cleanup.
image-gc-high-threshold
: the upper limit of disk usage that will trigger image cleanup when this usage is reached. The default value is 90%.
image-gc-low-threshold
: the lower limit of disk usage, each cleanup will not stop until the usage falls below this value or there are no more images to clean up. The default value is 80%.
minimum-image-ttl-duration
: the image will be cleaned up only if it has not been used for at least this long, configurable in h (hours), m (minutes), s (seconds) and ms (milliseconds) time units, default is 2m (two minutes)
That is, by default, the kubelet starts cleaning up images when usage of the disk that holds the images reaches 90%, and keeps going until usage falls below 80%.
Parameter configuration
Users can adjust the relevant thresholds to optimize image garbage collection using the following kubelet parameters.
image-gc-high-threshold
: the percentage of disk usage that triggers image garbage collection. The default value is 85 (older kubelet releases used 90). If this value is set to 100, image garbage collection is disabled.
image-gc-low-threshold
: the percentage of disk usage that image garbage collection tries to get down to when freeing space. The default value is 80.
minimum-image-ttl-duration
: the minimum age of an unused image before it can be garbage collected. The default is 2m0s.
The following events may be reported during garbage collection.
ContainerGCFailed
: container garbage collection is executed every 1min, and this event is reported if the execution fails.
ImageGCFailed
: image garbage collection is performed every 5min, and this event is reported if it fails.
FreeDiskSpaceFailed
: reported if the space freed during image garbage collection does not meet the requirement.
InvalidDiskCapacity
: reported if the image disk capacity is 0.
ImageGCManager
Initialization
The ImageGCManager is initialized in the kubelet.NewMainKubelet() method.
realImageGCManager.Start()
ImageGCManager is started in the kubelet.initializeModules() method. After starting, imageGCManager performs two tasks asynchronously:
- Update the information about the list of images in use every 5min.
- Update the image cache every 30s.
Start garbage collection
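When the kubelet starts, it launches garbage collection in background goroutines. It will
- perform a container garbage collection every 1min, and report the event ContainerGCFailed if it fails.
- perform an image garbage collection every 5min, and report the event ImageGCFailed if it fails several times in a row (this loop is skipped when --image-gc-high-threshold is 100).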
    func (kl *Kubelet) StartGarbageCollection() {
    	loggedContainerGCFailure := false
    	go wait.Until(func() {
    		if err := kl.containerGC.GarbageCollect(); err != nil {
    			// Container garbage collection runs every 1min; if it fails, the ContainerGCFailed event is reported.
    			klog.Errorf("Container garbage collection failed: %v", err)
    			kl.recorder.Eventf(kl.nodeRef, v1.EventTypeWarning, events.ContainerGCFailed, err.Error())
    			loggedContainerGCFailure = true
    		} else {
    			var vLevel klog.Level = 4
    			if loggedContainerGCFailure {
    				vLevel = 1
    				loggedContainerGCFailure = false
    			}

    			klog.V(vLevel).Infof("Container garbage collection succeeded")
    		}
    	}, ContainerGCPeriod, wait.NeverStop)

    	// If --image-gc-high-threshold=100, image garbage collection is disabled.
    	if kl.kubeletConfiguration.ImageGCHighThresholdPercent == 100 {
    		klog.V(2).Infof("ImageGCHighThresholdPercent is set 100, Disable image GC")
    		return
    	}

    	prevImageGCFailed := false
    	go wait.Until(func() {
    		if err := kl.imageManager.GarbageCollect(); err != nil {
    			// Image garbage collection runs every 5min; if it fails, the ImageGCFailed event is reported.
    			if prevImageGCFailed {
    				klog.Errorf("Image garbage collection failed multiple times in a row: %v", err)
    				// Only create an event for repeated failures
    				kl.recorder.Eventf(kl.nodeRef, v1.EventTypeWarning, events.ImageGCFailed, err.Error())
    			} else {
    				klog.Errorf("Image garbage collection failed once. Stats initialization may not have completed yet: %v", err)
    			}
    			prevImageGCFailed = true
    		} else {
    			var vLevel klog.Level = 4
    			if prevImageGCFailed {
    				vLevel = 1
    				prevImageGCFailed = false
    			}

    			klog.V(vLevel).Infof("Image garbage collection succeeded")
    		}
    	}, ImageGCPeriod, wait.NeverStop)
    }
realImageGCManager.GarbageCollect()
The execution of image garbage collection is as follows:
- get the image disk information from cadvisor.
- calculate the disk capacity and disk utilization.
- if the disk utilization reaches the upper limit set by --image-gc-high-threshold, then image garbage collection is performed.
- if the space freed after image garbage collection does not reach the expected value, report a FreeDiskSpaceFailed exception event.
Freeing disk space (freeSpace)
The detailed process of image garbage collection is as follows.
- list all images that are not in use.
- sort them by last used time and first detected time, from oldest to newest.
- iterate through the list and clean up the images from oldest to newest.
- check again: if the image is in use, do not clean it up. Also check when the image was first detected, to avoid cleaning up images that were pulled only a short time ago, since they may have just been pulled and will soon be used by a container.
- call the runtime interface to delete unused images until enough space has been freed.