1. Go function closures
Go supports closures natively: in Go, a closure is a function literal. The Go specification describes closures as follows.
Function literals are closures: they can refer to variables defined in their enclosing function. These variables are then shared between the enclosing function and the function literal, and they continue to exist as long as they can be accessed.
Closures are used widely in Go, most often together with the go keyword to start a new goroutine, as in the following code from the net/http package in the standard library.
The RoundTrip method in the above code starts a new goroutine by combining the go keyword with a closure. The function running in that goroutine references variables that belong to its enclosing function: t, rw, and req. In other words, the two functions share these variables.
Once a variable that would otherwise be used only inside the RoundTrip method is “shared” with a function running in another goroutine, it can no longer be allocated on the stack; its escape to the heap is certain.
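As a stand-in for this pattern, here is a minimal sketch. It is not the net/http source: the function roundTrip, its string parameter, and the channel resc are hypothetical names chosen for illustration only.

```go
package main

import "fmt"

// roundTrip is a hypothetical stand-in for the pattern described above:
// the function literal started with "go" captures req and resc, which
// belong to the enclosing function.
func roundTrip(req string) string {
	resc := make(chan string, 1)
	go func() {
		// req and resc are shared between roundTrip and this closure.
		resc <- "response to " + req
	}()
	return <-resc
}

func main() {
	fmt.Println(roundTrip("GET /index.html"))
}
```

Building a program like this with `go build -gcflags=-m` makes the compiler print its escape analysis decisions, which is one way to see which variables end up on the heap.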
So here is the question: when can these heap-allocated outer variables, which the closure references (or “captures”), be reclaimed? The example above is easy to reason about: the variables can be reclaimed once the newly created goroutine finishes executing. But what about the following closure?
In this foo function, when can the slice variable i of length 16, which is captured by the closure function, be reclaimed?
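A foo function of the shape discussed here might look like the sketch below. Only the name foo, the 16-element slice i, and the fact that foo returns a closure come from the text; the closure body is an assumption made for illustration.

```go
package main

import "fmt"

// foo returns a closure that captures the local slice i.
func foo() func(int) int {
	i := make([]int, 16) // slice of length 16, captured by the closure below
	return func(n int) int {
		// Assumed body: accumulate into the first element of the captured slice.
		i[0] += n
		return i[0]
	}
}

func main() {
	f := foo()
	fmt.Println(f(1))
	fmt.Println(f(2))
}
```

Because the closure keeps using i after foo has returned, the slice cannot simply disappear together with foo's stack frame.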
Note: when talking about closures, we usually say that a closure references variables of its enclosing function, but the Go compiler's source code uses the term “capture var”, that is, “captured variable”. This article therefore also uses “capture” for variables of the enclosing function (or of functions further out) that are shared with the closure.
The foo function returns a function, so foo's local variable i is captured by the newly created closure that foo returns, and i is not reclaimed when foo returns. In general, an object on the heap stays alive only while something explicitly references it or holds a pointer to it; once it is unreachable, that is, once no reference or pointer to it remains, the GC reclaims it.
So, who exactly is variable i referenced by? When will variable i be reclaimed?
Let’s first go back to an ordinary function that is not a closure.
We see that after f1 returns its own local slice variable i, the variable is referenced by sl in the f2 function. After the execution of the f2 function is complete, the slice variable i will become unreachable and the GC will reclaim the heap memory corresponding to the variable.
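The non-closure case can be sketched as follows. The names f1, f2, i, and sl come from the text; the function bodies are assumptions.

```go
package main

import "fmt"

// f1 returns its local slice, so the backing array must outlive f1's frame.
func f1() []int {
	i := make([]int, 16)
	return i
}

// f2 holds the only reference to that slice through sl.
func f2() int {
	sl := f1()
	sl[0] = 1
	return len(sl)
}

func main() {
	// After f2 returns, nothing references the slice any more, so any heap
	// memory backing it is unreachable and eligible for collection.
	fmt.Println(f2())
}
```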
If we switch to a closure function, such as the foo function earlier, we are likely to use it in this way.
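A sketch of such usage follows, assuming a foo whose closure accumulates into the captured slice (the closure body is hypothetical). It also shows that every call to foo produces a closure with its own captured i.

```go
package main

import "fmt"

// foo is an assumed version of the function discussed in the text.
func foo() func(int) int {
	i := make([]int, 16)
	return func(n int) int {
		i[0] += n
		return i[0]
	}
}

func main() {
	f := foo() // f now keeps foo's i alive
	g := foo() // g captures a different i

	fmt.Println(f(5)) // state accumulates in f's captured slice
	fmt.Println(f(5))
	fmt.Println(g(1)) // g is unaffected by calls to f
}
```

As long as f and g are reachable, the slices they captured are reachable too.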
In this example, the local variables of the foo function stay referenced as long as the closure uses them. This reminds me that functions are “first-class citizens” in Go. Could it be that the closure is itself an object that references the local variables of foo? If so, how does the closure refer to foo's local integer slice variable i in memory, and what does the closure itself look like in the memory layout?
If a programming language places no restrictions on how a certain kind of language element is created and used, so that we can treat such elements like values, then we call those elements “first-class citizens” of the language.
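In Go this means a function can be stored in a variable, placed in a slice, passed as an argument, and returned as a result. A small sketch with hypothetical helper names (apply, adder):

```go
package main

import "fmt"

// apply takes a function as an argument.
func apply(f func(int) int, v int) int { return f(v) }

// adder returns a function as a result; the result captures delta.
func adder(delta int) func(int) int {
	return func(v int) int { return v + delta }
}

func main() {
	inc := adder(1)                         // stored in a variable
	ops := []func(int) int{inc, adder(10)} // stored in a slice
	for _, op := range ops {
		fmt.Println(apply(op, 5)) // passed as an argument
	}
}
```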
2. Go closure function objects
To answer this question, we have to ask Go assembly for help. We generate the assembly code for the closure1.go file above (using Go compiler version 1.16.5).
In the output, we find the assembly that corresponds to the creation of the closure at line 7 of closure1.go.
Assembly is always hard to read, so let’s focus on the first line.
This line of assembly, which corresponds to the creation of the closure at line 7 of the Go source, roughly means: load the address of a structure object into CX. Let’s extract this structure object.
Where does this structure object come from? Evidently the Go compiler created it based on the “characteristics” of the closure. F is the address of the closure's code itself; it is a function after all, so it should live in the same memory area as ordinary functions (a read-only region of the binary, such as the text segment). But what about the integer slice variable i? Is it the local variable i of the foo function that the closure captured? Exactly. If you don’t believe it, we can define another closure that captures more variables to verify this.
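Written as a Go type, the structure described here would look roughly like the sketch below. The field names F and i follow the text; the type name fooClosure and the helper layout are hypothetical, and the real object is an anonymous, compiler-generated struct whose layout is an implementation detail.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fooClosure mirrors, as an assumption, the compiler-generated closure
// object described in the text: a code address followed by the captured slice.
type fooClosure struct {
	F uintptr // address of the closure's code
	i []int   // captured variable (the slice header itself)
}

// layout reports the field offsets of the assumed structure.
func layout() (fOff, iOff uintptr) {
	var c fooClosure
	return unsafe.Offsetof(c.F), unsafe.Offsetof(c.i)
}

func main() {
	fOff, iOff := layout()
	fmt.Println("offset of F:", fOff)
	fmt.Println("offset of i:", iOff)
}
```

The captured slice sits directly after the code pointer, which is why holding the closure object is enough to keep the slice's backing array reachable.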
Here is the generator function for a closure function that captures 3 integer variables.
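A generator of this kind might look like the following sketch. The names a, b, and c appear later in the text; the initial values and the closure body are assumptions.

```go
package main

import "fmt"

// foo returns a closure that captures three integer variables.
// Each call to the closure modifies all three.
func foo() func(int) int {
	a, b, c := 11, 12, 13 // assumed initial values
	return func(n int) int {
		a += n
		b += n
		c += n
		return a + b + c
	}
}

func main() {
	f := foo()
	fmt.Println(f(5)) // (11+5) + (12+5) + (13+5) = 51
	fmt.Println(f(6)) // (16+6) + (17+6) + (18+6) = 69
}
```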
The structure of that closure function in its corresponding assembly code is as follows.
Extract the structure, i.e.
At this point, we have confirmed that it is the closure itself that references the local variables of the enclosing function, or more precisely, the closure structure object that the compiler creates for it in memory. With the unsafe package, we can even inspect this closure function object. Let’s try it with closure2.go as an example, as shown in the following code.
In the above code, we define the closure struct by following the assembly output, so that it matches the closure function object in memory. Each closure object has its own layout, so the trick is to take the definition from the assembly output. We then use an unsafe address conversion to map the in-memory closure object onto an instance of the closure struct. Running the above program gives the following output.
In the above example, the closure function captures external variables a, b and c, which are essentially referenced by a closure memory object created by the compiler. When we call the foo function, the closure function object is created (its address is assigned to the variable f). Thus, the f object keeps referring to variables a, b, and c. Only when f is reclaimed will a, b, and c be reclaimed as unreachable.
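The technique can be sketched as follows. This relies on an assumed layout of the closure object (a code address followed by pointers to a, b, and c, in that order), which is an implementation detail of the gc compiler and not guaranteed by the language. The names closure and inspect and the initial values are hypothetical, and the `//go:noinline` directive is added here only to keep the closure from being optimized differently.

```go
package main

import (
	"fmt"
	"unsafe"
)

// foo returns a closure that modifies a, b, and c, so the closure
// object is expected to hold pointers to them.
//
//go:noinline
func foo() func(int) int {
	a, b, c := 11, 12, 13
	return func(n int) int {
		a += n
		b += n
		c += n
		return a + b + c
	}
}

// closure is an assumed mirror of the compiler-generated closure object.
type closure struct {
	f uintptr // address of the closure's code
	a *int
	b *int
	c *int
}

// inspect reinterprets a func value as a pointer to its closure object.
// A func value is assumed to be a pointer to that object.
func inspect(f *func(int) int) *closure {
	return *(**closure)(unsafe.Pointer(f))
}

func main() {
	f := foo()
	f(5)
	pc := inspect(&f)
	fmt.Printf("a=%d, b=%d, c=%d\n", *pc.a, *pc.b, *pc.c)
	f(6)
	// The same pointers now show the values updated by the second call.
	fmt.Printf("a=%d, b=%d, c=%d\n", *pc.a, *pc.b, *pc.c)
}
```

If the layout assumption holds, the values read through pc change after each call to f, which shows that the closure object and the code of the closure work on the same variables.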
If a closure only reads a captured outer variable, the closure function object does not store a pointer to that variable; it stores a copy of its value. However, if a variable is captured by several closures created in the same function, some of which only read it while others modify it, then the closure function objects still store the variable's address.
Understanding the nature of closures makes it much easier to answer the question in the title of this article. The answer: once the closure function object that captured the variables has been reclaimed, and provided nothing else references those variables, they become unreachable and are subsequently reclaimed by the GC.
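One way to observe this is to attach a finalizer to a captured object and then drop the closure. The sketch below uses hypothetical names (payload, makeClosure). Finalizers run at some unspecified time after the object becomes unreachable, so the program reports what it observes within a timeout instead of assuming an outcome.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// payload is a large object that will be captured by a closure.
type payload struct{ data [1 << 20]byte }

// makeClosure returns a closure that captures p. The finalizer closes
// done when the garbage collector finds p unreachable.
//
//go:noinline
func makeClosure(done chan<- struct{}) func() int {
	p := &payload{}
	runtime.SetFinalizer(p, func(*payload) { close(done) })
	return func() int { return len(p.data) }
}

func main() {
	done := make(chan struct{})
	f := makeClosure(done)
	fmt.Println(f()) // the closure, and therefore p, is still reachable

	f = nil      // drop the only reference to the closure object
	runtime.GC() // ask for a collection; the finalizer runs some time later

	select {
	case <-done:
		fmt.Println("payload finalized after the closure became unreachable")
	case <-time.After(time.Second):
		fmt.Println("payload not finalized within the timeout")
	}
}
```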
3. Summary
Let’s recall the quote from the beginning of the article about the Go language specification’s interpretation of closures: “They will continue to exist as long as they can be accessed”. We can now read this as follows: as long as the closure function object exists, the variables it captured exist too and will not be reclaimed.
This mechanism means that in everyday use we should always keep in mind that variables captured by a closure may be reclaimed late. Consider a scenario where the captured variables take up a lot of memory, and where business needs cause closure function objects to be created in large numbers and executed only after a long delay (timers, for example). Heap usage can then stay high for a long time. We need to check whether the available memory can sustain that level, and if it cannot, we need to consider a different implementation.
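As an illustration of the timer scenario, the sketch below contrasts a delayed closure that captures a large buffer with one that captures only the value it needs. The function names scheduleHeavy and scheduleLight are hypothetical, and whether copying out a small value is possible depends on what the delayed work actually requires.

```go
package main

import (
	"fmt"
	"time"
)

// scheduleHeavy captures buf itself, so the whole buffer stays reachable
// until the timer has fired or been stopped.
func scheduleHeavy(buf []byte, d time.Duration, out chan<- int) *time.Timer {
	return time.AfterFunc(d, func() { out <- len(buf) })
}

// scheduleLight extracts the one value the delayed work needs, so the
// closure captures only an int and buf is not kept alive by the timer.
func scheduleLight(buf []byte, d time.Duration, out chan<- int) *time.Timer {
	n := len(buf)
	return time.AfterFunc(d, func() { out <- n })
}

func main() {
	out := make(chan int, 2)
	scheduleHeavy(make([]byte, 1<<20), 10*time.Millisecond, out)
	scheduleLight(make([]byte, 1<<20), 10*time.Millisecond, out)
	fmt.Println(<-out, <-out)
}
```

With many pending timers, the first variant holds one buffer per timer for the whole delay, while the second holds only a few words per timer.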