In this article we take an in-depth look, at the assembly level and from the perspective of the stack, at how ordinary function calls, struct method calls, and closure calls work in Go.
Preamble
Function call types
Functions in this article refer to any executable block of code in Go. As mentioned in Go 1.1 Function Calls, there are four types of functions in Go.
- top-level func
- method with value receiver
- method with pointer receiver
- func literal
A top-level func is the ordinary function we write most of the time.
Method with value receiver and method with pointer receiver refer to struct methods whose receiver is a value or a pointer, respectively.
A struct method adds new behavior to a user-defined type. What distinguishes a method from a function is the receiver: add a receiver to a function and it becomes a method. The receiver can be either a value receiver or a pointer receiver.
Let’s take two simple structs, Man and Woman, as an example.
(*Man) Say() uses a pointer receiver; (Woman) Say() uses a value receiver.
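The original listing for Man and Woman is not shown here, so the following is only a minimal sketch of what such a pair could look like. The Name field, the string return value, and the sample names are assumptions made for illustration.

```go
package main

import "fmt"

// Man and Woman are illustrative types; the Name field is an assumption.
type Man struct{ Name string }
type Woman struct{ Name string }

// Say has a pointer receiver: m is a *Man, so it can change the caller's value.
func (m *Man) Say() string {
	m.Name = "Mr. " + m.Name
	return "I'm " + m.Name
}

// Say has a value receiver: w is a copy of the caller's Woman.
func (w Woman) Say() string {
	w.Name = "Ms. " + w.Name // only the copy changes
	return "I'm " + w.Name
}

func main() {
	m := Man{Name: "Bob"}
	w := Woman{Name: "Amy"}

	ms := m.Say()
	fmt.Println(ms, "| field after call:", m.Name) // field is now "Mr. Bob"

	ws := w.Say()
	fmt.Println(ws, "| field after call:", w.Name) // field is still "Amy"
}
```

The pointer receiver version changes the caller's struct, while the value receiver version only changes its own copy.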
The definition of function literal is as follows.
A function literal represents an anonymous function.
That is, function literals cover both anonymous functions and closures.
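As a small illustration of the two kinds of function literal, here is a sketch with hypothetical names (apply, counter): one literal that captures nothing, and one that captures a local variable and is therefore a closure.

```go
package main

import "fmt"

// apply calls fn on v; fn can be any function value, including a func literal.
func apply(v int, fn func(int) int) int { return fn(v) }

// counter returns a closure: a func literal that captures the local variable n.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

func main() {
	// An anonymous function that captures no variables.
	double := func(x int) int { return x * 2 }
	fmt.Println(apply(21, double)) // 42

	next := counter()
	next()
	fmt.Println(next()) // 2
}
```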
The following analysis is also expanded according to these types.
Basics
On modern mainstream machine architectures (e.g., x86), the stack grows downward, from high addresses toward low addresses.
Assembly function
Let’s take a look at how an assembly function is defined in Plan 9 assembly:
stack frame size: the space for local variables plus the space needed for calling other functions.
arguments size: the combined size of the arguments and the return values. For example, if the function takes three int64 arguments and returns one int64, the arguments size is sizeof(int64) * 4 = 32 bytes.
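The arithmetic in the example can be checked with a few lines of Go. The helper name argsSize is made up for this sketch, and it only handles int64 values, so alignment padding never comes into play.

```go
package main

import (
	"fmt"
	"unsafe"
)

// argsSize returns the number of bytes needed for nArgs int64 arguments
// plus nResults int64 return values.
func argsSize(nArgs, nResults int) uintptr {
	return uintptr(nArgs+nResults) * unsafe.Sizeof(int64(0))
}

func main() {
	// Three int64 arguments and one int64 return value.
	fmt.Println(argsSize(3, 1)) // 32
}
```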
Stack adjustment
Stack adjustment is achieved by operating on the hardware SP register, for example:
Since the stack grows downward, subtracting from SP with SUBQ allocates the stack frame for the function, and adding to SP with ADDQ releases it.
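To make the direction of growth concrete, here is a toy model in Go. The machine type and the starting address 0x1000 are invented for the sketch; it only tracks the value of SP and does not model memory.

```go
package main

import "fmt"

// machine is a toy model that tracks nothing but the stack pointer.
type machine struct{ sp uint64 }

// subq models SUBQ $n, SP: moving SP down reserves n bytes.
func (m *machine) subq(n uint64) { m.sp -= n }

// addq models ADDQ $n, SP: moving SP back up releases n bytes.
func (m *machine) addq(n uint64) { m.sp += n }

func main() {
	m := machine{sp: 0x1000} // arbitrary starting address

	m.subq(32)
	fmt.Printf("after SUBQ $32, SP: %#x\n", m.sp) // 0xfe0

	m.addq(32)
	fmt.Printf("after ADDQ $32, SP: %#x\n", m.sp) // 0x1000
}
```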
Common instructions
Addition and subtraction operations.
Data handling.
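Constants are written as $num in Plan 9 assembly; they can be negative and are decimal by default. The number of bytes moved is determined by the suffix of the MOV instruction: MOVB moves 1 byte, MOVW 2 bytes, MOVL 4 bytes, and MOVQ 8 bytes.
Another thing to watch for with MOVQ is whether the operand is in parentheses. With parentheses, as in (AX), the operand is the memory at the address held in the register; without them, it is the value of the register itself.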
Address arithmetic.
The 2 in the above code stands for the scale, and the scale can only be 1, 2, 4, or 8.
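On x86 the address of a scaled operand is computed as base + index*scale + displacement, with the scale limited to 1, 2, 4, or 8. The following Go sketch mirrors that calculation; the function name effectiveAddr and the sample numbers are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// effectiveAddr models an operand of the form disp(base)(index*scale):
// address = base + index*scale + disp.
func effectiveAddr(base, index, scale uint64, disp int64) (uint64, error) {
	switch scale {
	case 1, 2, 4, 8:
		// valid scale factors
	default:
		return 0, errors.New("scale must be 1, 2, 4, or 8")
	}
	// Converting a negative disp wraps around, which still yields the
	// right result under modulo 2^64 addition.
	return base + index*scale + uint64(disp), nil
}

func main() {
	addr, err := effectiveAddr(0x1000, 3, 2, 8)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%#x\n", addr) // 0x100e
}
```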
Function call analysis
Direct function calls
We define a simple function here.
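The original listing is not reproduced here. A minimal sketch consistent with the walkthrough below, which calls add with the values 1 and 2, might look like this; the parameter names a and b are assumptions.

```go
package main

// add is the callee examined in the walkthrough.
func add(a, b int) int {
	return a + b
}

func main() {
	add(1, 2)
}
```

One common way to print the assembly for such a file is go tool compile -S -N -l main.go, where -N disables optimizations and -l disables inlining; whether these are the exact flags used for this article is an assumption. Also note that the stack-based argument passing described below is what older toolchains generate on amd64; since Go 1.17 the compiler passes arguments in registers there, so the output of a current toolchain will look different.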
Then use the command to print out the assembly.
Let’s go through the assembly instructions and the stack section by section, starting with the call made in the main function.
Here’s a look at what the above assembly does.
- 0x0000: the offset of the current instruction relative to the start of the current function.
- TEXT: since program code is placed in the .text segment of memory at run time, TEXT is the directive that defines a function.
- "".main(SB): "" stands for the package name and main is the function name. SB is a virtual register that holds the static-base pointer, which is the start address of our program address space.
- $32-0: $32 is the size of the stack frame to be allocated; 0 is the size of the arguments passed in by the caller.
The FUNCDATA and PCDATA instructions contain information that is used by garbage collection; these instructions are added by the compiler.
When main starts executing, because the stack grows from high addresses to low addresses, SUBQ $32, SP subtracts the stack frame size from SP, which allocates 32 bytes of stack memory.
Of these, 8 bytes (24(SP) to 32(SP)) store the caller’s frame pointer BP.
Parameter value 1 is stored on the stack at 0(SP) to 8(SP).
Parameter value 2 is stored on the stack at 8(SP) to 16(SP).
Note that our parameter type here is int, which is 8 bytes on a 64-bit machine. Although the stack grows from high addresses to low addresses, each block of data inside the stack is stored from its low address to its high address, and a pointer to the block points at its lowest address.
In summary, this call tells us two things about how parameters are passed.
- Parameters are passed entirely on the stack.
- Arguments are placed on the stack from right to left, so the first argument ends up at the lowest address.
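The layout described above can be modeled as a 32-byte array in which index 0 corresponds to 0(SP). This is only a sketch: the helper name buildFrame and the saved BP value are invented, and treating 16(SP) as the return value slot is an inference from the fact that it is the only 8 bytes of the frame left over.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// buildFrame models main's 32-byte frame just before CALL add.
// Index 0 corresponds to 0(SP), the lowest address in the frame.
func buildFrame(a, b, savedBP uint64) [32]byte {
	var frame [32]byte
	binary.LittleEndian.PutUint64(frame[0:8], a)  // 0(SP): first argument
	binary.LittleEndian.PutUint64(frame[8:16], b) // 8(SP): second argument
	// frame[16:24] stays zero: 16(SP) is assumed to be the return value slot.
	binary.LittleEndian.PutUint64(frame[24:32], savedBP) // 24(SP): saved BP
	return frame
}

func main() {
	frame := buildFrame(1, 2, 0xc000000000) // the BP value is made up
	// Each value starts at the lowest address of its 8-byte block.
	fmt.Println(frame[0], frame[8]) // 1 2
}
```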
Here is the layout of the stack before the add function is called.
Once the arguments are in place, main executes the instruction CALL "".add(SB), which first pushes the return address into main (8 bytes) onto the stack, changing the stack pointer SP, and then jumps to the instructions of add.
Let’s go to the add function.
Since the stack pointer SP changes, let’s look at the data on the stack before reading the assembly of this function; we can inspect it with dlv.
When execution reaches the call to add, we can print the current Rsp and Rbp registers with regs.
The difference between the address values of Rsp and Rbp is 24 bytes, which is consistent with our example above.
Then, after entering the add function, we can use regs to print the Rsp and Rbp registers again.
The difference between the address values of Rsp and Rbp is now 32 bytes, because the CALL instruction pushes the return address (8 bytes) onto the top of the stack.
The data itself does not move, but the positions of parameter values 1 and 2 relative to SP change.
Parameter value 1, which was at 0(SP) to 8(SP), is now at 8(SP) to 16(SP).
Parameter value 2, which was at 8(SP) to 16(SP), is now at 16(SP) to 24(SP).
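The shift is just the size of the return address. A tiny sketch of the conversion, with a made-up helper name calleeOffset:

```go
package main

import "fmt"

// retAddrSize is the number of bytes CALL pushes on amd64.
const retAddrSize = 8

// calleeOffset converts an offset relative to the caller's SP (before CALL)
// into the offset relative to SP on entry to the callee.
func calleeOffset(callerOffset int) int {
	return callerOffset + retAddrSize
}

func main() {
	fmt.Println(calleeOffset(0), calleeOffset(8)) // 8 16
}
```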
We can also print out the parameter values via dlv.
The following is the layout of the call stack after entering the add function.
From the analysis of the add function call above we can also conclude that
- The return value is passed on the stack, and its stack space sits above the arguments, at a higher address
After the call, we look at the return of the add function.
After add returns, the BP pointer is restored and then the ADDQ instruction increases SP to shrink the stack. From this we can see that the caller is ultimately responsible for cleaning up the stack.
To summarize, the rules for stack-based calls are as follows.
- Parameters are passed entirely on the stack.
- Arguments are placed on the stack from right to left, so the first argument ends up at the lowest address.
- The return value is passed on the stack, and its stack space sits above the arguments, at a higher address.
- After the function call, the caller takes care of the stack cleanup.
Struct methods: value receivers and pointer receivers
As we mentioned above, there are two types of method receivers in Go, a value receiver and a pointer receiver. Here’s an example to illustrate.
You can generate the assembly output yourself and read it alongside this article.
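The listing for this example is not shown here. The sketch below is a reconstruction based on the names and values used in the walkthrough (Point{2, 5}, VIncr, PIncr, and the results 12 and 15); the field names X and Y appear in the text, while the parameter name factor and the value 10 are inferred.

```go
package main

import "fmt"

type Point struct {
	X int
	Y int
}

// VIncr has a value receiver: it works on a copy of the Point.
func (p Point) VIncr(factor int) {
	p.X += factor
	p.Y += factor
}

// PIncr has a pointer receiver: it modifies the caller's Point.
func (p *Point) PIncr(factor int) {
	p.X += factor
	p.Y += factor
}

func main() {
	p := Point{2, 5}

	p.VIncr(10)
	fmt.Println(p.X, p.Y) // 2 5

	p.PIncr(10)
	fmt.Println(p.X, p.Y) // 12 15
}
```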
Calling the value receiver method
At the assembly level, a struct is just a piece of contiguous memory, so p := Point{2, 5} is initialized as follows.
Our struct Point consists of two int fields, and int is 8 bytes on 64-bit machines. So XORPS first zeroes the 128-bit X0 register, and MOVUPS then stores X0 to 24(SP), which zeroes the 16-byte block that holds the struct. After that, the two fields of Point are set to 2 and 5.
The next step is to set up the arguments and then call the p.VIncr method.
Up to this point, the structure of the stack frame before the call probably looks like this.
Now look at the assembly code for p.VIncr.
The structure of the stack frame after the call will probably look like this.
From the above analysis we can see that, when calling VIncr, the caller copies the values onto the stack as arguments to VIncr, and the changes made inside VIncr only modify those copies on the stack (the last two argument slots), not p itself.
Calling the pointer receiver method
In main, the call is made with the following instructions.
From the above assembly, we know that AX holds the address of 24(SP), and that this pointer is also written to 0(SP), the first argument slot. That is, both AX and the first argument hold the address of 24(SP).
The entire stack frame structure should look like the following.
Now look at the assembly code for p.PIncr.
This method is interesting and a little roundabout: most of the operations go through pointers, so a change made through one pointer is visible through the other.
Here is a step-by-step analysis.
The first instructions load the pointer stored at 8(SP) into both AX and CX, and then load the value at the address held in AX into AX.
Next, the parameter at 16(SP) is added to AX, so AX now holds 12. AX is then stored to the memory that CX points to. As we saw above, CX holds the pointer stored at 8(SP), so the value that 8(SP) points to is modified as well.
We can verify this by printing the registers with regs in dlv.
We can then look at the values stored in 8(SP) and CX.
You can see that both hold the same pointer, the address of 32(SP).
Then we can print the value that the pointer refers to.
At this point, the stack frame looks like this.
Let’s move on to the next instructions.
Here the address stored at 8(SP) is again loaded into AX and CX.
We single-step with the step-instruction command until the line MOVQ "".p+8(SP), CX has executed, and then look at where AX points.
You can see that AX and CX point to the same memory address. We then move on to the following instructions.
As mentioned earlier, a struct is allocated as one contiguous block of memory, and 32(SP) to 48(SP) on the stack hold the struct instantiated as variable p. So in the printout above, 824634074984 is the address of p.X, and the address of p.Y is 824634074984+8. We can also print the value stored at that address with dlv.
So MOVQ 8(AX), AX adds 8 to the address held in AX, loads the value 5 stored there, and assigns it to AX.
AX then works out to 15, and the result is written to the memory at address CX+8, which modifies the value stored at 40(SP).
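The same base-plus-offset access can be written out in Go with the unsafe package. This is only an illustration of what 8(AX) means; the helper name fieldY is made up, and the offset of Y is 8 only on platforms where int is 8 bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Point struct {
	X int
	Y int
}

// fieldY reads p.Y by adding the field offset to the struct's base address,
// which is what MOVQ 8(AX), AX does on a 64-bit machine.
func fieldY(p *Point) int {
	addr := unsafe.Pointer(uintptr(unsafe.Pointer(p)) + unsafe.Offsetof(p.Y))
	return *(*int)(addr)
}

func main() {
	p := Point{2, 5}
	fmt.Println(unsafe.Offsetof(p.Y)) // 8 on 64-bit platforms
	fmt.Println(fieldY(&p))           // 5
}
```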
At the end of this method, the stack frame is as follows.
From the above analysis we can see something interesting: when a pointer receiver method is called, the pointer to the struct is first copied onto the stack, and everything inside the method then operates through that pointer.
Summary
When a value receiver method is called, the caller writes a copy of the receiver’s value to the stack, and the callee operates on that copy in the caller’s stack frame.
With a pointer receiver method, the caller writes the address of the receiver to the stack instead, so changes made by the callee show up directly in the receiver’s struct after the call.
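Go’s method expressions make this difference visible in source code, because they turn the receiver into an explicit first argument. The sketch below reuses the Point type from the walkthrough; the parameter name factor and the value 10 are inferred from the results 12 and 15.

```go
package main

import "fmt"

type Point struct{ X, Y int }

func (p Point) VIncr(factor int)  { p.X += factor; p.Y += factor }
func (p *Point) PIncr(factor int) { p.X += factor; p.Y += factor }

func main() {
	p := Point{2, 5}

	// Method expressions take the receiver as their first argument.
	byValue := Point.VIncr      // func(Point, int)
	byPointer := (*Point).PIncr // func(*Point, int)

	byValue(p, 10) // passes a copy of p
	fmt.Println(p) // {2 5}

	byPointer(&p, 10) // passes the address of p
	fmt.Println(p)    // {12 15}
}
```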
Function literals (func literal)
Function literals (func literal) in Go mainly cover anonymous functions and closures.
Anonymous functions
I’m going to analyze this with a simple example.
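The example used for this analysis is not shown here. A hypothetical stand-in, an anonymous function stored in a variable and then called, could look like the following; the name double and the argument 100 are assumptions, not the original listing.

```go
package main

import "fmt"

// double holds a function value created from a func literal that
// captures no variables.
var double = func(x int) int {
	x += x
	return x
}

func main() {
	// Calling through the variable jumps to the literal's entry address.
	fmt.Println(double(100)) // 200
}
```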
Let’s look at its assembly below.
After the analysis above, you should be able to see what this assembly is doing. Calling an anonymous function really just means passing around the entry address of that function and calling it.
Closures
What are closures? Wikipedia describes closures in the following way.
A closure is a record storing a function together with an environment.
In other words, a closure is an entity consisting of a function and its associated reference environment.
I’ll illustrate it here with a simple example.
Since a closure carries its context with it, take the test example, where the variable x changes with each call to the f() function. As we know from the other calls above, a variable stored on the stack becomes invalid when its stack frame exits, so a variable captured by a closure escapes to the heap.
We can perform an escape analysis to prove this.
You can see that the variable x escapes to the heap.
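The listing for the test example is not shown here. The following reconstruction is based on the values in the walkthrough (x starts at 100 and each call adds 100); returning x from the closure is an addition made here so the effect can be observed, and is not part of the original example as described.

```go
package main

import "fmt"

// test returns a closure that captures x. Because the closure outlives
// test's stack frame, the compiler moves x to the heap.
func test() func() int {
	x := 100
	return func() int {
		x += 100
		return x
	}
}

func main() {
	f := test()
	f()
	f()
	fmt.Println(f()) // 400
}
```

Building with go build -gcflags=-m prints the compiler’s escape analysis decisions, which is one way to see that x is moved to the heap.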
Let’s take a look at the assembly directly.
Let’s look at the main function first.
This assembly is the same as for any other function call, so there is little to add: before calling test, main only sets up its stack frame.
Here is a direct look at the test function.
Let’s look at this assembly step by step.
This step writes the address of the type.int type descriptor to (SP) via AX and then calls runtime.newobject to request a block of memory. The returned address is written to 24(SP) via AX, which amounts to allocating memory for variable x, and finally the value of x is set to 100.
At this point the stack frame structure should look like this.
This structure represents a closure, and the memory address of the created structure is then placed in the AX register.
This assembly instruction then writes the memory address stored in AX to (SP).
Here a new block of memory is requested, and its address is written from AX to 16(SP).
Here the address of test.func1 is written to CX, and CX is then stored to the memory that AX points to. The address saved at 16(SP) is then loaded into AX again; since AX already holds that value, nothing changes, and I don’t know why the compiler generates this instruction.
Because the address in AX was written to 8(SP) and the address at 16(SP) was loaded back into AX, this sequence updates three places in total. The resulting stack frame structure is as follows.
24(SP) holds the address of variable x, which is loaded into CX here. The value stored at 8(AX) is then loaded into DI, and finally CX is stored to 8(AX).
Here is a little more detail on what AX refers to at this point.
AX -> address of test.func1: the memory that AX points to holds the address of test.func1.
8(AX) -> address stored at 24(SP) -> 100: 8(AX) holds the same address that is stored at 24(SP), the address of x, and the memory at that address holds 100.
The value at 16(SP) is written via AX to 48(SP), which lies in the caller’s stack frame; the stack is then shrunk and the callee returns.
After the call returns to main, the stack frame is as follows.
Back in main, execution continues with the instructions that follow the call to test.
After test returns, the address of test.func1 is available at the top of main’s stack. It is loaded from (SP) into AX, and the following instruction performs the call.
Before we get to test.func1, note that (SP) holds the address that AX held in test, that is, the address of the closure structure.
test.func1 is the anonymous function that test wraps and returns.
Since DX holds that same address, the address of variable x can be read from 8(DX) and written to AX. The ADDQ instruction then adds 100 to the value at the address of x.
Summary
From the above analysis, we can see that an anonymous function is really a kind of closure, just one that captures no variables. For a closure, the captured context escapes to the heap so that it is not reclaimed when the stack frame exits.
The call to test in the closure example above looks complicated because of all the values being moved around, but it really only does the following.
- Initialize the memory block for the context information.
- Save the address of the context information to the AX register.
- Write the address of test.func1, the function wrapped by the closure, to the top of the caller’s stack.
The context information here refers to the x variable and the test.func1 function. After these two addresses are written to the memory that AX points to, control returns to main, which gets the function address from the top of the stack, writes it to AX, and executes CALL AX to call it.
Since the address of x is stored at AX + 8, test.func1 reads the value at AX + 8 to obtain the address of x and then updates the closure’s context through it.
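The record described in this summary can be imitated in plain Go. The sketch below is a model, not what the compiler literally emits: the type name closureRec and its field names are invented, and an ordinary Go function value stands in for the raw code address.

```go
package main

import "fmt"

// closureRec models the heap record described above: a code pointer
// followed by the address of the captured variable.
type closureRec struct {
	fn func(ctx *closureRec) // stands in for the address of test.func1
	x  *int                  // stands in for the address of x, 8 bytes in
}

// func1 plays the role of test.func1: it receives the record, as DX does
// in the assembly, and updates x through the stored pointer.
func func1(ctx *closureRec) {
	*ctx.x += 100
}

// test allocates x and the record and returns the record's address.
func test() *closureRec {
	x := new(int)
	*x = 100
	return &closureRec{fn: func1, x: x}
}

func main() {
	f := test()
	f.fn(f) // load the code pointer from the record and call it
	f.fn(f)
	fmt.Println(*f.x) // 300
}
```

Passing the record to the function on every call is what lets test.func1 find x again after test’s stack frame is gone.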
Summary
In this article, we first looked at how function calls work, including how parameters are passed, the order in which they are placed on the stack, and how return values are passed. We then analyzed the difference between value receiver and pointer receiver struct methods, and how closure calls work.
The regs and step-instruction commands of dlv help a lot when analyzing closures; otherwise it is easy to get lost in the pointers passed between registers, so drawing the stack on paper while following along is recommended.