Introduction
Before Go 1.14, scheduling was cooperative: a goroutine had to give up execution on its own initiative. The scheduler therefore could not handle edge cases in which a goroutine never reached a point where it could be preempted, such as tight for loops, or garbage collection having to wait on long-running goroutines. These problems were not solved until Go 1.14 introduced signal-based preemptive scheduling.
Here is an example to verify the difference in preemption between 1.14 and 1.13.
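The original listing is not reproduced here. A minimal sketch of such a program, assuming a calcSum helper, 30 goroutines (the count used in the trace analysis) and a trace file named trace.output written through runtime/trace, might look like the following. The iterations constant is a placeholder, not the author's value, and the line numbers quoted in the stack traces later will not match this sketch.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/trace"
	"sync"
)

// iterations is an assumed workload size; raise it until each goroutine
// runs long enough to be clearly visible in the trace.
const iterations = 1 << 24

// calcSum is pure computation: the loop body makes no function calls, so
// it contains no cooperative preemption points.
func calcSum(n int) int {
	t := 0
	for i := 0; i < n; i++ {
		t += i
	}
	return t
}

func main() {
	// Use a single P so that all goroutines compete for one processor.
	runtime.GOMAXPROCS(1)

	f, err := os.Create("trace.output")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := trace.Start(f); err != nil {
		panic(err)
	}
	defer trace.Stop() // runs before f.Close, flushing the trace

	var wg sync.WaitGroup
	for i := 0; i < 30; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			t := calcSum(iterations)
			fmt.Println("total:", t)
		}()
	}
	wg.Wait()
}
```

Building the same source with Go 1.13 and with Go 1.14 or later should give the two traces compared in the following sections.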
This example traces the calls made during execution with go tool trace. The code calls runtime.GOMAXPROCS(1) to limit the number of CPU cores that can be used simultaneously to 1, so only one P (processor) is used and we get a single-processor scenario. A for loop then starts 30 goroutines that execute the func function, which is purely computational and time-consuming, so the goroutines never go idle and never give up execution voluntarily.
Here we compile and run the program to produce the trace.output file, and then visualize it with go tool trace.
Go 1.13 trace analysis
From the figure above, we can see that:
- there is only one Proc0 in the PROCS column because we have limited the program to one P.
- we started 30 goroutines in the for loop, so we can count the colored boxes in Proc0 and there are exactly 30 of them.
- the 30 goroutines in Proc0 are executed serially, one after the other, without preemption.
- clicking on the details of a goroutine shows that the Wall Duration is about 0.23s, which means the goroutine executed continuously for 0.23s; the total execution time of the 30 goroutines is about 7s.
- the Start Stack Trace of the call stack is main.main.func1:20, which in the code is the start of the func function: go func().
- the End Stack Trace of the call stack is main.main.func1:26, which in the code is the final print of the func function: fmt.Println("total:", t).
As the trace analysis above shows, Go's cooperative scheduling can do nothing about the calcSum function: once it starts executing, everything else has to wait for it to finish. Each goroutine runs for as long as 0.23s, and its execution cannot be preempted.
Go 1.14+ trace analysis
Go 1.14 introduced signal-based preemptive scheduling. From the figure above, you can see that the Proc0 row is densely packed with switches between goroutines, and a goroutine no longer has to run to completion once it starts executing.
The run above takes about 4s in total. The difference from the 1.13 run can be ignored, because the two traces were produced on machines with different configurations (I did not go to the trouble of finding two identical machines).
Let’s take a closer look at the breakdown.
This breakdown shows that:
- this goroutine ran for 0.025s before giving up execution.
- the Start Stack Trace is main.main.func1:21, the same as above.
- the End Stack Trace is runtime.asyncPreempt:50, which is the function executed when the preempt signal is received, so it is clear that the goroutine was preempted asynchronously.
Analysis
Preemption signal installation
runtime/signal_unix.go
When the program starts, runtime.doSigPreempt is registered in runtime.sighandler as the handler function for the SIGURG signal.
initsig
The initsig function iterates through all the signals and calls the setsig function to register a handler for each. We can look at the global variable sigtable to see what information is available for each signal.
The specific meaning of the signals can be found in this introduction: Unix Signals
Note that the flags for the preemption signal in this table are _SigNotify + _SigIgn.
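To make the combined flags concrete, here is a small standalone model of a sigtable entry. The constant names mirror the runtime's flag names, but the types, the has helper and the layout are assumptions made for illustration, not the runtime's own declarations. As I understand the runtime's comments, _SigNotify means the signal may be handed to signal.Notify, and _SigIgn means the default action for the signal is to ignore it.

```go
package main

import "fmt"

// Flag names mirror the runtime's sigtable flags; this is a model only.
const (
	sigNotify   = 1 << iota // let signal.Notify have the signal
	sigKill                 // if not taken by signal.Notify, exit quietly
	sigThrow                // if not taken by signal.Notify, exit loudly
	sigPanic                // if the signal is from the kernel, panic
	sigDefault              // if not explicitly requested, don't monitor it
	sigGoExit               // cause all runtime procs to exit
	sigSetStack             // add SA_ONSTACK to libc handler
	sigUnblock              // always unblock
	sigIgn                  // the default action is to ignore the signal
)

// sigTabT is a simplified stand-in for one sigtable entry.
type sigTabT struct {
	flags int32
	name  string
}

// sigurg models the entry for the preemption signal.
var sigurg = sigTabT{sigNotify + sigIgn, "SIGURG: urgent condition on socket"}

// has reports whether the entry carries the given flag bit.
func has(t sigTabT, flag int32) bool {
	return t.flags&flag != 0
}

func main() {
	fmt.Println("notify:", has(sigurg, sigNotify)) // true
	fmt.Println("ignore:", has(sigurg, sigIgn))    // true
	fmt.Println("kill:  ", has(sigurg, sigKill))   // false
}
```

Because each flag is a distinct bit, adding two flags is equivalent to OR-ing them, which is why the table can write the combination with a plus sign.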
Let's look at the setsig function, which is in the runtime/os_linux.go file.
setsig
Note that when fn equals sighandler, the function that gets installed is replaced with sigtramp. On Linux, the sigaction function then installs the signal handler through the system call functions sys_signal and sys_rt_sigaction.
Implementing preemption signals
This section covers how the signal is processed once it arrives. Logically it belongs after the section on sending the preemption signal, but I followed the installation path first. You can jump ahead to "Preemption signal sending" and come back afterwards.
The analysis above shows that when fn is equal to sighandler, the function that gets installed is replaced with sigtramp, which is implemented in assembly in src/runtime/sys_linux_amd64.s.
This is what gets called in response when the signal is delivered: runtime·sigtramp handles the signal and then goes on to call runtime·sigtrampgo.
This function is in the runtime/signal_unix.go file.
sigtrampgo&sighandler
sighandler does a lot of other signal handling as well. We only care about the preemption part of the code, where the preemption is ultimately performed by the doSigPreempt function.
This function is in the runtime/signal_unix.go file.
doSigPreempt
doSigPreempt handles the preempt signal: it gets the current SP and PC registers and calls ctxt.pushCall to modify them, so that the interrupted code appears to have called the asyncPreempt function declared in runtime/preempt.go.
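The effect of pushCall can be sketched with a toy register context. This is a simplified model under my own assumptions: the context type, the slice-backed stack, the ret helper and the addresses are all hypothetical, and the real implementation works on the saved machine registers of the signaled thread.

```go
package main

import "fmt"

// context is a toy stand-in for the saved register state of the
// interrupted thread. The stack is a slice that grows downward from the
// end, and sp is an index into it rather than a real address.
type context struct {
	pc    uintptr
	sp    int
	stack []uintptr
}

// pushCall makes it look as if the interrupted instruction had called
// target: the old PC is pushed as a return address, then PC is set to
// target.
func (c *context) pushCall(target uintptr) {
	c.sp--
	c.stack[c.sp] = c.pc
	c.pc = target
}

// ret models the return at the end of the injected function: it pops the
// saved PC so that the interrupted code resumes where it left off.
func (c *context) ret() {
	c.pc = c.stack[c.sp]
	c.sp++
}

func main() {
	const userPC, asyncPreemptPC = 0x1000, 0x2000 // made-up addresses

	c := &context{pc: userPC, sp: 8, stack: make([]uintptr, 8)}
	c.pushCall(asyncPreemptPC)
	fmt.Printf("after signal handler: pc=%#x\n", c.pc)

	c.ret()
	fmt.Printf("after injected call returns: pc=%#x\n", c.pc)
}
```

The point of the trick is that the signal handler itself does not run the scheduler. It only rewrites the saved context, so the injected function runs on the goroutine's own stack once the handler returns.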
The assembly code for asyncPreempt is in src/runtime/preempt_amd64.s. It saves the user-state registers and then calls the asyncPreempt2 function in runtime/preempt.go.
asyncPreempt2
This function gets the current G and checks its preemptStop value. preemptStop is set (gp.preemptStop = true) by the suspendG function in runtime/preempt.go when it finds the goroutine in the _Grunning state, marking the G as one that should be stopped. If preemptStop is true, asyncPreempt2 calls preemptPark; otherwise it calls gopreempt_m.
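As I read the runtime source, asyncPreempt2 chooses between two paths based on preemptStop: preemptPark when the flag is set, and gopreempt_m otherwise. The toy model below sketches that branch. The g and gStatus types, the asyncPreempt2Model function and the slice used as a run queue are hypothetical stand-ins, not runtime code.

```go
package main

import "fmt"

type gStatus int

const (
	gRunning   gStatus = iota // stands in for _Grunning
	gRunnable                 // stands in for _Grunnable
	gPreempted                // stands in for _Gpreempted
)

// g is a toy goroutine descriptor with only the fields the model needs.
type g struct {
	status      gStatus
	preemptStop bool
}

// asyncPreempt2Model applies the branch to gp and returns the updated
// run queue.
func asyncPreempt2Model(gp *g, runq []*g) []*g {
	if gp.preemptStop {
		// preemptPark path: the G is parked as preempted and is not put
		// on a run queue; whoever requested the stop resumes it later.
		gp.status = gPreempted
		return runq
	}
	// gopreempt_m path: the G stays runnable and goes back on a queue.
	gp.status = gRunnable
	return append(runq, gp)
}

func main() {
	var runq []*g
	stopped := &g{status: gRunning, preemptStop: true}
	yielded := &g{status: gRunning}

	runq = asyncPreempt2Model(stopped, runq)
	runq = asyncPreempt2Model(yielded, runq)

	// Prints: true true 1
	fmt.Println(stopped.status == gPreempted, yielded.status == gRunnable, len(runq))
}
```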
Let's look at the preemptPark function in runtime/proc.go, which is called to carry out the preemption.
preemptPark
preemptPark changes the status of the current goroutine to _Gpreempted, calls dropg to release the thread, and finally calls the schedule function to continue the scheduling loop with other goroutines.
gopreempt_m
gopreempt_m is closer to a voluntary yield than a preemption: the goroutine gives up execution and then rejoins the run queue to wait to be scheduled again.
Preemption signal sending
Preemption signaling is performed by preemptM.
This function is in the runtime/signal_unix.go file.
preemptM
preemptM calls signalM to send the _SIGURG signal, whose handler was installed at initialization, to the specified M.
The main places where preemptM is used to send preemption signals are as follows:
- the Go background monitor runtime.sysmon sends a preempt signal when it detects that a G has been running for too long.
- the Go GC sends a preempt signal when it scans a stack.
- the Go GC calls preemptall during STW to preempt all Ps and make them pause.
Preemption by the Go background monitor
The system monitor runtime.sysmon calls runtime.retake in a loop to preempt processors that are running a G or are in a system call. retake iterates over all the processors in the runtime.
The main reason the system monitor preempts in a loop is to avoid starvation caused by a G occupying an M for too long.
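The starvation case can be observed from user code. The following is a minimal sketch, assuming Go 1.14 or later: a goroutine spins in a loop without function calls on a single P while the caller sleeps. The spinThenResume name and the stop flag are my own additions so that the spinner can be shut down cleanly. On Go 1.13 and earlier the call would be expected to hang, because nothing can take the P away from the spinner.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spinThenResume runs a call-free busy loop on a single P and returns how
// long the calling goroutine waited before it got to run again.
func spinThenResume() time.Duration {
	prev := runtime.GOMAXPROCS(1) // one P, as in the article's example
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	go func() {
		// The loop body makes no ordinary function calls (the atomic
		// load is typically inlined), so it offers no cooperative
		// preemption point.
		for atomic.LoadInt32(&stop) == 0 {
		}
	}()

	start := time.Now()
	time.Sleep(time.Millisecond) // hand the only P to the spinner
	waited := time.Since(start)

	atomic.StoreInt32(&stop, 1) // let the spinner exit
	return waited
}

func main() {
	fmt.Println("caller resumed after", spinThenResume())
}
```

The reported wait is usually noticeably longer than the 1ms sleep, since the caller only gets the P back after the spinner has been preempted.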
runtime.retake is divided into two main parts:
- a call to preemptone to preempt the current processor.
- a call to handoffp to give up access to the processor.
preempt the current processor
This part gets the current state of P. If it is in the _Prunning or _Psyscall state and at least 10ms have passed since sysmon last observed the P switch to a new G, preemptone is called to send the preemption signal. preemptone is covered in the preemptall section below, so we will not go into it here.
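The time check can be sketched as follows. This is a simplified model based on my reading of runtime.retake: the field names schedtick and schedwhen and the constant forcePreemptNS follow the runtime, while the sysmonTick type and the shouldPreempt function are hypothetical.

```go
package main

import "fmt"

// forcePreemptNS is the time slice after which a running G is preempted.
const forcePreemptNS = 10 * 1000 * 1000 // 10ms in nanoseconds

// sysmonTick models what the monitor remembers about one P.
type sysmonTick struct {
	schedtick uint32 // the P's scheduling counter as last observed
	schedwhen int64  // time (ns) at which that counter value was first seen
}

// shouldPreempt reports whether the G on a P has exceeded its time slice.
// pSchedtick is the P's current scheduling counter, which is assumed to
// advance each time the P starts running a different G.
func shouldPreempt(pd *sysmonTick, pSchedtick uint32, now int64) bool {
	if pd.schedtick != pSchedtick {
		// The P has scheduled since the last check: restart the clock.
		pd.schedtick = pSchedtick
		pd.schedwhen = now
		return false
	}
	return pd.schedwhen+forcePreemptNS <= now
}

func main() {
	const ms = 1000 * 1000
	pd := &sysmonTick{}

	fmt.Println(shouldPreempt(pd, 1, 0))     // false: first observation
	fmt.Println(shouldPreempt(pd, 1, 5*ms))  // false: same G for only 5ms
	fmt.Println(shouldPreempt(pd, 1, 10*ms)) // true: same G for 10ms
	fmt.Println(shouldPreempt(pd, 2, 12*ms)) // false: the P switched Gs
}
```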
call handoffp to give up access to the processor
This part checks whether P is in the _Psyscall state. If so, three conditions are evaluated, and if any one of them is not satisfied, handoffp is called to give up the use of P:
- runqempty(_p_): whether P's run queue is empty.
- atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0: nmspinning is the number of spinning Ms (Ms looking for Gs to steal) and npidle is the number of idle Ps; this checks whether there is an idle P or an M that is already looking for work.
- pd.syscallwhen+10*1000*1000 > now: whether the system call has been running for less than 10ms.
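The three conditions can be combined into a single predicate. The sketch below reflects my reading of the check in runtime.retake; the skipHandoff function and the syscallGraceNS constant are hypothetical names, and the plain arguments stand in for the scheduler state that the runtime reads atomically.

```go
package main

import "fmt"

// syscallGraceNS is how long a P may stay in a system call before it is
// handed off regardless of the other conditions (10ms in nanoseconds).
const syscallGraceNS = 10 * 1000 * 1000

// skipHandoff reports whether retake leaves a P in _Psyscall alone.
// All three conditions must hold; if any one fails, handoffp is called.
func skipHandoff(runqEmpty bool, nmspinning, npidle uint32, syscallwhen, now int64) bool {
	return runqEmpty && // nothing is waiting on this P's run queue
		nmspinning+npidle > 0 && // another M or P can pick up new work
		syscallwhen+syscallGraceNS > now // the syscall is younger than 10ms
}

func main() {
	const ms = 1000 * 1000

	fmt.Println(skipHandoff(true, 0, 1, 0, 5*ms))  // true: leave the P alone
	fmt.Println(skipHandoff(false, 0, 1, 0, 5*ms)) // false: work is queued
	fmt.Println(skipHandoff(true, 0, 0, 0, 5*ms))  // false: no spare M or P
	fmt.Println(skipHandoff(true, 0, 1, 0, 20*ms)) // false: syscall too long
}
```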
Go GC stack scan sends preemption signal
When marking GC roots, Go scans the stack of each G. Before scanning, it calls suspendG to suspend the execution of the G, and after scanning it calls resumeG to resume execution.
This function is in runtime/mgcmark.go:
markroot
markroot switches to G0 before scanning the stack and then calls suspendG, which checks the running state of G. If the G is in the _Grunning state, suspendG sets preemptStop to true and sends a preempt signal.
This function is in runtime/preempt.go:
suspendG
For the suspendG function I have only excerpted the handling of a G in the _Grunning state. This branch sets preemptStop to true, and it is the only place where it is set to true. preemptStop determines how the preempt signal is handled; if you have forgotten how, go back to the asyncPreempt2 function above.
Go GC StopTheWorld preempts all P
Go GC STW is performed by the stopTheWorldWithSema function, which is in runtime/proc.go:
stopTheWorldWithSema
The stopTheWorldWithSema function calls preemptall to send preemption signals to all Ps.
The preemptall function is in runtime/proc.go:
preemptall
preemptone, which preemptall calls for each P, takes the G currently executing on the M that corresponds to the P and marks it as being preempted. Finally it calls preemptM to send a preempt signal to that M.
This function is in runtime/proc.go:
preemptone
Wrap-up
Up to this point, we have taken a complete look at the signal-based preemptive scheduling process. To summarize the specific logic:
- when the program starts, it registers runtime.doSigPreempt as the handler function for the _SIGURG signal;
- at some point, an M1 sends the _SIGURG signal to M2 via the signalM function;
- M2 receives the signal, and the OS interrupts the code it is executing and switches to the signal handling function runtime.doSigPreempt;
- doSigPreempt modifies the execution context so that M2 runs runtime.asyncPreempt, which re-enters the scheduling loop to schedule other Gs.