Goroutine
The barrier to using Goroutines is very low, so they are frequently misused.
Goroutine Leak
The causes of Goroutine leaks are usually:
- Read/write operations on a channel, or lock operations on a mutex, are performed inside the Goroutine, but due to logic problems they block forever in some cases.
- The business logic within the Goroutine enters an infinite loop and resources are never released.
- The business logic within the Goroutine goes into a long wait, while new Goroutines are constantly being added to the wait.
Improper use of channel
Goroutine+Channel is the most classic combination, so many leaks occur here.
The most typical case is the logic problem mentioned above: a read or write on a channel that never finds its counterpart.
Send not receive
First example:
Output results:
In this example, we call the queryAll function multiple times, and inside queryAll we call the query function in a for loop, each call in its own Goroutine. The point is that every result of query is written to the channel ch, while queryAll returns as soon as it has received one value from ch.
Finally, we can see that the number of Goroutines in the output keeps increasing, 2 more each time. That is, each call to queryAll leaks two Goroutines.
The reason is that three values are sent on the channel per call, but the receiving end reads only one of them, so the two remaining senders block forever, which causes the Goroutine leak.
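The pattern described above can be sketched as follows. This is a minimal illustration, not the original listing: queryAllLeaky, queryAllFixed and leakedAfter are hypothetical names, the sleep durations are arbitrary, and the buffered channel is only one of several possible fixes.

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

// query simulates a backend call that takes a few milliseconds.
func query() int {
	n := rand.Intn(10)
	time.Sleep(time.Duration(n) * time.Millisecond)
	return n
}

// queryAllLeaky starts three senders on an unbuffered channel but receives
// only once, so the two slower senders block on the send forever.
func queryAllLeaky() int {
	ch := make(chan int)
	for i := 0; i < 3; i++ {
		go func() { ch <- query() }()
	}
	return <-ch
}

// queryAllFixed gives the channel room for every result, so each sender can
// finish and exit even though only the first value is read.
func queryAllFixed() int {
	ch := make(chan int, 3)
	for i := 0; i < 3; i++ {
		go func() { ch <- query() }()
	}
	return <-ch
}

// leakedAfter reports how many extra Goroutines remain after calling f the
// given number of times and giving in-flight work time to finish.
func leakedAfter(calls int, f func() int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < calls; i++ {
		f()
	}
	time.Sleep(100 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("leaky:", leakedAfter(3, queryAllLeaky))
	fmt.Println("fixed:", leakedAfter(3, queryAllFixed))
}
```

On a typical run the leaky variant leaves two Goroutines behind per call, while the buffered variant leaves none.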
Receive not send
Second example:
Output results:
This example is the opposite of "send but not receive": a Goroutine waits to receive from the channel, but nothing is ever sent on it, so the receiver also blocks forever.
In a real-world business scenario, things are generally more complicated. Usually there is a large amount of business logic, a read or write on one channel goes wrong somewhere inside it, and the Goroutine blocks as a result.
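A minimal sketch of the receive-without-send case, under stated assumptions: waitForSignal and leakedReceivers are hypothetical names, and closing the channel is shown as just one way to release a waiting receiver.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// waitForSignal blocks until a value arrives on ch or ch is closed.
func waitForSignal(ch <-chan struct{}) {
	<-ch
}

// leakedReceivers starts one receiver and reports how many extra Goroutines
// remain shortly afterwards. When release is true the channel is closed,
// which unblocks the receiver; when it is false nothing is ever sent.
func leakedReceivers(release bool) int {
	before := runtime.NumGoroutine()
	ch := make(chan struct{})
	go waitForSignal(ch)
	if release {
		close(ch)
	}
	time.Sleep(50 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("never sent:", leakedReceivers(false))
	fmt.Println("closed:", leakedReceivers(true))
}
```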
nil channel
Third example:
Output results:
This example shows that if you forget to initialize a channel, it stays nil, and any operation on it blocks forever, regardless of whether you are reading or writing.
The normal way to initialize a channel is to call the make function.
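The difference between a nil channel and one created with make can be sketched like this. The helper blocksOnSend is a hypothetical name, and the 50 ms wait is an arbitrary stand-in for "forever"; the initialized channel is buffered only so that a single send succeeds without a receiver.

```go
package main

import (
	"fmt"
	"time"
)

// blocksOnSend reports whether a send on ch is still blocked after a short
// wait. If the send never completes, the helper Goroutine is leaked, which
// is exactly the failure mode being demonstrated.
func blocksOnSend(ch chan int) bool {
	done := make(chan struct{})
	go func() {
		ch <- 1
		close(done)
	}()
	select {
	case <-done:
		return false
	case <-time.After(50 * time.Millisecond):
		return true
	}
}

func main() {
	var uninitialized chan int       // nil: never passed to make
	initialized := make(chan int, 1) // buffered so one send needs no receiver

	fmt.Println("nil channel blocks:", blocksOnSend(uninitialized))
	fmt.Println("made channel blocks:", blocksOnSend(initialized))
}
```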
Strange slow wait
Fourth example:
Output results:
This example shows a classic incident scenario in Go: our application calls the interface of a third-party service.
The third-party interface, however, can sometimes be very slow and not return a response for a long time. As it happens, the default http.Client in Go does not set a timeout.
So each call keeps blocking, the number of Goroutines keeps climbing, and eventually resources are exhausted and an incident follows.
In Go projects, we generally recommend setting a timeout on http.Client at the very least:
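A minimal sketch of a client with a timeout. The slow dependency is simulated with an in-process httptest server, slowCallTimesOut is a hypothetical helper name, and the 100 ms value is an arbitrary choice for the demonstration rather than a recommended setting.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// slowCallTimesOut calls a deliberately slow test server through a client
// with a 100 ms timeout and reports whether the call failed with a timeout.
func slowCallTimesOut() bool {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(2 * time.Second): // pretend to be a slow dependency
		case <-r.Context().Done(): // the client gave up
		}
	}))
	defer slow.Close()

	// Timeout bounds the whole request, including reading the response body.
	client := &http.Client{Timeout: 100 * time.Millisecond}

	resp, err := client.Get(slow.URL)
	if err != nil {
		var urlErr *url.Error
		return errors.As(err, &urlErr) && urlErr.Timeout()
	}
	resp.Body.Close()
	return false
}

func main() {
	fmt.Println("slow call timed out:", slowCallTimesOut())
}
```

With the timeout in place, the calling Goroutine gets an error back instead of waiting indefinitely on the slow service.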
It is also advisable to add measures such as rate limiting and circuit breaking, so that sudden traffic spikes do not bring down the services you depend on.
Forget to unlock
Fifth example:
Output results:
In this example, the first Goroutine locks the sync.Mutex, but it may still be working on business logic, or it may have forgotten to unlock it.
Every subsequent Goroutine that tries to lock the same sync.Mutex then blocks, because the lock is never released. In general, in Go projects, we recommend the following:
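A minimal sketch of the usual pattern, pairing every Lock with a deferred Unlock on the next line. The counter type and the countTo helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a value guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// incr locks the mutex and defers the unlock immediately, so the lock is
// released on every return path, including panics.
func (c *counter) incr() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// countTo increments a shared counter from the given number of Goroutines
// and returns the final value once all of them have finished.
func countTo(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.incr()
		}()
	}
	wg.Wait()
	return c.n
}

func main() {
	fmt.Println("final count:", countTo(10))
}
```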
Improper use of sync lock
Sixth example:
In this example, we use sync.WaitGroup to coordinate Goroutines, simulating a for loop over values that would be passed in from outside.
However, because the number of wg.Add calls does not match the number of wg.Done calls, the wg.Wait call blocks and waits forever.
In a Go project, we would recommend writing it as follows:
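A minimal sketch of keeping Add and Done balanced: call wg.Add(1) right before starting each Goroutine and defer wg.Done() as its first statement. The processAll function and its input values are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processAll starts one Goroutine per name and waits for all of them.
// Add(1) is called once per Goroutine, and each Goroutine defers Done,
// so the two counts always match.
func processAll(names []string) int {
	var wg sync.WaitGroup
	var processed int64

	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			_ = name // placeholder for real work on name
			atomic.AddInt64(&processed, 1)
		}(name)
	}

	wg.Wait()
	return int(processed)
}

func main() {
	fmt.Println("processed:", processAll([]string{"a", "b", "c"}))
}
```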
Verification method
We can call the runtime.NumGoroutine function to get the number of Goroutines that currently exist, and compare the values before and after an operation to know whether there is a leak.
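One way to make the before/after comparison less sensitive to timing is to poll until the count returns to its baseline. This is only a sketch under assumptions: settled and leaks are hypothetical helpers, and the 200 ms limit is arbitrary.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// settled polls until the Goroutine count drops back to baseline or the
// timeout expires, and reports whether the baseline was reached.
func settled(baseline int, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if runtime.NumGoroutine() <= baseline {
			return true
		}
		time.Sleep(5 * time.Millisecond)
	}
	return runtime.NumGoroutine() <= baseline
}

// leaks reports whether f leaves Goroutines behind after it returns.
func leaks(f func()) bool {
	baseline := runtime.NumGoroutine()
	f()
	return !settled(baseline, 200*time.Millisecond)
}

func main() {
	fmt.Println("short-lived goroutine leaks:", leaks(func() {
		go func() {}()
	}))
	fmt.Println("blocked receiver leaks:", leaks(func() {
		ch := make(chan int)
		go func() { <-ch }() // nothing is ever sent on ch
	}))
}
```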
However, in business service scenarios, most Goroutine leaks show up in production and test environments, so it is more common to use PProf:
As long as we request http://localhost:6060/debug/pprof/goroutine?debug=1, PProf will return the stack traces of all current Goroutines (with debug=1, Goroutines that share an identical stack are grouped and counted).
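A minimal sketch of exposing and reading that endpoint. Importing net/http/pprof registers the /debug/pprof/ handlers on http.DefaultServeMux; a real service would typically serve that mux on an address such as localhost:6060, while this sketch uses an in-process httptest server so that it terminates on its own. The names goroutineDump and localDump are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
	"time"
)

// goroutineDump fetches the text-format Goroutine profile from a server
// that exposes the pprof handlers under baseURL.
func goroutineDump(baseURL string) (string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(baseURL + "/debug/pprof/goroutine?debug=1")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// localDump serves the default mux on a temporary local address and
// returns the Goroutine profile of the current process.
func localDump() (string, error) {
	// A long-running service would instead do something like:
	//   go http.ListenAndServe("localhost:6060", nil)
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()
	return goroutineDump(srv.URL)
}

func main() {
	dump, err := localDump()
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("received %d bytes of goroutine profile\n", len(dump))
}
```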
Reference https://eddycjy.com/posts/go/goroutine-leak/