As we all know, goroutines are a core component of Go's concurrency model. They are easy to get started with but bring a variety of difficulties, and goroutine leaks are one of the major ones: a leak often takes a long time to troubleshoot. pprof can be used to track one down, but performance analysis tools like it are typically used only after a problem has already occurred.
Is there a tool that can prevent problems before they happen? Of course there is. goleak, open-sourced by the Uber team, detects goroutine leaks and can be combined with unit tests to catch them early.
goroutine leaks
You may have run into goroutine leaks in day-to-day development. A goroutine leak is usually a goroutine that stays blocked: it lives until the process ends, the stack memory it occupies is never released, and the system's available memory keeps shrinking until the process crashes. A few common causes of leaks:
- The logic inside a goroutine enters an infinite loop and keeps consuming resources.
- A goroutine is used together with a channel or mutex and stays blocked because of improper usage.
- The logic inside a goroutine waits for a long time, causing the number of goroutines to skyrocket.
Next, we use the classic combination of goroutine + channel to demonstrate a goroutine leak.
In this example the channel is never initialized, so both reads and writes on it block forever. An ordinary unit test written for this code does not detect the problem: the test function returns while the goroutine is still blocked, and the run passes.
Go's built-in testing cannot catch this kind of problem, so next we introduce goleak to detect it.
goleak
Using goleak mostly comes down to two functions: VerifyNone and VerifyTestMain. VerifyNone checks for leaks within a single test case, while VerifyTestMain is added to TestMain and is less invasive to the test code.
Use VerifyNone:
When a test that leaks a goroutine is run under VerifyNone, it fails, and the failure output shows the specific code where the goroutine leak occurred. Calling VerifyNone in every test is invasive to the test code, however; the VerifyTestMain method integrates into an existing test suite faster.
The output of VerifyTestMain differs slightly from that of VerifyNone: VerifyTestMain reports the test case results first and the leak analysis afterwards. If goroutines leak in more than one test case, the report cannot pinpoint the specific test in which a leak occurred. To narrow it down, run each test case on its own, for example by looping over the test names and passing each one to go test -run. This shows exactly which test case failed.
goleak implementation principle
Starting from the VerifyNone entry point, the source code shows that it calls the Find method.
Find in turn calls the filterStacks method.
The main purpose of filterStacks is to filter out goroutine stacks that are not part of the detection; if no custom filters are supplied, only the default filters apply.
The default options show that detection is attempted up to 20 times by default, waiting at most 100ms between attempts, and that the default filters are added.
To summarize how goleak works:
goleak uses runtime.Stack() to obtain the stack information of all goroutines currently running, filters out the stacks that do not need to be checked by default, and repeats the check in a loop governed by a default retry count and wait interval. As soon as a check finds no remaining goroutines, it concludes that no goroutine leak has occurred; if goroutines are still present after all the retries, they are reported as leaked.
Summary
In this article we looked at a tool that can find goroutine leaks during testing. It still depends on thorough test cases, which shows how important they are.