When a service requests a resource and the request fails because of a network exception or some other transient problem, it needs a retry mechanism to try the request again. A common practice is to retry up to 3 times, sleeping for a random few seconds between attempts. In most business development scaffolding, the HTTP client already wraps this logic and retries failed requests automatically according to its configuration. This article looks at how one common HTTP client, go-resty, implements request retries, and then collects a few other implementations of retry mechanisms.
Implementation of go-resty retry mechanism
Let’s look at how go-resty retries a request when sending HTTP requests.
// Execute method performs the HTTP request with given HTTP method and URL
// for current `Request`.
// resp, err := client.R().Execute(resty.GET, "http://httpbin.org/get")
func (r *Request) Execute(method, url string) (*Response, error) {
	var addrs []*net.SRV
	var resp *Response
	var err error
	if r.isMultiPart && !(method == MethodPost || method == MethodPut || method == MethodPatch) {
		return nil, fmt.Errorf("multipart content is not allowed in HTTP verb [%v]", method)
	}
	if r.SRV != nil {
		_, addrs, err = net.LookupSRV(r.SRV.Service, "tcp", r.SRV.Domain)
		if err != nil {
			return nil, err
		}
	}
	r.Method = method
	r.URL = r.selectAddr(addrs, url, 0)
	if r.client.RetryCount == 0 {
		resp, err = r.client.execute(r)
		return resp, unwrapNoRetryErr(err)
	}
	attempt := 0
	err = Backoff(
		func() (*Response, error) {
			attempt++
			r.URL = r.selectAddr(addrs, url, attempt)
			resp, err = r.client.execute(r)
			if err != nil {
				r.client.log.Errorf("%v, Attempt %v", err, attempt)
			}
			return resp, err
		},
		Retries(r.client.RetryCount),
		WaitTime(r.client.RetryWaitTime),
		MaxWaitTime(r.client.RetryMaxWaitTime),
		RetryConditions(r.client.RetryConditions),
	)
	return resp, unwrapNoRetryErr(err)
}
Retry flow
The retry flow of Execute(method, url) at request time is as follows.
- If no retry count is set (r.client.RetryCount is 0), r.client.execute(r) sends the request directly and returns the Response and error.
- If r.client.RetryCount is not 0, the Backoff() function is called.
- Backoff() takes a handler function plus option functions such as Retries() and WaitTime(), and calls the handler according to the retry policy. Each call increments attempt and sends one network request.
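Retries(), WaitTime() and the other arguments passed to Backoff() follow what is commonly called the functional options pattern. The sketch below is a minimal, standard-library-only illustration of that pattern. The lowercase names (options, option, retries, waitTime, buildOptions) and the default values are assumptions made for this example, not go-resty's actual definitions.

```go
package main

import (
	"fmt"
	"time"
)

// options holds the retry settings. Field names are illustrative only.
type options struct {
	maxRetries int
	waitTime   time.Duration
}

// option mutates an options value; callers may pass any number of them.
type option func(*options)

// retries overrides the default retry count.
func retries(n int) option {
	return func(o *options) { o.maxRetries = n }
}

// waitTime overrides the default delay between attempts.
func waitTime(d time.Duration) option {
	return func(o *options) { o.waitTime = d }
}

// buildOptions starts from assumed defaults and applies each option in order.
func buildOptions(opts ...option) options {
	o := options{maxRetries: 3, waitTime: 100 * time.Millisecond}
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

func main() {
	def := buildOptions()
	custom := buildOptions(retries(5), waitTime(time.Second))
	fmt.Println(def.maxRetries, def.waitTime)
	fmt.Println(custom.maxRetries, custom.waitTime)
}
```

The benefit of this design is that new settings can be added later without changing the signature of the function that accepts them, and callers only mention the settings they want to override.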
The Backoff function
Let’s focus on what the Backoff() function does. Its code is as follows.
// Backoff retries with increasing timeout duration up until X amount of retries
// (Default is 3 attempts, Override with option Retries(n))
func Backoff(operation func() (*Response, error), options ...Option) error {
	// Defaults
	opts := Options{
		maxRetries:      defaultMaxRetries,
		waitTime:        defaultWaitTime,
		maxWaitTime:     defaultMaxWaitTime,
		retryConditions: []RetryConditionFunc{},
	}
	for _, o := range options {
		o(&opts)
	}
	var (
		resp *Response
		err  error
	)
	for attempt := 0; attempt <= opts.maxRetries; attempt++ {
		resp, err = operation()
		ctx := context.Background()
		if resp != nil && resp.Request.ctx != nil {
			ctx = resp.Request.ctx
		}
		if ctx.Err() != nil {
			return err
		}
		err1 := unwrapNoRetryErr(err)           // raw error, it used for return users callback.
		needsRetry := err != nil && err == err1 // retry on a few operation errors by default
		for _, condition := range opts.retryConditions {
			needsRetry = condition(resp, err1)
			if needsRetry {
				break
			}
		}
		if !needsRetry {
			return err
		}
		waitTime, err2 := sleepDuration(resp, opts.waitTime, opts.maxWaitTime, attempt)
		if err2 != nil {
			if err == nil {
				err = err2
			}
			return err
		}
		select {
		case <-time.After(waitTime):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}
The flow of the Backoff() function is as follows.
Backoff() receives a handler function and optional Option functions (retry options) as arguments.
- Build the default policy (3 retries), then apply each Option that was passed in to customize it.
- Declare the response and error variables for the request.
- Loop while attempt <= opts.maxRetries, so the handler runs at most maxRetries + 1 times:
  - Execute the handler function (send the HTTP request).
  - If the response is not nil and its request carries a context, use that context; otherwise use context.Background(). If the context has already been cancelled or has timed out (ctx.Err() != nil), exit Backoff().
  - Decide needsRetry. By default it is true when the handler returned an error that is not marked as non-retryable. If retry conditions are configured, their results replace this default: each function in opts.retryConditions is evaluated in turn, stopping at the first one that asks for a retry.
  - If needsRetry is false, return.
  - Compute the wait time with sleepDuration(), based on the response, the configured wait time, the maximum wait time and the attempt number. The algorithm is relatively complex; refer to: Exponential Backoff And Jitter.
  - Wait for waitTime before the next attempt. If the context is done first, return ctx.Err().
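For the wait-time step, the following is a minimal sketch of capped exponential backoff with "full jitter", one of the strategies described in Exponential Backoff And Jitter. It is not go-resty's sleepDuration(); the function names backoffCeiling and fullJitter and the 100ms / 2s values are assumptions chosen for illustration, and both durations are assumed to be positive.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoffCeiling returns min(maxWait, base * 2^attempt) without overflowing.
func backoffCeiling(base, maxWait time.Duration, attempt int) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d > maxWait/2 {
			// Doubling again would exceed the cap.
			return maxWait
		}
		d *= 2
	}
	if d > maxWait {
		return maxWait
	}
	return d
}

// fullJitter picks a uniformly random duration in [0, ceiling].
func fullJitter(ceiling time.Duration) time.Duration {
	if ceiling <= 0 {
		return 0
	}
	return time.Duration(rand.Int63n(int64(ceiling) + 1))
}

func main() {
	base, maxWait := 100*time.Millisecond, 2*time.Second
	for attempt := 0; attempt <= 5; attempt++ {
		ceiling := backoffCeiling(base, maxWait, attempt)
		fmt.Printf("attempt %d: ceiling %v, sleep %v\n", attempt, ceiling, fullJitter(ceiling))
	}
}
```

The ceiling doubles on every attempt until it reaches the cap, and the random draw spreads clients out so that many callers failing at the same moment do not all retry at the same moment.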
A simple demo
Here is a request made through a specific HTTP client (a simple wrapper around go-resty).
// getInfo sends a GET request with retries enabled. client, logger, ctx, url
// and args come from the surrounding project and are not shown here.
func getInfo() {
	request := client.DefaultClient().
		NewRestyRequest(ctx, "", client.RequestOptions{
			MaxTries:      3,
			RetryWaitTime: 500 * time.Millisecond,
			// Retry whenever the response is not a success.
			RetryConditionFunc: func(response *resty.Response) (b bool, err error) {
				if !response.IsSuccess() {
					return true, nil
				}
				return
			},
		}).SetAuthToken(args.Token)
	resp, err := request.Get(url)
	if err != nil {
		logger.Error(ctx, err)
		return
	}
	body := resp.Body()
	if resp.StatusCode() != 200 {
		logger.Error(ctx, fmt.Sprintf("Request keycloak access token failed, message: %s, body: %s", resp.Status(), string(body)))
		return
	}
	...
}
According to the go-resty request flow described above, RetryCount is greater than 0 here, so the retry mechanism is used with a retry count of 3. request.Get(url) then enters the Backoff() process, where the retry condition is !response.IsSuccess(): the request is retried until it succeeds or the retries are used up.
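To see the same retry condition in isolation, here is a sketch that uses only the standard library instead of go-resty or the wrapper above. It assumes a hypothetical helper getWithRetry and a local test server flakyServer that returns 503 for the first few requests; both names are made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// flakyServer answers 503 for the first `failures` requests, then 200.
func flakyServer(failures int) *httptest.Server {
	var seen int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if int(atomic.AddInt32(&seen, 1)) <= failures {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// getWithRetry sends GET requests until one returns a 2xx status or maxTries
// attempts have been made. It returns the last status code (0 if the last
// attempt failed at the transport level) and the number of attempts used.
func getWithRetry(url string, maxTries int, wait time.Duration) (status, attempts int, err error) {
	for attempts = 1; attempts <= maxTries; attempts++ {
		status = 0
		var resp *http.Response
		if resp, err = http.Get(url); err == nil {
			status = resp.StatusCode
			resp.Body.Close()
			if status >= 200 && status < 300 {
				return status, attempts, nil
			}
		}
		if attempts < maxTries {
			time.Sleep(wait) // fixed wait; no backoff in this sketch
		}
	}
	return status, maxTries, err
}

func main() {
	srv := flakyServer(2)
	defer srv.Close()

	status, attempts, err := getWithRetry(srv.URL, 3, 10*time.Millisecond)
	fmt.Println(status, attempts, err)
}
```

With two failures configured and three tries allowed, the third request is the first one to succeed. If the server keeps failing, the caller gets the last non-2xx status back with a nil error and has to check the status itself, just as the demo above checks resp.StatusCode().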
Some other implementations of retry mechanisms
As you can see, go-resty’s retry policy is not simple. It is a well-developed, customizable mechanism that takes HTTP request scenarios into account, so it is closely tied to that business scenario.
Let’s take a look at two common implementations of Retry.
Implementation 1
// retry retries ephemeral errors from f up to an arbitrary timeout
func retry(f func() (err error, mayRetry bool)) error {
	var (
		bestErr     error
		lowestErrno syscall.Errno
		start       time.Time
		nextSleep   time.Duration = 1 * time.Millisecond
	)
	for {
		err, mayRetry := f()
		if err == nil || !mayRetry {
			return err
		}
		if errno, ok := err.(syscall.Errno); ok && (lowestErrno == 0 || errno < lowestErrno) {
			bestErr = err
			lowestErrno = errno
		} else if bestErr == nil {
			bestErr = err
		}
		if start.IsZero() {
			start = time.Now()
		} else if d := time.Since(start) + nextSleep; d >= arbitraryTimeout {
			break
		}
		time.Sleep(nextSleep)
		nextSleep += time.Duration(rand.Int63n(int64(nextSleep)))
	}
	return bestErr
}
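Each retry sleeps for a randomly growing interval: it starts at 1ms and, after every sleep, grows by a random amount smaller than its current value. The loop ends when f() succeeds, when f() reports that the error must not be retried, or when the elapsed time plus the next sleep would reach arbitraryTimeout (a value defined elsewhere in the package). On timeout it returns the best error seen so far, preferring the syscall.Errno with the lowest value.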
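The growth of nextSleep can be looked at on its own. The sketch below reproduces only the update rule from the code above and prints the resulting sleep intervals; the function name sleepSchedule is an assumption made for this example, and n is assumed to be non-negative.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// sleepSchedule returns the first n sleep intervals produced by the rule
// nextSleep += rand[0, nextSleep), starting from 1ms.
func sleepSchedule(n int) []time.Duration {
	schedule := make([]time.Duration, 0, n)
	nextSleep := 1 * time.Millisecond
	for i := 0; i < n; i++ {
		schedule = append(schedule, nextSleep)
		nextSleep += time.Duration(rand.Int63n(int64(nextSleep)))
	}
	return schedule
}

func main() {
	var total time.Duration
	for i, d := range sleepSchedule(10) {
		total += d
		fmt.Printf("sleep %2d: %-14v cumulative: %v\n", i+1, d, total)
	}
}
```

Every interval is at least as long as the previous one and strictly less than twice it, so the schedule grows roughly exponentially while staying randomized.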
Implementation 2
func Retry(attempts int, sleep time.Duration, f func() error) (err error) {
	for i := 0; ; i++ {
		err = f()
		if err == nil {
			return
		}
		if i >= (attempts - 1) {
			break
		}
		time.Sleep(sleep)
	}
	return fmt.Errorf("after %d attempts, last error: %v", attempts, err)
}
The function f is called at most attempts times, with a fixed wait of sleep between calls, until f() returns nil. If every attempt fails, the last error is wrapped in a message and returned.
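One thing Implementation 2 does not handle is cancellation: time.Sleep cannot be interrupted. The following sketch shows one possible variant that waits with a select on a timer and ctx.Done(), similar in spirit to the select at the end of Backoff(). The names RetryContext and runDemo are assumptions made for this example, and attempts is assumed to be at least 1.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// RetryContext calls f up to attempts times, waiting sleep between calls.
// It stops early if ctx is cancelled or times out while waiting.
func RetryContext(ctx context.Context, attempts int, sleep time.Duration, f func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = f(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point in waiting after the final attempt
		}
		timer := time.NewTimer(sleep)
		select {
		case <-timer.C:
		case <-ctx.Done():
			timer.Stop()
			return fmt.Errorf("retry cancelled after %d attempts: %w", i+1, ctx.Err())
		}
	}
	return fmt.Errorf("after %d attempts, last error: %w", attempts, err)
}

// runDemo retries an operation that fails `failures` times before succeeding
// and reports how many times the operation was called.
func runDemo(failures, attempts int, sleep, timeout time.Duration) (calls int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	op := func() error {
		calls++
		if calls <= failures {
			return errors.New("temporary failure")
		}
		return nil
	}
	err = RetryContext(ctx, attempts, sleep, op)
	return calls, err
}

func main() {
	fmt.Println(runDemo(2, 3, time.Millisecond, time.Second))    // succeeds on the third call
	fmt.Println(runDemo(5, 3, time.Millisecond, time.Second))    // gives up after three calls
	fmt.Println(runDemo(5, 3, time.Second, 10*time.Millisecond)) // deadline expires during the first wait
}
```

Wrapping the errors with %w rather than %v lets callers use errors.Is to tell a deadline or cancellation apart from a failure of the operation itself.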