String Splicing Definition

Define a function that concatenates strings, with the following signature.

type concatFn func(...string) string

That is, it takes any number of strings and returns their concatenation.

Way 1: Use the + operator

func ConcatWithAdd(strs ...string) string {
    var r = ""
    for _, v := range strs {
        r += v
    }
    return r
}

This approach performs poorly. The reason lies in how Go represents strings.

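// Mirrors reflect.StringHeader: the runtime representation of a string.
type StringHeader struct {
    Data uintptr // address of the first byte
    Len  int     // length in bytes
}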

Under the hood, a string is described by a StringHeader: a pointer to its bytes and a length. Strings are immutable in Go, so the bytes behind an existing string can never be extended in place. Each + therefore allocates a new string and copies both operands into it. Concatenating in a loop repeats this for every element, so the approach is poor both in the number of memory allocations and in running time.

Way 2: Use fmt.Sprintf()

func ConcatWithSprintf(strs ...string) string {
    var r = ""
    for _, v := range strs {
        r = fmt.Sprintf("%s%s", r, v)
    }
    return r
}

fmt.Sprintf is a formatting function, so it does much more than concatenate: it parses the format string for placeholders and receives every argument as an interface value. Although it writes into a []byte buffer internally, calling it once per element is slower overall than using + directly. fmt.Sprintf() is therefore not recommended for plain string concatenation.

Way 3: Use strings.Builder

func ConcatWithStringBuilder(strs ...string) string {
    var b strings.Builder
    for _, v := range strs {
        b.WriteString(v)
    }
    return b.String()
}

strings.Builder also stores its data in a []byte slice, and appending is its only job, so it does better in both time and memory. The slice grows through append, which enlarges the capacity geometrically (roughly 2x for small slices and about 1.25x for large ones; the exact rule depends on the Go version), so repeated concatenation needs far fewer allocations.

Way 4: Use strings.Join()

func ConcatWithStringsJoin(str ...string) string {
    return strings.Join(str, "")
}

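strings.Join also assembles its result in a byte buffer, and it computes the total length first, so it needs only a single allocation. It is designed to insert a separator between elements, however, so even when no separator is needed it still writes one, here an empty string, between every pair of elements. In the benchmark below it allocates less than the strings.Builder loop but takes more time per operation.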

Way 5: Use bytes.Buffer

func ConcatWithByteBuffer(strs ...string) string {
    var b bytes.Buffer
    for _, v := range strs {
        b.WriteString(v)
    }
    return b.String()
}

bytes.Buffer is also backed by a []byte slice. The main difference from strings.Builder is that b.String() allocates new memory and copies the buffered bytes into it, whereas the String() method of strings.Builder returns a string that shares the underlying []byte without copying. This extra allocation and copy makes bytes.Buffer slower than strings.Builder.

Performance Testing

The analysis above suggests a rough ranking of the approaches. The following benchmark measures each of them on the same input of 1000 short strings.

package main

import (
    "strconv"
    "testing"
)

func prepare() []string {
    var r = []string{}
    for i := 0; i < 1000; i++ {
        r = append(r, strconv.Itoa(i))
    }
    return r
}

func BenchmarkConcatWithAdd(b *testing.B) {
    b.StopTimer()
    r := prepare()
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        _ = ConcatWithAdd(r...)
    }
}

func BenchmarkConcatWithSprintf(b *testing.B) {
    b.StopTimer()
    r := prepare()
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        _ = ConcatWithSprintf(r...)
    }
}

func BenchmarkConcatWithStringBuilder(b *testing.B) {
    b.StopTimer()
    r := prepare()
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        _ = ConcatWithStringBuilder(r...)
    }
}

func BenchmarkConcatWithByteBuffer(b *testing.B) {
    b.StopTimer()
    r := prepare()
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        _ = ConcatWithByteBuffer(r...)
    }
}

func BenchmarkConcatWithStringsJoin(b *testing.B) {
    b.StopTimer()
    r := prepare()
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        _ = ConcatWithStringsJoin(r...)
    }
}

Running the benchmark produces the following results.

Running tool: /usr/data/go1.17/go/bin/go test -benchmem -run=^$ -coverprofile=/tmp/vscode-goTqWRSX/go-code-cover -bench . demo

goos: linux
goarch: amd64
pkg: demo
cpu: Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz
BenchmarkConcatWithAdd-4             	    2076	    597540 ns/op	 1494032 B/op	     999 allocs/op
BenchmarkConcatWithSprintf-4         	    1528	   1122768 ns/op	 1526913 B/op	    2999 allocs/op
BenchmarkConcatWithStringBuilder-4   	  103244	     16315 ns/op	   10488 B/op	      12 allocs/op
BenchmarkConcatWithByteBuffer-4      	   77482	     27777 ns/op	   12464 B/op	       8 allocs/op
BenchmarkConcatWithStringsJoin-4     	   74232	     23012 ns/op	    3072 B/op	       1 allocs/op
PASS
coverage: 100.0% of statements
ok  	demo	9.102s

Conclusion

  • Strings are immutable in Go. To make string concatenation faster, reduce the number and the total size of memory allocations.
  • Using + in a loop creates a new string for every intermediate result, so n concatenations create n new strings of increasing size, and performance is poor.
  • Apart from +, the other approaches such as fmt.Sprintf(), strings.Builder, and bytes.Buffer store their data in a []byte slice. A slice that grows geometrically needs far fewer allocations during concatenation and does not duplicate the accumulated data at every step, which improves performance.
  • fmt.Sprintf and strings.Join both do extra work, such as parsing placeholders and writing separators, which is why they are less efficient than strings.Builder.
  • strings.Builder and bytes.Buffer each have a single responsibility, so both perform well. But bytes.Buffer needs one extra allocation and copy to produce the final string, so it is slower than strings.Builder.