During Go 1.19 development, reflect.SliceHeader and reflect.StringHeader went through a life-or-death struggle. Both types were at one point marked as deprecated, but they are widely used where byte slices and strings need to be converted efficiently, and no replacement existed, so the deprecation mark was removed again. Barring surprises, they will be marked as deprecated again in Go 1.20.
Optimization of byte slice and string conversion
Converting directly with string(bytes) or []byte(str) copies the data and performs poorly, so in scenarios that chase the last bit of performance a "hack" is used to convert between the two types. k8s, for example, does it the following way.
More often the following approach is used (rpcx also uses the following approach).
Even the standard library uses this approach.
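As a rough sketch of what such hacks typically look like (the function names are hypothetical, and this is not claimed to be the exact code of k8s, rpcx, or the standard library), one form reinterprets the slice header as a string header through a pointer cast, and the other copies the header fields by hand:

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// bytesToString reinterprets the slice header as a string header.
// No data is copied: the string shares b's backing array, so b must
// not be modified while the string is still in use.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// stringToBytes fills in a slice header that points at the string's data.
// The result must be treated as read-only: writing to it is undefined
// behavior and may crash the program.
func stringToBytes(s string) []byte {
	var b []byte
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data = sh.Data
	bh.Len = sh.Len
	bh.Cap = sh.Len
	return b
}

func main() {
	fmt.Println(bytesToString([]byte("hello"))) // hello
	b := stringToBytes("world")
	fmt.Println(len(b), cap(b), string(b)) // 5 5 world
}
```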
Since byte slices and strings have similar data structures, this "hack" can be used to force the conversion. The two structures are defined in the reflect package. SliceHeader has one more field, Cap, than StringHeader; in both cases the data is stored in an array, and the Data field of each structure holds a pointer to that array.
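To make the layout concrete, here is a small sketch that reads the header fields directly. The struct definitions in the comment are the ones from the reflect package; the helper names (sliceFields, sameData, reinterpret) are made up for this illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// The header types in package reflect are declared as:
//
//	type StringHeader struct {
//		Data uintptr
//		Len  int
//	}
//
//	type SliceHeader struct {
//		Data uintptr
//		Len  int
//		Cap  int
//	}

// sliceFields reads Len and Cap straight from b's slice header.
func sliceFields(b []byte) (length, capacity int) {
	h := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	return h.Len, h.Cap
}

// reinterpret views the first two header fields (Data, Len) as a string.
func reinterpret(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// sameData reports whether s and b point at the same underlying array.
func sameData(s string, b []byte) bool {
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	return sh.Data == bh.Data
}

func main() {
	b := make([]byte, 5, 16)
	copy(b, "hello")
	fmt.Println(sliceFields(b)) // 5 16

	s := reinterpret(b)
	fmt.Println(s, sameData(s, b)) // hello true
}
```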
The new way of Go 1.20
Many projects use the approach above to improve performance, but it relies on unsafe and is quite risky: after the forced conversion the slice may still be modified, so the underlying data can be overwritten or reclaimed, which often leads to unexpected problems. I made a similar mistake when using this approach in RedisProxy, and at the time I thought it was a bug in the standard library.
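A minimal sketch of how this kind of bug shows up, using a hypothetical buffer-reuse scenario (this is not the actual RedisProxy code, and the names are invented for the example):

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString converts without copying; the string aliases b's memory.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// reuseBuffer mimics a proxy that reuses one read buffer for every request.
func reuseBuffer() (first, second string) {
	buf := make([]byte, 4)

	copy(buf, "GET ")
	first = bytesToString(buf) // no copy: first points into buf

	copy(buf, "SET ") // the next "request" overwrites the same buffer
	second = bytesToString(buf)

	return first, second
}

func main() {
	first, second := reuseBuffer()
	// Both values now read "SET ": first was silently changed.
	fmt.Printf("%q %q\n", first, second)
}
```

A string is supposed to be immutable, so a value that changes behind your back like this can easily look like a bug somewhere else.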
Therefore, Go is going to deprecate the two types SliceHeader and StringHeader in 1.20 to avoid misuse.
In Go 1.20, the functions String, StringData and SliceData are added to the unsafe package and, together with Slice (available since Go 1.17), serve as replacements for this high-performance conversion.
func Slice(ptr *ArbitraryType, len IntegerType) []ArbitraryType: returns a slice whose underlying array starts at ptr and whose length and capacity are both len
func SliceData(slice []ArbitraryType) *ArbitraryType: returns a pointer to the underlying array of the slice
func String(ptr *byte, len IntegerType) string: returns a string whose underlying bytes start at ptr and whose length is len
func StringData(str string) *byte: returns a pointer to the underlying bytes of the string
These four functions look very low-level.
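Here is a minimal sketch of the two conversions written with these functions, assuming Go 1.20 or later. The wrapper names StringToBytes and BytesToString are my own; the empty-input checks are a defensive choice, not something the unsafe package requires.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringToBytes returns a byte slice that shares memory with s.
// The bytes of a string must never be modified, so the result is read-only.
func StringToBytes(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// BytesToString returns a string that shares memory with b.
// b must not be modified for as long as the string is in use.
func BytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := StringToBytes("hello")
	fmt.Println(len(b), cap(b)) // 5 5

	fmt.Println(BytesToString([]byte("world"))) // world
}
```

Note that the same aliasing rules apply as with the header-based hack; the new functions only make the intent explicit and remove the dependency on the reflect header types.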
This commit was submitted by cuiweixie. Because it touches a very basic, low-level implementation and adds functions that are likely to be widely used, it was reviewed particularly carefully.
This change even caught the attention of Rob Pike, who had been quiet for many months; he asked why there was only an implementation and no documentation comments: #54858. Of course, the reason is that the feature is still under development and review, but we can see that Rob Pike cares a lot about this change.
cuiweixie even modified the standard library to use these four unsafe functions from his commit.
Performance testing
cuiweixie's commit has not been merged into the master branch yet, so things may still change, but I found that these functions already work with gotip. My understanding is that gotip is consistent with the master branch, isn't it?
Anyway, let’s write a benchmark:
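A benchmark along these lines could look like the sketch below. It is an illustrative reconstruction, not the original test code: the sample data and function names are assumptions, and it needs Go 1.20 or later. The Benchmark functions use the standard signature, so they can be placed in a _test.go file and run with go test -bench; the main function only exists so the file also runs on its own.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
	"unsafe"
)

var (
	sampleString = "hello, gopher! this is a sample payload for the conversion benchmark"
	sampleBytes  = []byte(sampleString)

	// Package-level sinks keep the compiler from optimizing the work away.
	sinkString string
	sinkBytes  []byte
)

// Conversions based on the reflect header types.
func reflectToString(b []byte) string {
	var s string
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	sh.Data, sh.Len = bh.Data, bh.Len
	return s
}

func reflectToBytes(s string) []byte {
	var b []byte
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data, bh.Len, bh.Cap = sh.Data, sh.Len, sh.Len
	return b
}

// Conversions based on the unsafe functions.
func unsafeToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func unsafeToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func BenchmarkPlainToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkString = string(sampleBytes)
	}
}

func BenchmarkPlainToBytes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkBytes = []byte(sampleString)
	}
}

func BenchmarkReflectToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkString = reflectToString(sampleBytes)
	}
}

func BenchmarkReflectToBytes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkBytes = reflectToBytes(sampleString)
	}
}

func BenchmarkUnsafeToString(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkString = unsafeToString(sampleBytes)
	}
}

func BenchmarkUnsafeToBytes(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkBytes = unsafeToBytes(sampleString)
	}
}

func main() {
	for _, bm := range []struct {
		name string
		fn   func(*testing.B)
	}{
		{"plain   bytes->string", BenchmarkPlainToString},
		{"plain   string->bytes", BenchmarkPlainToBytes},
		{"reflect bytes->string", BenchmarkReflectToString},
		{"reflect string->bytes", BenchmarkReflectToBytes},
		{"unsafe  bytes->string", BenchmarkUnsafeToString},
		{"unsafe  string->bytes", BenchmarkUnsafeToBytes},
	} {
		fmt.Printf("%-24s %s\n", bm.name, testing.Benchmark(bm.fn))
	}
}
```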
The actual test results are as follows.
As you can see, without the hack, both conversions are very time-consuming; with the reflect approach, performance improves greatly.
With the new functions in the unsafe package, performance also improves greatly; the time is slightly higher than with reflect, but the difference is negligible.