Go 1.18 added the net/netip package, which grew out of the netaddr library, to represent IP addresses and related operations. Its author, Brad Fitzpatrick, wrote a dedicated blog post about the design principles and final implementation of this library.
A key part of the implementation relies on intern.Value from the intern library. Here are some of my notes and observations on it.
The design goal of netaddr is a single type that supports IPv4, IPv6 without a zone, and IPv6 with a zone at the same time, that is a value type which compares correctly with ==, and that has the smallest possible memory footprint. This is a demanding set of requirements. See the library author's blog post for the detailed design process.
The final implementation is a struct with two fields, addr and z. addr holds the actual IP address (for IPv4, only the lower 32 bits are used). z is a pointer to an intern.Value; it acts as a flag that distinguishes IPv4, IPv6 without a zone, and IPv6 with a zone, and it also records the zone itself. Since the zone can be any string, a correct implementation requires that intern.Value return the same address whenever the string content is the same.
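To make the layout concrete, here is a minimal sketch of such a struct. It is not the library's code: uint128, zone (standing in for intern.Value), and the sentinels z4 and z6noz are names I assume for illustration. It only shows that a struct made of an address plus a pointer can be compared with ==.

```go
package main

import "fmt"

// uint128 is a hypothetical stand-in for a 128-bit address value.
type uint128 struct {
	hi, lo uint64
}

// zone is a hypothetical stand-in for intern.Value.
type zone struct {
	name string
}

// Assumed sentinel pointers for the two zone-less cases.
var (
	z4    = new(zone) // IPv4
	z6noz = new(zone) // IPv6 without a zone
)

// IP mirrors the two-field layout described in the text.
type IP struct {
	addr uint128
	z    *zone
}

func main() {
	a := IP{addr: uint128{hi: 0, lo: 0xc0a80001}, z: z4}
	b := IP{addr: uint128{hi: 0, lo: 0xc0a80001}, z: z4}
	c := IP{addr: uint128{hi: 0, lo: 0xc0a80001}, z: z6noz}

	fmt.Println(a == b) // true: same address bits, same pointer
	fmt.Println(a == c) // false: same address bits, different pointer
}
```

Because == on a struct compares the pointer field by address, equality only works if equal zones always share one pointer, which is exactly the job given to intern.Value.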
Here z is not stored as a string, I guess in order to keep the IP struct as small as possible. In Go, a string always takes 16 bytes (a pointer to the underlying bytes plus an int for the length), which is twice the 8 bytes of a pointer. Using a string would make the implementation easier to understand, though.
Apart from the 8 extra bytes for the pointer, the struct achieves the rest of the goals.
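The size argument can be checked with unsafe.Sizeof. The two struct types below are hypothetical and exist only for this comparison; the numbers in the comments assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// withString stores the zone inline as a string (hypothetical layout).
type withString struct {
	addr [16]byte
	zone string
}

// withPointer stores only a pointer to the zone (hypothetical layout).
type withPointer struct {
	addr [16]byte
	zone *string
}

// headerSizes returns the size of a string header and of a pointer.
func headerSizes() (strSize, ptrSize uintptr) {
	var s string
	var p *string
	return unsafe.Sizeof(s), unsafe.Sizeof(p)
}

// structSizes returns the sizes of the two candidate layouts.
func structSizes() (withStr, withPtr uintptr) {
	return unsafe.Sizeof(withString{}), unsafe.Sizeof(withPointer{})
}

func main() {
	fmt.Println(headerSizes()) // 16 8 on 64-bit platforms
	fmt.Println(structSizes()) // 32 24 on 64-bit platforms
}
```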
Here's a look at how intern.Value achieves the same functionality while saving 8 bytes. Given the requirement that strings with the same content map to the same address, a very straightforward implementation is a Get function backed by a map, values, from each string to its pointer.
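As a minimal sketch of such a straightforward implementation (the field inside Value and the main function are my own assumptions for illustration):

```go
package main

import "fmt"

// Value wraps one interned string.
type Value struct {
	s string
}

// values keeps every Value ever created, keyed by its string.
var values = map[string]*Value{}

// Get returns the same pointer for strings with the same content.
// It is not safe for concurrent use.
func Get(s string) *Value {
	if v, ok := values[s]; ok {
		return v
	}
	v := &Value{s: s}
	values[s] = v
	return v
}

func main() {
	a := Get("eth0")
	b := Get("eth" + "0")
	c := Get("eth1")
	fmt.Println(a == b, a == c) // true false
}
```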
Without considering concurrency, the biggest problem with this implementation is a memory leak: every pointer returned by Get stays referenced by values forever. To fix the leak, you need to bring out the unsafe package, which gets very close to the implementation of intern.Value.
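The following is a rough sketch of that idea, not the library's actual code. The resurrected flag and the mutex mu are my assumptions: the mutex is there because finalizers run on their own goroutine, and the //go:nocheckptr directive silences the pointer checker, since turning a uintptr back into a pointer breaks Go's unsafe rules and relies on a non-moving collector.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"unsafe"
)

// Value is the interned handle; resurrected is guarded by mu.
type Value struct {
	s           string
	resurrected bool
}

var (
	mu     sync.Mutex
	valMap = map[string]uintptr{} // addresses only, not traced by the GC
)

// Get returns the same *Value for equal strings while one is still alive.
//
//go:nocheckptr
func Get(s string) *Value {
	mu.Lock()
	defer mu.Unlock()
	if addr, ok := valMap[s]; ok {
		v := (*Value)(unsafe.Pointer(addr))
		v.resurrected = true // tell a pending finalizer to back off
		return v
	}
	v := &Value{s: s}
	runtime.SetFinalizer(v, finalize)
	valMap[s] = uintptr(unsafe.Pointer(v))
	return v
}

// finalize runs once the GC finds v unreachable.
func finalize(v *Value) {
	mu.Lock()
	defer mu.Unlock()
	if v.resurrected {
		// v was handed out again; check once more in a later round.
		v.resurrected = false
		runtime.SetFinalizer(v, finalize)
		return
	}
	delete(valMap, v.s) // no finalizer re-armed: v is freed by a later GC
}

func main() {
	a := Get("eth0")
	b := Get("eth" + "0")
	fmt.Println(a == b) // true: a is still alive, so its address is reused
}
```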
valMap does not hold a reference to Value; it only records the address of Value as a uintptr obtained through unsafe.Pointer, which the GC does not trace. When all external references to a Value are gone, the GC triggers finalize to do the check. If the Value is still unreferenced after two rounds of finalize, the corresponding address is removed from valMap. The Value itself is then freed in the next GC cycle (since no finalizer is attached this time).
If you add a lock for concurrency protection, this is pretty much the implementation of intern. intern.Value also handles values that are not strings.
The reason for all this trouble is that == compares only one level deep (for a pointer field it compares the address, not what the pointer refers to) and cannot be customized. Given the problems a customizable == would bring, this unsafe trade-off is probably tolerable.
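A small example of the one-level comparison, using two hypothetical struct types of my own: a string field is compared by content, while a pointer field is compared by address only.

```go
package main

import "fmt"

// byValue holds the zone as a string (hypothetical type).
type byValue struct {
	zone string
}

// byPointer holds the zone behind a pointer (hypothetical type).
type byPointer struct {
	zone *string
}

func main() {
	a, b := "eth0", "eth0"

	// String fields are compared by content.
	fmt.Println(byValue{zone: a} == byValue{zone: b}) // true

	// Pointer fields are compared by address, not by pointed-to content.
	fmt.Println(byPointer{zone: &a} == byPointer{zone: &b}) // false
	fmt.Println(byPointer{zone: &a} == byPointer{zone: &a}) // true
}
```

This is why the pointer stored in z has to be unique per string content before == on the whole struct can be trusted.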
One weakness is that if external code also records the address of a Value as a bare integer via unsafe.Pointer, then after some time the Value with the same content may live at a different address.
To be honest, I don't like the implementation of intern.Value. Maybe the underlying library really cannot spare those 8 bytes.