After several years of work, the generic feature is finally going to follow the 1.18 release. This is a milestone. Considering that the original design document of Go generic is rather difficult and has a messy structure, I’ll compile my understanding into a document and share it with you today. Since there is a lot of content and my own understanding of the English design document is limited (especially the type derivation part), mistakes are inevitable. I welcome your readers to criticize and correct me by leaving comments.
Type parameters
Generic is called type parameters in Go. As we know, Go is a statically strongly typed language. We need to specify explicit types for variables when we write code. Generic programming, on the other hand, is exactly writing code that can be adapted to different types, so we need a way to describe different types.
The following is an example from the Go language generic design scheme.
The T
table is the type of the slice member, but the real internal type of T
is not determined at the time of defining Print()
and needs to be specified when the function is called. That is, we need to pass in an additional special parameter to specify the specific type of T
when we call the Print()
function. This special parameter is called a type parameter.
Since the function Print()
needs to receive type parameters, it has to declare the type parameters it needs. Thus, the syntax for declaring type parameters is
Go inserts a set of square brackets between the original function name and the list of function arguments to indicate type arguments. As with function arguments, we need to specify a “type” for each type argument, which the Go language calls a constraint. We’ll analyze this in more detail below. For now, all you need to know is that any
is a special constraint that indicates that the corresponding type parameter can accept any type, i.e., there is no constraint.
All type arguments need to be specified at the time of the function call, so we need another syntax, the example of which is as follows.
When calling the function Print()
, the Go language requires that square brackets be inserted before the function name and the argument list, and that the actual type of the type argument be specified in the square brackets. In the above example, since the actual type of the entry s is []int
, the type int
needs to be passed to the type parameter T
. If you want to print a floating-point slice, you can.
The type parameter can be used not only to declare the type of the function entry, but also to declare the type of the entry, e.g.
Use as follows.
A generic function can declare multiple type parameters, e.g.
|
|
In addition to generic functions, the Go language supports declaring type parameters in type definitions. For example.
If we need to save the int
element, we can define it like this.
|
|
Type definitions that declare type parameters are called Generic Types. For generic types, we can also define generic methods, e.g.
Since the type parameter is already specified when declaring v
, the call to the function Push()
eliminates the need to pass in the type parameter.
Both generic functions and generic types require specific type parameters to be passed in at the time of use. This is natural, yet cumbersome. For example.
For the convenience of developers who use generic functions or types, the Go language supports derivation (inference)** of the actual types of arguments by the actual types of the arguments passed in!
So, the above function call can be abbreviated as follows
The generic type can also be derived.
|
|
With generic derivation, developers can use generic functions and generic types just like normal functions or types, simply and clearly. This is remarkable design. But generic derivation is very complex, and we’ll cover it in detail later in this article.
By now, we’ve covered the most significant generic syntax of the Go language. A brief summary is as follows.
- Functions can define type arguments:
func F[T any](p T) { ... }
. - Types can be specified using type parameters in the argument declaration and in the function body.
- Type parameters can also be specified in type definitions:
type M[T any] []T
. - Type parameters must specify type constraints:
func F[T Constraint](p T) { ... }
. - Using a generic function or type requires passing a type parameter.
- The use of generics can be simplified by reducing the number of specified type arguments through generic derivation.
We discuss the constraints and generic derivation of generics in detail below.
Generic Constraints
We covered in the previous section that type constraints need to be specified for all type parameters. We also introduced that any
represents a special constraint that accepts all types.
So why do type parameters need constraints? Look at the following example.
Here the type parameter T
is bound to any
, so the elements of the slice s
can be of any type. This means that the elements of s
can have no String()
method. For example, let’s try to execute.
|
|
At this point, the actual type of T
is int
, so the type of s
is []int
, and thus the type of v
is int
. Obviously, int
has no String()
method and is bound to report an error!
So, for the Stringify
function, we need to restrict the scope of the type parameter T
. Specifically, we can only pass T
types that have a String()
method.
As another example.
If we want to determine the maximum of two integers, we can
|
|
But what if we want to compare two plurals?
|
|
Such a call will report an error. Why? Because complex numbers cannot be compared in size, so in Go complex(64|128)
does not support the comparison operator!
So, for the Max
function, we also need to restrict the range of T
values. Specifically, we can only pass T
types that support the >
operation.
This is the reason why we need to specify constraints on generic types. We need generic constraints to qualify the functions and operators supported by the type parameter.
One thing to be clear, however, is that adding a generic constraint does not mean that a compilation error will no longer be reported. If you pass the wrong type when calling a generic function, you will still get an error, but the compiler will explicitly report the error at the very beginning of the call, not when you get to the corresponding function body. If there is no constraint, then the compilation error reporting hierarchy can be very deep and extremely unfriendly for developers to troubleshoot errors (see C++’s template error reporting).
If a type constraint does not restrict the functions implemented by the type and also does not restrict the operators supported by the type, then it means that the corresponding type argument can accept any type. This particular constraint is any
.
The Go language can already use interface
to restrict the methods that need to be implemented. If you want to support all objects, you can use the infamous interface{}
. Because interface{}
is so stinky and long, Go has officially introduced the any
keyword to replace it. And we can now use any
to replace interface{}
in non-generic code, so we can look forward to that.
The Go team took a number of factors into account and decided to extend the existing interface
to support restricted operators when used as a generic constraint. The syntax is discussed in the following sections.
Before we dive into the discussion of generic constraints, we need to talk about the any
constraint.
any
constraint
Because there are no restrictions on types, we can only write code with a syntax that is supported by all types: the
- declare or define variables
- Assign values to each other before variables of the same type
- Use as a parameter or return value of a function
- Get the address of the corresponding variable
- Convert the corresponding variable to
interface{}
or assign it to a variable of typeinterface{}
. - Convert variables of type
interface{}
to variables of the corresponding type:t, ok := v.(T)
- Use the corresponding type in the switch type enumeration:
switch v.(type) { case T: /* ... */ }
- Construct composite types, such as
[]T
- pass to some built-in function, e.g.
p := new(T)
So any
is not as free as you want it to be, there is no absolute freedom.
Functions Constraints
Back to the Stringify
generic function above
We want all values of the type parameter T
to implement the String() string
function, so we can.
From form to content there is no difference from a normal interface. We can then modify the Stringify
function to
At this point, the following code will report a compile error.
|
|
Because the int
type does not implement the String() string
method, it cannot be passed to the type parameter T
.
Instead, the following method will work properly.
Instead of declaring the Stringer
interface separately, we can also write it as follows
The effect is the same. If we don’t want to put any restrictions on the scope of T
, we can write it as follows
|
|
Isn’t that a lot harder to read than func Stringify[T any](s []T) { ... }
is a lot harder to read?
Of course, we can also use someone else’s well-defined interface to restrict the type parameters, e.g.
The above is the main content of the function constraint, the following we discuss the operator constraints.
Operators Constraints
Returning to the previous example.
The function Max
requires that all T
s need to support the >
operator. How can we express this constraint? One way is to convert all operator operations into function calls, so that we can use the interface to constrain operator operations. But this approach is very complicated to implement. In the end, the Go language officially chose a less elegant but very easy to implement approach: type collections.
The Go language does not allow overloading operators. That is, only Go’s built-in objects can support operator operations. We can’t declare a struct
and then try to compare the size of the corresponding variables using >
. This makes it easier to restrict the operators to objects. Since the built-in types are limited, we can enumerate all the supported types.
If we want to restrict the range of type arguments to all Go’s built-in signed integers, we can.
So PredeclaredSignedInteger
only allows passing in the five built-in signed integer types, passing in any other type will report a compilation error.
We know that the Go language allows redefining its own types, e.g.
|
|
Although the underlying type of MyInt8
is still int8
, MyInt8
does not match the PredeclaredSignedInteger
constraint. But MyInt8
supports exactly the same operator operations as int8
, so we need a syntax to indicate that we can match both int8
and MyInt8
. So the Go language introduced the concept of Approximation Constraint, with the following syntax.
Note that ~
is added here before int8
to indicate an approximate match. This is fine as long as the underlying type is int8
. So MyInt8
can match the Int8
constraint.
With the approximation constraint, our expressiveness is instantly taken to the next level. All types that support comparison operators can be written as the following constraint.
We can rewrite the Max
function above as follows
This time we execute Max(1+2i, 3+4i)
again and it will trigger a compilation error.
Composite Constraints
In a generic constraint, we can enumerate not only possible basic or approximate types by |
, but also other constraints. I personally call this a composite constraint.
Concatenation Constraints
For example, the constraint that matches all signed integers is
The constraint to match all unsigned integers is
So the constraint to match all integers can be abbreviated as
The essence here is to use |
to denote the relation of a concatenation, the result of which is that Integer
can match the concatenation of all results matched by SignedInteger
and UnsignedInteger
.
Intersection Constraints
We can represent the intersection of two or more constraints, e.g.
Here we embed two constraints Integer
and Stringer
in the StringableInteger
constraint, representing the intersection of the two matching results. The types that conform to this constraint are not only integers in the underlying type, but also implement the String() string
method.
We can also just list the corresponding types and the list of functions that need to be implemented, e.g.
For some simple usage scenarios, we can even omit the interface
keyword. For example.
|
|
It can be directly simplified to.
|
|
Generic Constraints
The Go language also supports declaring type parameters in constraints! For example.
We need to specify specific type parameters for the constraints when we use them, e.g.
This example seems very complicated and redundant. It could have been written like this.
|
|
In fact, it is not. This involves the problem of generic derivation, which we will explain in detail later.
In generic constraints, we can also declare self-referential constraints. For example, the following example.
|
|
Mutual Constraints
The Go language not only supports defining type parameters in constraints, but also supports cross-referencing of constraints. The purpose is to solve more complex problems in practice, such as graph theory problems.
Taking graph theory as an example, if we want to write a series of graph theory algorithms, then we need both Edge
and Node
types.
- The
Node
type needs to implement theEdges() []Edege
method - The
Edge
type needs to implement theNodes() (Edge, Edge)
method
A graph can be represented as a []Node
. This is enough to implement the graph theory algorithm. The following code is a bit more cerebral, so please read it carefully against the comments.
|
|
The above is only the declaration part, continue to see the calling part below. First, define the specific structure.
The graph algorithm is then called.
Now let’s analyze the initialization process of the type parameters.
First, the type parameters Node
and Edge
of Graph
are replaced with *Vertex
and *FromTo
, respectively. Then the compiler starts checking for type constraints. For the Node
type, the constraint is NodeConstraint[*FromTo]
, so the Edges() []*FromTo
method needs to be implemented. And *Vertex
does implement the Edges
method. For the Edge
type, its constraint is EdgeConstraint[*Vertex]
, so it needs to implement the Nodes() []*Vertex
method, which is obviously also implemented by *FromTo
. At this point, the constraint checking ends.
So, when we call g.ShortestPath(a, b)
, the type of edges
is []*FromTo
!
Some people may say that this way of writing is too complicated and brain-burning, it is perfectly possible to simplify NodeConstraint
and EdgeConstraint
to.
But this would require modifying the function definitions of Vertex
and FromTo
.
But the result is that calling the Edges
function returns an abstract []EdgeInterface
slice rather than a concrete []*FromTo
list.
Built-in constraints
comparable
Earlier we said that the Go language only supports performing operator operations on built-in types. The exceptions are two operators, ==
and ! =
.
These two operators allow comparing user-defined struct objects, so they need to be handled separately.
The inconsistency of Go’s design can be seen here. On the one hand, they don’t want to introduce operator overloading, which would greatly increase the complexity of the Go language; on the other hand, they have to support equality comparisons for struct types. To do so, they had to automatically insert the equality comparison function during compilation and then “overload” the
==
and! =
operators.
For this reason, the Go language introduces a separate comparable
constraint for this two-operator.
That is, if we want the types to support equal comparisons, we can write it as follows.
constraints
For the convenience of developers, Go has a built-in constraints
package that provides commonly used type constraints.
|
|
The above basically covers most of the contents of generic constraints, and we start discussing generic derivation below.
Generic derivation (Type inference)
We have briefly introduced generic inference earlier, and its purpose is to simplify the use of generic functions/types as much as possible and reduce unnecessary type parameter passing.
This section analyzes the functionality and design of generic derivation in detail. Before we start, let’s look at the effect of generic derivation.
The Go language supports type derivation in the following cases.
If a generic function/type is used without specifying all type parameters, the compiler will try to derive the missing type. If the derivation fails, a compilation error is reported.
Go uses so-called type unification for type derivation. However, the original text is rather abstract, so this article focuses on concrete examples to illustrate how type derivation works.
Type unification
Assimilation is a comparison of two types to see if they are equivalent. Whether two types can be equivalent depends on.
- Whether the knot structure is the same, e.g.
[]int
is equivalent to[]T
(ifT
matchesint
), and not equivalent tomap[int]T
. - Whether the untyped variables have the same underlying type, e.g.
map[T]int
is not equivalent tomap[T]string
, but[]MyInt
is equivalent to[]int
.
For example, T1
and T2
are type parameters, and []map[int]bool
can be assimilated to the following types.
[]map[int]bool
T1
(T1
matches[]map[int]bool
)[]T1
(T1
matchesmap[int]bool
)[]map[T1]T2
(T1
matchesint
,T2
matchesbool
)
But []map[int]bool
is not equivalent to the following types.
int
struct{}
[]struct{}
[]map[T1]string
Function argument type inference
The function argument inference is divided into two stages.
The first stage skips all real parameters without type constants matching once. If there are still type arguments that have not been determined, the second stage will begin. At this point, all untyped constants need to be set to their corresponding default types, and then matched again. The same type parameter may be matched more than once, and if the result of multiple matches does not match, a compilation error will be reported.
Returning to the example we mentioned earlier.
|
|
can be simplified to Print([]int{1,2,3})
. Since the type of T
is not specified, the compiler performs type derivation.
The compiler compares the real reference type []int
with the formal reference type []T
. By definition of assimilation, the type of T
can only be int
, which gives the actual type of T
.
So the final function call is Print[int]([]int{1,2,3})
.
To analyze a more complex example.
The derivation process is as follows.
- Compare
[]int
and[]F
and infer thatF
is of typeint
- compare
strconv.Itoa(int) string
andfunc(F) T
, inferring thatF
isint
andT
isstring
(F
is matched twice, but both areint
) - The final inferred call is
Map[int,string]([]int{1,2,3}, strconv.Itoa)
All of the above entries have explicit types, and the cases where the entry has or does not have a type constant are discussed below.
For NewPair(1,2)
.
The first stage skips all untyped constants, so the type of T
is not deduced. The second stage sets the default type to int
for 1
and 2
. So the type parameter F
corresponds to int
and the function call is inferred as NewPair[int](1,2)
.
For NewPair(1,int64(2))
.
The first stage of the derivation ignores the untyped constant 1
. Since the second argument is of type int64
, it is inferred that the argument to F
is int64
. So the final function call is NewPair[int64](1,2)
.
For NewPair(1,2.5)
.
The first stage of the derivation ignores untyped constants. The second stage first sets 1
and 2.5
to default types int
and float64
. Then match from left to right. For parameter 1
it is confirmed that F
is int
and for parameter 2.5
it is determined that F
is float64
. The two results are not the same, so an error is reported.
After the type derivation is complete, the compiler still performs constraint checks and parameter type checks.
Constraint type inference
We said earlier that type parameter constraints can also use type parameters. For such structured constraints, we can infer their actual constraints from other type parameters or constraints.
The derivation algorithm is also rather verbose, so here are a few examples.
Example of element type constraint
Suppose we have the following functions.
With the above definition, if the function is called like the following.
The derived type of v1
is actually []int
, not MySlice
as we would like. Because the compiler replaced MySlice
with the underlying type []int
when comparing MySlice
and []E
, it deduces that E
is int
.
In order for Double
to return the type MySlice
normally, we rewrite the function as follows
The calling code needs to be changed to look like this.
|
|
We can also let the compiler automatically derive the type.
|
|
First, the compiler performs a function argument type derivation. At this point it is necessary to compare MySlice
with S
, but S
uses structural constraints, so its actual constraint type needs to be derived.
To do this, the compiler constructs a mapping table.
|
|
Then, the compiler expands the S
constraint, expanding SC[E]
into []E
. Since we previously documented the mapping of S
to MySlice
, we can compare []E
to MySlice
. And since the underlying type of MySlice
is []int
, it follows that the type of E
is int
.
|
|
Then, we replace all the E
s in the constraint with int
s to see if there are any more indeterminate type parameters. There are none left, so the derivation is over. So the original call is deduced as
|
|
The result returned is still MySlice
.
Example of pointer method constraints
Suppose we wish to convert a set of strings into a set of other types of data.
|
|
Let’s look at the calling code (which is faulty and does not compile).
The above code does not compile properly. The reason is that we specified the type T
as Settable
, but the type Settable
does not implement the Set(string)
method. It is type *Settable
that implements that method.
So we change the calling code to.
This time it compiles fine, but running the code gives an error again. This is because in FromStrings
, result[i]
is of type *Settable
and the value is nil
, so the assignment *p = *Settable(i)
cannot be performed.
So, for the FromStrings
defined above, we can neither set T
to Settable
, which would lead to a compilation error, nor to *Settable
, which would lead to a runtime error.
To implement FromStrings
, we need to specify both Settable
and *Settable
types, which requires the structuring constraint.
|
|
The calling code is as follows.
|
|
Repeating Settable
twice feels a bit silly and can be simplified to
The whole process of compiler derivation is like this. First, the mapping table is constructed from the known types.
|
|
Then replace T
with Settable
and expand all structured parameters. So the type Setter2[T]
of PT
is expanded to *T
and added to the mapping table.
|
|
Then replace all T
with Settable
to get.
|
|
This is the end of the derivation, the actual function call is FromStrings2[Settable,*Settable]([]string{"1", "2"})
.
Constraint checks after derivation
In this example, the compiler can derive the type of PT
as *Unsettable
. After the derivation, the compiler continues to check the Setter2
constraint. But *Unsettable
does not implement the Set
method, so it will report a compilation error.
Summary
The above is the main content of Go generic design, the main points are as follows.
- Functions can define type parameters:
func F[T any](p T) { ... }
. - Types can be specified in parameter declarations and function bodies using type parameters.
- Type parameters can also be specified in type definitions:
type M[T any] []T
. - Type parameters must specify type constraints:
func F[T Constraint](p T) { ... }
. - Using generic functions or types requires specifying type parameters.
- Specifying type parameters can be reduced by generic derivation, simplifying the use of generics.