Some time ago I drilled into How to master advanced TypeScript patterns this article. This is an earlier blog post by Pierre-Antoine Mills, author of ts-toolbelt. The article raises a challenging topic: How can TS write type support for collinear functions?
I did some practice with the original article, and then seemed to come to some knowledge about TS generics closer to the substance – from the collection perspective. Based on this knowledge, I found that most of the generalizations in the original article are written in a much tighter way. I thought this practice and thought process was worth documenting, hence this paper.
As with the original article, this article does not expand on the issues of currying or functional programming; currying is just material for discussing TS generic development. The clue to this article is the role of each generic type in my complete implementation, but that’s not the point, the point is the insight behind it – in the first half of the article, I often discuss a simple generic type at length, so please don’t skip it.
Propositions
Collocalization is an important concept in the field of functional programming, and represents the process of converting a multi-argument function into a function that “takes one argument at a time”, such as converting f(a,b,c)
to f(a)(b)(c)
. To give a more detailed example.
|
|
In its simplest form, the Currier can take only one argument at a time.
For advanced kriging, an indefinite number of parameters can be taken at a time.
Even residual parameters and placeholders can be supported.
The curried conversion function curry
is the core of currying. The conversion function takes a normal function toCurry
- temporarily denoted by the type Function
- and returns the curried function curried
(later also called curried function or curried function) - temporarily denoted by the type unknown
. - temporarily represented by the type unknown
. How to write a legal type expression for it is the main thread of this article.
CurriedV1: The simplest currying
The simplest, one-argument-at-a-time currying is implemented by me as follows.
|
|
Generic Length
The first generic Length
returns the length of a tuple.
|
|
The first thing to understand is that a type is a collection of objects . For example, the type number
is the set of all numbers, the type 1
represents a single set of elements consisting of the value 1
, and the type string[]
is the set of all “arrays where every item is a string”.
A generic, formally, is a function of a type (converting one type to another); from a collection perspective, it is a map of a collection (changing one collection to another).
The mapping of a set must be based on rules that act on the elements within the set. Suppose there are sets A and B. We can say that there is a mapping relationship between A and B only if any element in A, according to some rule, can be transformed into an element in B.
Since mapping can only be done from one set to another, how can we understand the case of multiple generic arguments? Answer: Think of multiple generic parameters as a tuple type.
Take Length
as an example.
Passing the type [unknown]
to Length
yields the type 1
, describing the fact that any of the countless elements belonging to the type [unknown]
, whether [1]
or ['foo']
or [{foo: true}]
, to which the length is applied, yields the result 1
. These elements are transformed into the value 1
by the rule Length
; in other words, the set represented by the type [unknown]
is mapped by the rule Length
to a single-element set containing only one element (that is, the value 1
), and the type of this set is the type 1
.
Passing the type unknown[]
to Length
yields the type number
, because any one of the countless elements of the type unknown[]
, whether []
or [1]
or ['foo', true]
, which is given a length, yields 0
, 1
or 2
, etc. etc., all of type number
. Note that Length
does not guarantee that the rule-transformed value occupies the mapped set: there is no array with length 0.5
or -1
. So the result of the generic operation length<unknown[]>
, number
, is actually a hyperset of the real-world mapped result set.
It is unavoidable that generics return a superset of the `real result set’, which is often broader than we would expect from a collection. From a collection perspective, the purpose of writing a generic type is to provide a superset of the “true result set” that can be described by type rules and at the same time as small as possible. It is important to understand this.
If JS supported unsigned integer types,
Length<unknown[]>
, it would seem to get the perfect result set, but this is really just a coincidence. More often than not, it is not possible to get a perfect result set: for example,Length<[unknown, . . unknown[]]>
needs to return a set “consisting of integers greater than 1”.
JavaScript: a priori knowledge
How does TS know that the result of Length<unknown[]>
is number
? Is there some principle between Length<unknown[]>
and number
that can still be understood? I think: there is no principle left, TS just gives a straightforward answer based on a priori (axiomatic) knowledge.
The type system of TS is tailored to JS: any JS literal is a single element type of TS; JS base types such as number
or string
also form the base types of TS; with a syntax similar to that used to define arrays, JSON objects, and functions, we can create array types and tuple types, object types, and function types to represent the types that contain the eligible TS is of course familiar with the habitat of all object types in JS - what member properties they have, how they transform into each other, etc. - this knowledge is a priori for TS, so TS can easily and correctly perform operations on base types.
Generic Head
The second generic Head
takes the type of the first element in the tuple T
.
|
|
Head
first determines whether T extends []
, if so, T
is a single-element set containing only the empty array and returns never
; otherwise, T
is not a single-element set of the empty array and may have the first element and returns T[0]
.
Why is there only the extends
keyword in the conditional generic, but not the equals
keyword or the ==
operator? My insight is that in set operations, only the inclusion operation can produce a yes
or no
result, while the other operations on sets: intersection, union, complement, mapping, they all result in another set. In the context of sets, A is contained in B, which means A is a subset of B. Switch to the context of types, which means A is a subclass of B, which means A inherits from B.
How can we tell that two types are identical? Just determine that they contain each other (inherit from each other).
To perform some tests on Head
.
The results of the first three tests are intuitive. The fourth one, when we pass a single-element set containing the empty array to Head
, yields a result of type never
, indicating the empty set, and there is nothing wrong with that either.
Let’s take another look at the second test: is the empty array an element of the string[]
set, please? Of course it is. So, in a real-world Head
mapping, to what elements is the empty array mapped?
In set theory, the premise of a mapping is that the mapping rules are valid for all elements in the source set. How should we understand Head
?
My understanding is that there is a never
object in TS that can’t be written out (not in JS), and a never
type that can be written out represents a single-element set containing a never
object. Also, any type that can be written out in TS implicitly contains never
objects, which makes any type that merges with a never
type get itself, thus making the otherwise single-element set of never
types on concept the empty set.
From this perspective, Head<string[]>
makes sense: the empty array elements in the string[]
set are mapped to never
, while the other elements are mapped to string
; since string | never
is still string
, it eventually returns string
.
Generic Tail
The third generic Tail
extracts the type of the tail items of the tuple T
(i.e., those remaining after the first item).
|
|
It’s a bit complicated.
Let’s start with a short version.
|
|
SimpleTail
is formally very similar to JS code: it uses the residual argument operator to extract the remainder of the tuple excluding the first item. A simple test is no problem either.
However, if you test it with string[]
, you get never
. That’s not quite right.
|
|
In the real world, almost all elements of a string[]
set (except for the empty array object) are meaningful for the tail operation. In fact, if we generalize by giving some examples, we must conclude that the result of taking the last term of string[]
is string[]
. However, according to the implementation of SimpleTail
: string[]
is indeed not [unknown, . . unknown[]]
, we can only return never
.
Let’s look at the formal version of Tail
.
|
|
- Branch 1: If
T
is a subset of the empty array single-element set, we can conclude thatT
can only be the empty array single-element set ornever
, which returnsnever
. - Branch 2: If
T
is a subset of “the set consisting of all “arrays with the first item””, we can conclude thatT
cannot contain empty array elements, and extract the type of the last item in a form similar to that inSimpleTail
. - Branch 3: If neither of the above is satisfied, we return
T
directly.
Passing string[]
to test this, we get the expected type: string[]
by branching 3.
|
|
Are you really sure? If T
does not satisfy branch 1 or branch 2, it must be a pure array type like string[]
or number[]
, and not anything else?
Let’s summarize the writing of array (including tuple) types: (we don’t care about the specific types of array items, we use unknown
instead)
- empty arrays:
[]
. - pure arrays:
unknown[]
; 3. - tuple:
[unknownA, unknownB, unknownC]
. - a tuple with remaining terms:
[unknownA, unknownB, . . unknownC[]]
.
After summarizing, we find that there are only 4 ways to write an array type, nothing else! The array types that can be written and the array types that can be calculated (returned by other generics) can all be grouped together in the end. These 4 ways of writing are the boundaries on how TS handles array types, in other words TS cannot produce array types that “cannot be combined out with these 4 ways of writing”.
Based on this knowledge, we are confident that only the second write method (pure array types) will make it to branch 3, and we can safely return direct T
in branch 3.
Note that there is a catch here. Consider the case of passing in a merge set.
|
|
According to set theory, the mapping of a concatenation should be done by mapping each of the individual sets that make up the concatenation, separately, and then taking the concatenation of multiple result sets as the final result.
Tail
does not disappoint us, it returns the expected type correctly. But this is conditional, the branch condition in the generic must satisfy the constraint Distribution condition type: i.e. the condition must be a generic argument directly extends
some type (in the form T extends SOMETYPE
), if we replace the first condition T extends []
in the Tail
implementation with Length<T> extends 0
, the constraint on the type of the distribution condition fails and the proposition “T
can only be one of these 4 ways” ceases to exist, and the building collapses.
Have you experienced a certain clumsiness of generic programming? The rules of set mapping (i.e., the semantics of generics) are based on elements within sets, but implementers of generics must answer the question “what set is after the mapping” based on the operations of the set itself. This needs to be summarized practically from a real-world perspective to guarantee the correctness and minimality of the mapping.
Types of conversion functions
Currently, the type of the curry conversion function curry
is defined as follows: takes an arbitrary function and returns unknown
.
We want to replace unknown
with a more refined type so that the user can get the correct type hint when using the result returned by curry
(i.e. the curry function).
Obviously, exactly what this “more refined type” is depends on the function that is passed in when calling curry
. We use generic constraints to extract the argument P
and the return type R
from the passed-in function.
|
|
Then, pass P
and R
to the Curried
generic type as the type of the curried function (i.e., the aforementioned “more refined type”).
Note that
Curry
is not a generic mapping, but simply a function type with generic constraints.
Generic type CurriedV1
CurriedV1
is the first version of the implementation of the Curried
generic type, which supports the simplest form of currying (consuming one argument at a time).
|
|
A generic type can be called recursively, and CurriedV1
is such that each time it calls itself recursively, the size of the tuple P
is reduced by one until it becomes an empty array, ending the recursion.
Test it out, it’s perfect.
|
|
You may ask: If you pass in an array type of infinite (unknown) length, like number[]
, will TS get stuck in a dead loop? Let’s try.
|
|
Instead of reporting an error or getting stuck in a dead loop, TS still maps a function type that can be called ad infinitum. So, we can conclude that progressive scaling down on recursion is not a necessary condition for generic recursion.
In fact, some inert mechanism of generics allows us to define types for things like loop reference objects in JS or functions that return themselves.
At this point, most of the “generalization from a collection perspective” insights have been stated. Next, I’ll speed things up a bit and finish up with the more advanced implementation of curried types.
CurriedV2: Allowing Indefinite Arguments
It would be easier to use if curried functions could take indefinite arguments (in the form of curried(1)(2,3)(4)
). My implementation is.
|
|
Generic Prepend
The generic Prepend
prefixes the tuple type with an item.
Note that Prepend
is not a conditional type and naturally does not satisfy the distribution condition type, so Prepend_Test3
is [1 | 2, . . 3[]]
instead of [1, . . 3[]] | [2, . . 3[]]
. If you want the latter, you can put the implementation of Prepend
inside the conditional type, as follows.
The subsequent discussion in this article assumes that all incoming types are non-discrete (i.e., non-concurrent in form) and does not discuss the issue of distributing conditional types.
The generic type Drop
The generic type Drop
is responsible for deleting the N
elements of the header from the tuple. Drop
is also recursive, deleting one element at a time recursively, while placing an unknown
into tuple T
. When the length of the tuple T
is equal to N
, enough elements have been removed to return the remaining elements.
Simply tested with no problems.
The key to Drop
is that an empty array, the third generic parameter T
, is used for counting.
Interestingly, a similar mechanism can be used to add and subtract integers.
1 2 3 4 5 6 7 8 9 10 11
type FromLength<N extends number, P extends unknown[] = []> = Length<P> extends N ? P : FromLength<N, Prepend<unknown, P>>; type Add<A extends number, B extends number, Res extends unknown[] = FromLength<A>, Count extends unknown[] = []> = Length<Count> extends B ? Length<Res> : Add<A, B, Prepend<unknown, Res>, Prepend<unknown, Count>>; type Sub<A extends number, B extends number, Res extends unknown[] = [], Count extends unknown[] = FromLength<B>> = Length<Count> extends A ? Length<Res> : Sub<A, B, Prepend<unknown, Res>, Prepend<unknown, Count>>; type Eight = Add<3, 5>; // => 8 type Four = Sub<9, 5>; // => 4
The generic PartialTuple
The story of the generic PartialTuple
starts with the official TS generic Partial
. We know that the Partial
generic can make all properties of an object type optional. When applied to an array, it makes each item of the array optional, e.g. Partial<[number, string]>
can be of type similar to [number?, string?]
.
We expect CurriedV2 to support indefinite arguments, so we need to extract the “first arbitrary term of a tuple” type from the tuple of indefinite arguments: for example, if the indefinite arguments are of type [1, 2, 3], then the indefinite arguments can be [1], [1, 2] or [1, 2, 3]. However, the current type operations of TS do not allow for such a mapping rule as “first arbitrary term of a tuple”, and Partial is the closest implementation (minimal superset).
Why do we need PartialTuple
again? Because the type converted by Partial
is no longer a tuple: properties like length
, map
, etc. become optional, which makes objects like {0: 'Hello'}
also in the set of Partial<[string]>
. The PartialTuple
excludes the elements that are not part of the tuple.
|
|
The original uses Partial
directly without reporting an error, which is a bug in TS: for Partial
after passing in a tuple type, it is inconsistent in determining whether it is still a tuple or not under different conditions. I submitted issue and Minimal Replication.
Pan type CurriedV2
CurriedV2
is somewhat similar to CurriedV1
in terms of framework.
The most important difference is that CurriedV2
introduces a generic constraint for the Curried function, so that each time it is called, the number of incoming arguments can be dynamically extracted and the type that should be returned from this call can be calculated accordingly.
As a simple test, we found that CurriedV2_Test1
cannot give the type of the Curried function straightforwardly, because the type is obtained after each step of the call and can only be determined (based on the parameters) at the time of the call.
CurriedV3: support for remaining arguments
Some functions split their arguments into two parts: fixed arguments and residual arguments. For example, toCurry
: after the first four fixed parameters, you can pass in any residual parameter of type 5
.
It would undoubtedly be better if Curried could support such functions: that is the goal of CurriedV3
. My implementation is.
|
|
The difference between CurriedV3
and CurriedV2
only is that the recursion ends under different conditions: CurriedV3
determines that P extends [unknown, . . unknown[]]
infers that P
still contains a fixed term, and continues the recursion; not satisfying this condition means that there are only remaining arguments in P
, and ends the recursion.
Thanks to the tight Drop
and the Tail
behind it – which handles both pure arrays and tuples with remaining items – the recursive part of CurriedV3
is identical to CurriedV2
.
If Drop
and Tail
don’t handle the more marginal aspects of the above well enough (e.g. returning never
or []
directly), CurriedV1
and CurriedV2
don’t suffer much, but the implementation of CurriedV3
is not so easy.
CurriedV4: Placeholder support
Placeholders in Curried can help us delay the timing of passing in parameters. For example.
This is the goal of CurriedV4
. My implementation is.
|
|
Generic Equal
The generic Equal
determines whether two types are exactly equal (note that it is still a set operation, true
and false
denote a single-element set containing a Boolean value).
Generic Item
The generic Item
extracts the possible types of array items from the array type.
The generic PlaceholderTuple
The generic PlaceholderTuple
is very similar to PartialTuple
in that it not only makes each item of the tuple optional, but also makes each item of the incoming type M
possible.
|
|
Generic Reverse
The generic Reverse
flips the head and tail of a tuple.
The generic Reverse
deserves a little expansion. Let’s look at the core part first (starting with T extends
): take the array type T
, call itself recursively, and each time recursively take the head element of T
and push it from the head into R
. When T
is exhausted, R
is naturally the flipped array.
For fixed-length tuple types, this is fine. But what if you want to flip an array type that is not of fixed length?
By simple induction in the real world, we know that Reverse<string[]>
should be string[]
and the mapping is still perfect; for Reverse<[string, . . number[]]>
, we can only map it to Array<string | number>
– as we said before, generics often return broader types than we expect, and this is unavoidable.
The first two lines of the Reverse
implementation (the non-core part) are used to handle the above two types of irregular-length arrays.
To test.
Generic Join
The generic Join
joins two tuple types “head-to-head”. Note that the first argument must be the tuple type of the fixed item.
|
|
Generic Concat
The generic Concat
concatenates two tuple types in order. Similarly, the first argument must be the tuple type of the fixed item.
Generic PlaceholderMatched
The generic PlaceholderMatched
finds the items of type M
in the tuple T
, then extracts the items of the corresponding position from the tuple S
, stores them in a new tuple R
in order, and finally returns them.
A bit of a mouthful. A quick look at the test shows exactly what PlaceholderMatched
does.
The generic type CurriedV4
Finally, the full body of the Curried function type, CurriedV4
.
The difference between CurriedV4
and CurriedV3
is in the recursive part. We use PlaceholderTuple<P, __>
to constrain the entry of the curried function so that the caller can pass in the placeholder constant __
.
In a single recursion, we extract the tuple type consisting of the PlaceholderMatched<T, P, __>
and concatenate it with the remaining arguments after this call consumes them (i.e. Drop<Length<T>, P>>
) as the new argument P
to be passed into the next recursion.
Test it out, perfect.
|
|
Summary
Although the discussion of collections in this article has focused on the first half, what prompted me to think about it was actually the practice of a few more advanced scenarios that followed. I found that it seemed clearer to apply the insights from these practices to the very first few simple generic types to make a statement.
In the original article, the base generic at the beginning is not very tight, for example, the Head
generic looks like this.
|
|
This results in Head<string[]>
returning never
, which is clearly not what it looks like from a collection perspective.
Many of the base types in the original text had edge use cases that were not handled properly, so as the problem became more complex, the generic implementation became less and less controllable. Later the original authors started introducing Cast
generic types to force back inaccurate types that were derived to the edge.
|
|
This led me to think about what these basic generic types should be implemented as. In repeated practice, I found that the code written by intuition was often not accurate enough, and for a moment, I realized that what I was missing was a set perspective; once I understood the essence of generic operations from a set perspective, I seemed to have a sense of clarity: what I could do and what I could not do, where I could compromise and where I could only give up, I could analyze them all with certainty.