PEP 646 - Variadic Generics
Before introducing this PEP, we need some background, so let's first work our way through generics.
A generic is a feature that lets you define a function or class without fixing a specific type in advance; the type is specified at the time of use.
For dynamic languages like Python, since everything is an object reference, you can determine the type directly at the time of use.
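A minimal sketch of what this means in practice (the function name echo is only an illustration): the same untyped function works with whatever object it is handed.

```python
def echo(value):
    # No type is declared; whatever object is passed in is returned as-is.
    return value


number = echo(1)         # used with an int
text = echo("foo")       # used with a str
items = echo([1, 2, 3])  # used with a list
```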
But once type checking is involved, generics start to matter, because they allow problems to be found before the code runs. As an example:
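The listing for this example is not available here, so the following is only a sketch of the kind of function being discussed. The alias U and the body of max_1 are assumptions; only the name max_1 and the two failing calls come from the text.

```python
from typing import Union

# Assumed alias: a value that is either an int or a str.
U = Union[int, str]


def max_1(a: U, b: U) -> U:
    # Each parameter may independently be an int or a str.
    return max(a, b)


print(max_1(1, 2))          # both ints: fine
print(max_1("foo", "bar"))  # both strs: fine

try:
    max_1("foo", 1)  # nothing in the annotations forbids mixing the two types
except TypeError as exc:
    print("TypeError:", exc)
```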
In this example, the parameters can be either numbers or strings. But it is obvious that max_1("foo", 1) and max_1(1, "foo") raise errors at runtime because the two argument types differ. Yet mypy doesn't catch that.
In Python’s type system, generic type variables should be declared with TypeVar; once they are, the problem is exposed.
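A sketch of the TypeVar version, under the assumption that the type variable is constrained to int and str (the name max_2 is hypothetical). Because both parameters share one type variable, they must resolve to the same type.

```python
from typing import TypeVar

# A type variable constrained to int or str.
T = TypeVar("T", int, str)


def max_2(a: T, b: T) -> T:
    # a, b and the return value all share the single type chosen for T.
    return max(a, b)


largest_number = max_2(1, 2)
largest_text = max_2("foo", "bar")
# max_2("foo", 1)  # a type checker such as mypy should report an error for this call
```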
The T or U in the previous two examples are type variables, which I think of as reusable type aliases (they also make complex structures easier to express); since the parameters and the return value have the same type, the variable stands in for all of them. A TypeVar can be tied to a type through the bound argument, and it can also be made to support int and str as I did above. I’d still recommend a quick look at the official documentation to see how TypeVar is used, as it’s really important in generics.
And such variables can also be used as elements in a container, as an example:
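A minimal sketch, assuming a Union alias U and a hypothetical function describe; only the parameter name items is taken from the text.

```python
from typing import List, Union

# Assumed alias: an element is either an int or a str.
U = Union[int, str]


def describe(items: List[U]) -> List[str]:
    # Label every element with the name of its runtime type.
    return [f"{type(item).__name__}:{item}" for item in items]


result = describe([1, "foo", 2])
```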
In common cases, combining types with Union lets mypy understand that a program's arguments and return values can have several types; in the example above, items is a list whose elements are strings or numbers. Python’s built-in collection types (collections.abc.Collection) can hold elements of various types because they are generic classes. In real-world development we inevitably define classes of our own, and sometimes those custom classes need to support generics as well.
Think back to the earlier definition of generics: instead of fixing a specific type in advance, the type is specified at the time of use. We define such a class using typing.Generic:
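As an illustration, here is a sketch of a custom generic class. The Stack class and its methods are hypothetical, chosen only to show how typing.Generic is used.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """A container whose element type is chosen by the user of the class."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


int_stack = Stack[int]()  # the element type is specified at the time of use
int_stack.push(1)
int_stack.push(2)

str_stack = Stack[str]()
str_stack.push("foo")
```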
Okay, with the above set up, let’s get to the point.
As the PEP title says, this is about generics with a variable number of type parameters. While the TypeVar introduced before stands for a single type, this time we introduce TypeVarTuple, which stands for an indeterminate number of types.
Let’s look at an example to understand it.
In this example, the attributes key and values of Dict are both generic. That is, key can be an int or a str, values has no fixed length, and since TypeVarTuple is used without restricting the types, any types are allowed. Thanks to TypeVarTuple, type checking becomes a lot more flexible.
At the time of writing, mypy does not support this new feature yet, and running it reports an error:
"TypeVarTuple" is not supported by mypy yet
PEP 673 - Self Type
Self, as the name implies, is a type that refers to the enclosing class itself. Let's look at two examples of how this was commonly written in the past.
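A sketch of the two classes, reconstructed from the names mentioned in the text (Result, add_value, get, Node, next, previous). The remaining attributes and method bodies are assumptions.

```python
from typing import List, Optional


class Result:
    def __init__(self) -> None:
        self.values: List[int] = []

    def add_value(self, value: int) -> "Result":  # string refers to the class itself
        self.values.append(value)
        return self

    @classmethod
    def get(cls) -> "Result":
        return cls()


class Node:
    def __init__(self, data: int) -> None:
        self.data = data
        self.next: Optional["Node"] = None
        self.previous: Optional["Node"] = None


result = Result.get().add_value(1)
head, tail = Node(1), Node(2)
head.next, tail.previous = tail, head
```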
There are three places in this example where a type refers to its own class through a string:
- declaring that the instance method add_value returns a Result instance
- declaring that the class method get returns a Result instance
- declaring self.next and self.previous as None or Node instances when initializing a Node
You can’t write the class name directly here, because the class has not been created yet while its body is being declared. mypy understands such an annotation, but at runtime it fails with a NameError: name 'XXX' is not defined error.
In addition to using string definitions, there are three other methods, which I’ll briefly mention here without going into detail (because Python 3.11 solves this problem more perfectly).
- use ForwardRef for Python 3.8 and up.
- import from __future__ import annotations (originally planned to become the default in Python 3.10, but that change was postponed, so the import is still needed).
- use TResult = TypeVar("TResult", bound="Result") to bind to a TypeVar.
But these methods all handle inherited classes rather awkwardly. For example, with the string annotations above, NewResult inherits Result and also inherits its method annotations, which means that the return value of a method like NewResult.get is declared as an instance of Result. Of course, an isinstance check is still satisfied, but the annotation doesn’t really express self. (The TypeVar variant can express it, but only through verbose annotations on self and cls.) PEP 673 provides Self, which is the best solution.
At the time of writing, mypy does not support this new feature yet and reports an error when run.
PEP 675 - Arbitrary Literal String Type
Before we talk about LiteralString, it helps to understand Literal, which was introduced in Python 3.8. A literal is a value written out directly in source code, such as a string, number, list, dictionary, or boolean. typing.Literal declares that only the listed literal values are accepted (the values it can list are limited to simple ones such as ints, strings, bytes, booleans, enum members, and None).
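A small sketch of such a function (the function name is illustrative). All three calls run without error at runtime; the comments describe what a type checker such as mypy is expected to say.

```python
from typing import Literal


def accepts_only_four(x: Literal[4]) -> int:
    return x


first = accepts_only_four(4)       # OK
second = accepts_only_four(19)     # rejected by the type checker
third = accepts_only_four(2 + 2)   # rejected as well: computed, not written as 4
```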
Literal[4] means the argument must be exactly the literal 4, so the second call, which passes 19, is rejected. The third call passes 2 + 2, which evaluates to 4, but it is rejected too, because a literal has to be given directly and explicitly, not computed. The checker only sees that 4 and 2 + 2 are not the same literal, and rejects the call. This point is important for what follows, so make sure it is clear.
Back to the point, the motivation for this PEP comes from providing a more intuitive and general solution to the SQL injection problem. Let’s look at the example provided in the PEP.
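The sketch below is modeled on that scenario but is not the PEP's exact listing. The in-memory sqlite3 database, the table layout and the sample rows are assumptions added so that the snippet can run on its own.

```python
import sqlite3


def query_user(conn: sqlite3.Connection, user_id: str) -> list:
    # The user-supplied value is spliced straight into the SQL text.
    query = f"SELECT * FROM data WHERE user_id = {user_id}"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (user_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, "alice"), (2, "bob")])

normal = query_user(conn, "1")           # a well-formed id: one matching row
injected = query_user(conn, "1 OR 1=1")  # crafted input: the WHERE clause matches every row
```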
Under normal circumstances, user_id is a well-formed string, but since it may come from an untrusted external source, there is a security risk: the concatenated SQL statement can be made to do something else entirely. As an example:
Here I’m using the input function to indicate that user_id is being passed in from an external source. With the old str annotation the type checker sees no problem, but with the new LiteralString annotation the statement fails type checking.
That is because user_id is not written directly as a string literal but is computed by input. As the PEP title suggests, LiteralString can represent any literal string value, unlike the earlier typing.Literal, which can only list a few specific values and is too inflexible for this purpose.
Also, if the string is concatenated from several parts, all of the parts need to be literals.
The two examples above are rejected as well, because the second part, user_input, is not a literal value.
I think this feature mainly matters for f-strings; after all, input is rarely used.
PS: mypy doesn’t support this new feature yet, so it doesn’t throw errors where there are problems.
PEP 681 - Data Class Transforms
Type checking currently works well for everything in the standard library, including dataclasses. This PEP provides a way to tell type checkers that an ordinary class behaves like the standard library dataclasses, so that they can apply the same checks automatically. These behaviors include:
- synthesis of the __init__ method from the declared data fields.
- optional synthesis of the __eq__, __ne__, __lt__, __le__, __gt__ and __ge__ methods.
- support for the frozen parameter: static type checking will confirm the immutability of the fields.
- support for field specifiers (called field descriptors in earlier drafts of the PEP): static type checking will understand the properties of each field, such as whether the field provides a default value.
Before this PEP, when your project used libraries such as attrs, pydantic, or various ORMs (e.g. SQLAlchemy, Django), those libraries had to provide the corresponding type annotations for static type checking themselves; otherwise you had to write them yourself or find a way to silence the related checks. The PEP is designed to reduce this cost: through dataclass_transform, a library can easily support type checking at the decorator, class, and metaclass levels without writing additional annotations.
Personally, I think this PEP mainly helps library authors. Unless you reinvent the wheel in your own project with behavior similar to the dataclasses library, it has little direct impact on ordinary developers.
An example may make this easier to understand. Personally, I prefer to use attrs, and this is how I define a Model in my project (greatly simplified for the sake of the example).
I didn’t define the __init__ method, but attrs creates it and a series of related methods for me automatically. The call Model(1, 2) should not pass the type check (because title should be a string and I passed in an int).
Then install a version of attrs that doesn’t support this feature and run pyright (another static checking tool, mypy doesn’t support this PEP yet) to try it out.
pyright is rather silly here: it thinks that the constructor __init__ is not defined in this class. At this point attrs does not yet support the corresponding type annotations. For those interested, see the corresponding PR: Implement pyright support via dataclass_transforms
With a later attrs release, pyright understands the above usage of attrs and reports the correct error: the value passed for title is not a string.
PEP 655 - Marking individual TypedDict items as required or potentially-missing
TypedDict is a very useful type added in Python 3.8, so let’s cover it first. It’s common to define a complex dictionary type in everyday development, and if you want mypy to validate the key and value types of such a dictionary, you’ll probably need this.
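A sketch of the kind of dictionary annotation meant here, before TypedDict enters the picture. The function get_movie and its keys are hypothetical.

```python
from typing import Dict, List, Union


def get_movie() -> Dict[str, Union[str, int, List[str]]]:
    # Every value is declared as "a str, an int, or a list of str".
    return {"name": "Blade Runner", "year": 1982, "tags": ["sci-fi"]}


movie = get_movie()
```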
I tried to be as specific as possible in the above example, but since this dictionary has so many types of values, I have to use Union to string them together. That is still not precise enough for mypy. For example, suppose this logic is executed:
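A sketch of such logic, using a hypothetical get_movie function with a Union-typed return value. The code runs fine, but a checker cannot know which member of the Union a given key holds.

```python
from typing import Dict, List, Union


def get_movie() -> Dict[str, Union[str, int, List[str]]]:
    return {"name": "Blade Runner", "year": 1982, "tags": ["sci-fi"]}


movie = get_movie()
# A checker such as mypy should flag this line: as far as it can tell,
# movie["year"] might just as well be a str or a list.
next_year = movie["year"] + 1
```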
mypy reports an error for this logic.
So it is often impossible to specify the type of the return value precisely. TypedDict is the solution to this problem.
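A sketch with TypedDict; the class Movie, its keys and the function get_movie are hypothetical.

```python
from typing import List, TypedDict


class Movie(TypedDict):
    name: str
    year: int
    tags: List[str]


def get_movie() -> Movie:
    return {"name": "Blade Runner", "year": 1982, "tags": ["sci-fi"]}


# The checker now knows that "year" is an int, so this arithmetic is accepted.
next_year = get_movie()["year"] + 1
```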
TypedDict specifies the type of each key's value in a form similar to dataclass. This helps mypy to better understand the structure of the returned values, and thus to detect more type issues in the logic.
But before Python 3.11, it was all-or-nothing with respect to the declared keys: either every key had to be present, or (with total=False) none of them was required.
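A sketch of the two extremes; the class names and keys are hypothetical. The __required_keys__ and __optional_keys__ attributes show how each class is interpreted.

```python
from typing import TypedDict


class StrictMovie(TypedDict):  # total=True is the default: every key is required
    name: str
    year: int


class LooseMovie(TypedDict, total=False):  # every key may be missing
    name: str
    year: int


complete: StrictMovie = {"name": "Blade Runner", "year": 1982}
partial: LooseMovie = {"name": "Blade Runner"}  # fine for LooseMovie, an error for StrictMovie
```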
In the past, the only way to make some keys required and others optional was to use inheritance.
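A sketch of the inheritance workaround, with hypothetical class names: the required keys live in a base class and the optional ones in a total=False subclass.

```python
from typing import TypedDict


class MovieBase(TypedDict):  # required keys go in the base class
    name: str


class Movie(MovieBase, total=False):  # keys declared here may be missing
    year: int


without_year: Movie = {"name": "Blade Runner"}
with_year: Movie = {"name": "Blade Runner", "year": 1982}
```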
PEP 655 makes it possible to mark each key in a TypedDict explicitly as required or not by introducing Required[] and NotRequired[].
Of the PEPs above, PEP 655 is the only one that mypy supports so far.