Zero-cost abstraction is one of Rust’s most important design goals: it gives Rust the expressive power of a high-level language without a runtime performance penalty. Its cornerstones are generics and traits, which let the compiler translate high-level syntax into efficient low-level code at compile time. This article introduces traits: first how they are used, then an analysis of three common problems that illustrates how traits are implemented.
Usage
Basic Usage
The main purpose of a trait is to abstract behavior, similar to “interfaces” in other programming languages. Here is an example to illustrate the basic use of trait.
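A minimal sketch of such an example follows. Only the names Greeting and print_greeting_static come from this article; the structs Cat and Dog, the method greeting, and the helper print_greeting_dynamic are hypothetical names chosen for illustration.

```rust
// Hypothetical example: a trait abstracting a "greeting" behavior.
trait Greeting {
    fn greeting(&self) -> &str;
}

struct Cat;
struct Dog;

impl Greeting for Cat {
    fn greeting(&self) -> &str {
        "Meow!"
    }
}

impl Greeting for Dog {
    fn greeting(&self) -> &str {
        "Woof!"
    }
}

// Static dispatch: the concrete type G is known at compile time.
fn print_greeting_static<G: Greeting>(g: G) {
    println!("{}", g.greeting());
}

// Dynamic dispatch: the concrete type is resolved at runtime through a vtable.
fn print_greeting_dynamic(g: Box<dyn Greeting>) {
    println!("{}", g.greeting());
}

fn main() {
    print_greeting_static(Cat);
    print_greeting_static(Dog);
    print_greeting_dynamic(Box::new(Cat));
    print_greeting_dynamic(Box::new(Dog));
}
```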
In the above code, a trait Greeting is defined and two structs implement it. Depending on how the function is called, there are two main ways to use it.
- Static dispatch based on generics
- Dynamic dispatch based on trait objects
The concept of generics is relatively common, so here we focus on trait objects.
A trait object is an opaque value of another type that implements a set of traits. The set of traits is made up of an object-safe base trait plus any number of auto traits.
More importantly, a trait object is a Dynamically Sized Type (DST): its size is not known at compile time. It can therefore only be accessed indirectly through a pointer, commonly in the form of Box<dyn Trait>, &dyn Trait, etc.
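The following sketch illustrates the point about pointers. The trait Shape and the structs Square and Circle are hypothetical names used only here. It shows a trait object behind both Box and a plain reference, and checks that such a reference occupies two machine words instead of one.

```rust
use std::mem::size_of;

// Hypothetical trait and types, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// A slice of boxed trait objects can mix different concrete types.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // `let s: dyn Shape = Square(1.0);` would not compile: the size of
    // `dyn Shape` is unknown at compile time, so it must sit behind a pointer.
    let boxed: Box<dyn Shape> = Box::new(Square(2.0));
    let circle = Circle(1.0);
    let borrowed: &dyn Shape = &circle;
    println!("{} {}", boxed.area(), borrowed.area());

    // A reference to a trait object is twice as large as a thin reference.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<&Square>());
    println!("{}", total_area(&[boxed, Box::new(Circle(1.0))]));
}
```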
Static dispatch
In Rust, generics are implemented through monomorphization: at compile time the compiler generates a separate version of a generic function for each concrete type it is called with, which is why generics are also described as type parameters. The advantage is that there is no virtual-call overhead; the disadvantage is a larger final binary. In the above example, print_greeting_static would be compiled into the following two versions.
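As a rough sketch of what monomorphization produces, the two generated versions can be pictured as ordinary non-generic functions. The names print_greeting_static_cat and print_greeting_static_dog are hypothetical (the compiler uses mangled symbol names), as are the structs Cat and Dog.

```rust
trait Greeting {
    fn greeting(&self) -> &str;
}

struct Cat;
struct Dog;

impl Greeting for Cat {
    fn greeting(&self) -> &str {
        "Meow!"
    }
}

impl Greeting for Dog {
    fn greeting(&self) -> &str {
        "Woof!"
    }
}

// Roughly what is generated for print_greeting_static::<Cat>:
// the call to `greeting` is resolved directly to Cat's implementation.
fn print_greeting_static_cat(g: Cat) {
    println!("{}", g.greeting());
}

// Roughly what is generated for print_greeting_static::<Dog>.
fn print_greeting_static_dog(g: Dog) {
    println!("{}", g.greeting());
}

fn main() {
    print_greeting_static_cat(Cat);
    print_greeting_static_dog(Dog);
}
```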
Dynamic dispatch
Not every call site can determine the concrete type at compile time. A common scenario is event callbacks in GUI programming: an event may have more than one callback registered, and those callbacks are not known at compile time. Generics do not apply here, so dynamic dispatch is needed.
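The callback scenario can be sketched as follows. Button, on_click and click are hypothetical names and do not belong to any real GUI library; the point is only that the callbacks are stored as boxed trait objects and invoked through dynamic dispatch.

```rust
// Hypothetical widget that stores callbacks whose concrete closure types
// are unknown to the widget itself.
struct Button {
    callbacks: Vec<Box<dyn Fn(&str) -> String>>,
}

impl Button {
    fn new() -> Self {
        Button { callbacks: Vec::new() }
    }

    // Every closure has its own anonymous type; boxing it as a trait object
    // lets closures of different types live in the same Vec.
    fn on_click<F>(&mut self, callback: F)
    where
        F: Fn(&str) -> String + 'static,
    {
        self.callbacks.push(Box::new(callback));
    }

    // Each call goes through the vtable of the stored trait object.
    fn click(&self, event: &str) -> Vec<String> {
        self.callbacks.iter().map(|cb| cb(event)).collect()
    }
}

fn main() {
    let mut button = Button::new();
    button.on_click(|event| format!("log: {}", event));
    button.on_click(|event| event.to_uppercase());
    for line in button.click("save") {
        println!("{}", line);
    }
}
```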
impl Trait
Rust 1.26 introduced a new way to use traits, impl Trait, which can appear in two places: function arguments and return values. Its main purpose is to simplify signatures that involve complex traits. It can be regarded as a special case of generics, because code using impl Trait is also statically dispatched. Note in particular that when impl Trait is used as a return type, the function can return only one concrete type.
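A short sketch of both positions, using hypothetical function names sum_all and evens:

```rust
// Argument position: behaves like a generic parameter `I: Iterator<Item = i32>`.
fn sum_all(values: impl Iterator<Item = i32>) -> i32 {
    values.sum()
}

// Return position: the caller only knows that "some iterator of u32" is
// returned, but the compiler still sees one single concrete type.
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

// The following would NOT compile, because the two branches return
// different concrete types:
//
// fn pick(reverse: bool) -> impl Iterator<Item = u32> {
//     if reverse { (0..3).rev() } else { 0..3 }
// }

fn main() {
    println!("{}", sum_all(vec![1, 2, 3].into_iter()));
    println!("{:?}", evens(7).collect::<Vec<u32>>());
}
```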
Advanced usage
Associated Types
In the basic usage described above, the argument and return types of a trait’s methods are fixed in the trait definition. Rust also provides a mechanism for “lazy binding” of types, called associated types, which lets a type be decided when the trait is implemented. A common example is the standard library’s Iterator, where the return value of next is Option<Self::Item>.
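As an illustration, here is a sketch of implementing Iterator for a hypothetical Countdown type. The associated type Item is chosen by the implementation, not by the trait definition.

```rust
// The standard library trait has (in abbreviated form) this shape:
//
// pub trait Iterator {
//     type Item;
//     fn next(&mut self) -> Option<Self::Item>;
//     // ... many provided methods
// }

// Hypothetical iterator that counts down to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The implementor decides what `Item` is.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining + 1)
        }
    }
}

fn main() {
    let values: Vec<u32> = Countdown { remaining: 3 }.collect();
    println!("{:?}", values);
}
```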
Associated types and generics serve a similar purpose, and an iterator trait could also be defined using a generic parameter.
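A sketch of what a generic version might look like. GenericIterator, next_item and Counter are hypothetical names and not part of the standard library. Because the item type is a trait parameter, the same type can implement the trait more than once, and callers have to say which implementation they mean.

```rust
// Hypothetical generic counterpart of Iterator.
trait GenericIterator<T> {
    fn next_item(&mut self) -> Option<T>;
}

struct Counter {
    current: u8,
}

// First implementation: yields numbers.
impl GenericIterator<u8> for Counter {
    fn next_item(&mut self) -> Option<u8> {
        self.current += 1;
        Some(self.current)
    }
}

// Second implementation for the same type: yields labels.
impl GenericIterator<String> for Counter {
    fn next_item(&mut self) -> Option<String> {
        self.current += 1;
        Some(format!("#{}", self.current))
    }
}

fn main() {
    let mut counter = Counter { current: 0 };
    // The call must be disambiguated because two implementations exist.
    let number = <Counter as GenericIterator<u8>>::next_item(&mut counter);
    let label = <Counter as GenericIterator<String>>::next_item(&mut counter);
    println!("{:?} {:?}", number, label);
}
```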
They differ mainly in that:
- A specific type (like Cat above) can implement a generic trait multiple times. For example, for From<T>, there can be both impl From<&str> for Cat and impl From<String> for Cat.
- A trait with associated types can be implemented only once per type. For example, for FromStr there can only be one impl FromStr for Cat. Similar traits are Iterator and Deref.
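The difference described in the list above can be sketched as follows. Cat with a name field is a hypothetical type, and the error type String for FromStr is an arbitrary choice made for this example.

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq)]
struct Cat {
    name: String,
}

// A generic trait can be implemented several times for the same type,
// once per type argument.
impl From<&str> for Cat {
    fn from(name: &str) -> Self {
        Cat { name: name.to_string() }
    }
}

impl From<String> for Cat {
    fn from(name: String) -> Self {
        Cat { name }
    }
}

// A trait with an associated type can be implemented only once per type,
// so there is exactly one `Err` for Cat.
impl FromStr for Cat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.is_empty() {
            Err("a cat needs a name".to_string())
        } else {
            Ok(Cat { name: s.to_string() })
        }
    }
}

fn main() {
    let a = Cat::from("Tom");
    let b = Cat::from(String::from("Tom"));
    let c: Result<Cat, String> = "Tom".parse();
    println!("{:?} {:?} {:?}", a, b, c);
}
```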
Derive
In Rust, the derive attribute can be used to implement some common traits automatically, such as Debug and Clone. For user-defined traits, procedural macros can be written to support derive as well; for details, see How to write a custom derive macro?, which are not repeated here.
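For the built-in case, a small sketch with a hypothetical Point struct shows what derive provides without any hand-written impl blocks:

```rust
// The compiler generates the Debug, Clone and PartialEq implementations.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p.clone();        // from Clone
    println!("{:?}", p);      // from Debug
    println!("{}", p == q);   // from PartialEq
}
```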
Frequently Asked Questions
Upcast
For trait SubTrait: Base, it was not possible at the time of writing to convert &dyn SubTrait to &dyn Base (trait upcasting coercion has since been stabilized, in Rust 1.86). This limitation is related to the memory layout of trait objects.
In Exploring Rust fat pointers, the author transmutes the reference to the trait object into two usize and verifies that they are pointers to data and function virtual tables.
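That experiment can be reproduced with a sketch like the one below. Greeting, Cat and split_fat_pointer are hypothetical names. The code assumes that the first word of the reference is the data pointer and the second is the vtable pointer; this matches what current compilers do, but the layout of trait-object references is not a documented guarantee, so treat it as an exploration and not as portable code.

```rust
use std::mem;

trait Greeting {
    fn greeting(&self) -> &str;
}

struct Cat {
    lives: u8,
}

impl Greeting for Cat {
    fn greeting(&self) -> &str {
        "Meow!"
    }
}

// Reinterprets a trait-object reference as two machine words.
// ASSUMPTION: word 0 is the data pointer and word 1 is the vtable pointer.
fn split_fat_pointer(obj: &dyn Greeting) -> [usize; 2] {
    // The transmute only compiles because both types have the same size.
    unsafe { mem::transmute::<&dyn Greeting, [usize; 2]>(obj) }
}

fn main() {
    let cat = Cat { lives: 9 };
    let [data, vtable] = split_fat_pointer(&cat);
    println!("lives = {}", cat.lives);
    println!("data pointer   = {:#x}", data);
    println!("vtable pointer = {:#x}", vtable);
    println!("address of cat = {:#x}", &cat as *const Cat as usize);
}
```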
As you can see here, Rust uses a fat pointer to represent a reference to a trait object. It consists of a pointer to the data and a pointer to the vtable, much like an interface in Go.
Although a fat pointer is larger than a normal pointer (so, for example, it cannot be updated with a single-word atomic instruction), the benefits outweigh this cost:
- Traits can be implemented for existing types (e.g. blanket implementations).
- Calling a function through the vtable requires only one dereference, whereas in C++ the vtable pointer is stored inside the object, resulting in two dereferences for each function call, as shown in the following figure.
How does a vtable store the methods of different traits if the traits have inheritance relationships? In the current implementation, they are stored sequentially in a vtable, as follows.
As you can see, all the trait methods are placed together in order, with no indication of which trait each method belongs to. This is also what prevents upcasting. The community tracks this problem in RFC 2765, which interested readers can refer to. That solution is not discussed here; a more general workaround is to introduce an AsBase trait.
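A minimal sketch of the AsBase workaround follows. The trait names Base, SubTrait and AsBase come from the text; the method names base_name, sub_name and as_base, the struct Foo, and the helper base_name_of are hypothetical. The idea is that every type implementing Base knows how to hand out a &dyn Base to itself, and SubTrait requires that ability.

```rust
trait Base {
    fn base_name(&self) -> String;
}

// Helper trait that performs the conversion while the concrete type is known.
trait AsBase {
    fn as_base(&self) -> &dyn Base;
}

// Blanket implementation: any sized type that implements Base gets AsBase.
impl<T: Base> AsBase for T {
    fn as_base(&self) -> &dyn Base {
        self
    }
}

// SubTrait requires AsBase, so as_base is callable on &dyn SubTrait.
trait SubTrait: Base + AsBase {
    fn sub_name(&self) -> String;
}

struct Foo;

impl Base for Foo {
    fn base_name(&self) -> String {
        "base".to_string()
    }
}

impl SubTrait for Foo {
    fn sub_name(&self) -> String {
        "sub".to_string()
    }
}

// "Upcast" by asking the object for its Base view.
fn base_name_of(sub: &dyn SubTrait) -> String {
    let base: &dyn Base = sub.as_base();
    base.base_name()
}

fn main() {
    let foo = Foo;
    println!("{} {}", foo.sub_name(), base_name_of(&foo));
}
```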
Downcast
Downcasting converts a trait object back to its original concrete type, and Rust provides the Any trait to do this.
Most types implement Any; only those that contain non-static references do not. The concrete type can be determined at runtime by using type_id, as shown in the following example.
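A small sketch of this, with hypothetical helper functions describe and is_string built on the standard std::any API:

```rust
use std::any::{Any, TypeId};

// Tries a few concrete types in turn; downcast_ref returns None on mismatch.
fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {}", s)
    } else {
        "unknown".to_string()
    }
}

// Compares the runtime type id with the id of a known type.
fn is_string(value: &dyn Any) -> bool {
    value.type_id() == TypeId::of::<String>()
}

fn main() {
    println!("{}", describe(&42i32));
    println!("{}", describe(&String::from("hello")));
    println!("{}", describe(&1.5f64));
    println!("{}", is_string(&String::from("hello")));
}
```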
The key part of the above code is downcast_ref. In its implementation, when the type matches, the first pointer of the trait object reference (i.e., the data pointer) is converted by unsafe code into a reference to the concrete type.
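The idea can be sketched as a free function. my_downcast_ref is a hypothetical name, and this is a simplified illustration of the technique, not the standard library’s actual source: compare type ids, then cast away the vtable part of the pointer.

```rust
use std::any::{Any, TypeId};

// Simplified re-creation of the downcast idea (hypothetical helper).
fn my_downcast_ref<T: Any>(value: &dyn Any) -> Option<&T> {
    if value.type_id() == TypeId::of::<T>() {
        // SAFETY: the type ids match, so the data pointer really points to a T.
        // Casting the fat pointer to a thin `*const T` keeps only the data pointer.
        unsafe { Some(&*(value as *const dyn Any as *const T)) }
    } else {
        None
    }
}

fn main() {
    let number = 7i32;
    println!("{:?}", my_downcast_ref::<i32>(&number));
    println!("{:?}", my_downcast_ref::<String>(&number));
}
```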
Object safety
In Rust, not all traits can be used as trait objects; they need to satisfy certain conditions, collectively called object safety. The main points are as follows.
- The return type of a function cannot be Self (i.e., the current type). After an object is converted to a trait object, the original type information is lost, so Self can no longer be determined.
- Functions are not allowed to have generic parameters. Monomorphization would generate an unbounded number of versions of such a function, and they could not all be placed in the trait’s vtable.
- Traits cannot inherit from Sized, because Rust implements the trait for the trait object by default, generating code like the following.
    trait Foo {
        fn method1(&self);
        fn method2(&mut self, x: i32, y: String) -> usize;
    }

    // autogenerated impl
    impl Foo for TraitObject {
        fn method1(&self) {
            // `self` is an `&Foo` trait object.
            // load the right function pointer and call it with the opaque data pointer
            (self.vtable.method1)(self.data)
        }
        fn method2(&mut self, x: i32, y: String) -> usize {
            // `self` is an `&mut Foo` trait object
            // as above, passing along the other arguments
            (self.vtable.method2)(self.data, x, y)
        }
    }
If Foo inherited Sized, the trait object would also be required to be Sized. But a trait object is a DST, which is ?Sized, so the trait cannot inherit Sized.
For traits that are not object safe, it is best to modify them to be object safe; if that is not possible, a generic approach can be tried instead.
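The first two rules and the generic fallback can be sketched as follows. Duplicate, Counter and duplicate_and_map are hypothetical names. The trait breaks both rules, so it cannot be used as dyn Duplicate, but it still works as a generic bound.

```rust
// Not object safe: `duplicate` returns Self, and `map_with` is generic.
trait Duplicate {
    fn duplicate(&self) -> Self;
    fn map_with<F: Fn(i32) -> i32>(&self, f: F) -> i32;
}

struct Counter(i32);

impl Duplicate for Counter {
    fn duplicate(&self) -> Self {
        Counter(self.0)
    }

    fn map_with<F: Fn(i32) -> i32>(&self, f: F) -> i32 {
        f(self.0)
    }
}

// This would be rejected by the compiler (error E0038), because the trait
// cannot be made into an object:
//
// fn broken(value: &dyn Duplicate) {}

// The generic approach: static dispatch keeps the concrete type known,
// so both Self and the generic method are fine.
fn duplicate_and_map<T: Duplicate>(value: &T) -> (T, i32) {
    (value.duplicate(), value.map_with(|n| n + 1))
}

fn main() {
    let (copy, mapped) = duplicate_and_map(&Counter(41));
    println!("{} {}", copy.0, mapped);
}
```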
Summary
This article began by introducing traits as the basis of zero-cost abstraction. Traits make it possible to add new methods to existing types (which in effect solves the expression problem), to overload operators, and to program against interfaces. We hope the analysis in this article helps readers use traits with more confidence and stay calm in the face of compiler errors.