In this article, we will explain how to implement a zero-overhead async trait in Rust using GAT, using a series of RocksDB-like iterators as an example. The code in this article requires nightly Rust to compile.
We will implement two iterators.
- TestIterator: Produces an ordered sequence of KVs.
- ConcatIterator: stitch together sequences of multiple iterators.
Example: TestIterator::new(0, 5)
will produce the following sequence.
1
2
3
4
5
|
key_00000 -> value_00000
key_00001 -> value_00001
key_00002 -> value_00002
key_00003 -> value_00003
key_00004 -> value_00004
|
ConcatIterator::new(vec![TestIterator::new(0, 5), TestIterator::new(5, 7)])
will generate:
1
2
3
4
5
6
7
|
key_00000 -> value_00000
key_00001 -> value_00001
key_00002 -> value_00002
key_00003 -> value_00003
key_00004 -> value_00004
key_00005 -> value_00005
key_00006 -> value_00006
|
Define async trait
KvIterator
is a trait that will be given to all iterator implementations. the user can call .next()
to move the iterator to the next position.
1
2
3
4
|
pub trait KvIterator {
/// Get the next item from the iterator.
async fn next(&mut self) -> Option<(&[u8], &[u8])>;
}
|
Executing the compilation will give an error.
1
2
3
4
5
6
7
8
9
10
|
error[E0706]: functions in traits cannot be declared `async`
--> src/kv_iterator.rs:5:5
|
5 | async fn next(&mut self) -> Option<(&[u8], &[u8])>;
| -----^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| `async` because of this
|
= note: `async` trait functions are not currently supported
= note: consider using the `async-trait` crate: https://crates.io/crates/async-trait
|
The Rust compiler does not support the async
trait function by default, and the compiler prompts for the async-trait
crate, which unfortunately is not zero overhead. After using async-trait
, the trait will be rewritten in the following form.
1
2
3
4
5
6
7
8
9
10
11
12
|
#[async_trait]
pub trait KvIterator {
/// Get the next item from the iterator.
async fn next(&mut self) -> Option<(&[u8], &[u8])>;
}
/// ... will be rewritten to
pub trait KvIterator {
/// Get the next item from the iterator.
fn next(&mut self) -> Box<dyn Future<Output = Option<(&[u8], &[u8])>>>;
}
|
Here there are two layers of overhead.
- The overhead of dynamic scheduling
dyn Future
. This means that the next function of all iterators is more difficult to do some compiler optimization.
- Memory allocation overhead
Box
. In KV storage, next
should be a function that is called often. trait has been rewritten in a new form by async-trait, and each call to .next
requires a new object on the heap. This can have a significant impact on the performance of the program.
How can we implement async trait with zero overhead? That’s where GAT comes in.
Writing async trait with GAT
The compiler does not support async traits, essentially because the .next
function of the impl KvIterator
returns a different type of Future
. This problem can be solved simply by using associated type.
1
2
3
4
5
6
|
pub trait KvIterator {
type NextFuture: Future<Output = Option<(&[u8], &[u8])>>;
/// Get the next item from the iterator.
fn next(&mut self) -> Self::NextFuture;
}
|
This introduces a problem: &'lifetime [u8]
needs to have a lifecycle, how should this lifecycle be written? Rationally, &'lifetime
is the same as &mut self
lifecycle of next
, so it should be a generic of NextFuture
itself. How do you express this in Rust? Obviously this requires generic associated type. With the GAT compile option turned on.
1
2
3
4
5
6
|
pub trait KvIterator {
type NextFuture<'a>: Future<Output = Option<(&'a [u8], &'a [u8])>>;
/// Get the next item from the iterator.
fn next(&mut self) -> Self::NextFuture<'_>;
}
|
The compiler reported another error.
1
2
3
4
5
6
7
8
9
10
|
error: missing required bound on `NextFuture`
--> src/kv_iterator.rs:4:5
|
4 | type NextFuture<'a>: Future<Output = Option<(&'a [u8], &'a [u8])>>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-
| |
| help: add the required where clause: `where Self: 'a`
|
= note: this bound is currently required to ensure that impls have maximum flexibility
= note: we are soliciting feedback, see issue #87479 <https://github.com/rust-lang/rust/issues/87479> for more information
|
Why?
NextFuture
is returned by the next
function, whereas a normal implementation of the next
function can only return a reference with the same life cycle as &mut self
. Rust’s trait can be implemented on a reference (e.g., impl <'a> Iterator for &'a mut SomeIterator
). If Self
(in the above example, &'a mut SomeIterator
) itself has a shorter lifetime than this reference, it would not be possible to return such a NextFuture
.
So here, we need to add a where Self: 'a
to make sure that Self
has a lifespan at least as long as 'a
of NextFuture
.
In older versions of the Rust compiler, this is not reported as an error without Self: 'a
, but in some odd places. It’s a good thing that the compiler pointed this out directly here.
1
2
3
4
5
6
7
8
|
pub trait KvIterator {
type NextFuture<'a>: Future<Output = Option<(&'a [u8], &'a [u8])>>
where
Self: 'a;
/// Get the next item from the iterator.
fn next(&mut self) -> Self::NextFuture<'_>;
}
|
That’ll get it to compile!
Implementing TestIterator
First, quickly write the framework for TestIterator
.
1
2
3
4
5
6
7
8
9
10
|
pub struct TestIterator {
idx: usize
}
impl KvIterator for TestIterator {
type NextFuture = /* */;
fn next(&mut self) -> Self::NextFuture<'_> {
}
}
|
Two problems have been encountered here.
- How should I write the logic inside
next
? next
returns a Future, not the common async fn
.
- The answer is simple: use
async move
to return a closure.
- Since
next
returns a function, how do you write the type of NextFuture
? As we all know, Rust functions can’t be written with types.
- Here we have to turn on a feature that lets the compiler automatically derive the specific type of
Future
.
1
2
3
4
5
6
7
8
9
10
11
12
13
|
#![feature(generic_associated_types)]
#![feature(type_alias_impl_trait)]
pub struct TestIterator {
idx: usize,
}
impl KvIterator for TestIterator {
type NextFuture<'a> = impl Future<Output = Option<(&'a [u8], &'a [u8])>>;
fn next(&mut self) -> Self::NextFuture<'_> {
async move { Some((b"key".as_slice(), b"value".as_slice())) }
}
}
|
This way, it will pass compilation. Implement the logic inside TestIterator
.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
|
pub struct TestIterator {
idx: usize,
to_idx: usize,
key: Vec<u8>,
value: Vec<u8>,
}
impl TestIterator {
pub fn new(from_idx: usize, to_idx: usize) -> Self {
Self {
idx: from_idx,
to_idx,
key: Vec::new(),
value: Vec::new(),
}
}
}
impl KvIterator for TestIterator {
type NextFuture<'a>
where
Self: 'a,
= impl Future<Output = Option<(&'a [u8], &'a [u8])>>;
fn next(&mut self) -> Self::NextFuture<'_> {
async move {
if self.idx >= self.to_idx {
return None;
}
// Zero-allocation key value manipulation
self.key.clear();
write!(&mut self.key, "key_{:05}", self.idx).unwrap();
self.value.clear();
write!(&mut self.value, "value_{:05}", self.idx).unwrap();
self.idx += 1;
Some((&self.key[..], &self.value[..]))
}
}
}
|
To test that the TestIterator is working properly.
1
2
3
4
5
6
7
8
9
10
11
|
#[tokio::test]
async fn test_iterator() {
let mut iter = TestIterator::new(0, 10);
while let Some((key, value)) = iter.next().await {
println!(
"{:?} {:?}",
Bytes::copy_from_slice(key),
Bytes::copy_from_slice(value)
);
}
}
|
Run a test, as expected, success!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
|
running 1 test
b"key_00000" b"value_00000"
b"key_00001" b"value_00001"
b"key_00002" b"value_00002"
b"key_00003" b"value_00003"
b"key_00004" b"value_00004"
b"key_00005" b"value_00005"
b"key_00006" b"value_00006"
b"key_00007" b"value_00007"
b"key_00008" b"value_00008"
b"key_00009" b"value_00009"
test test_iterator::tests::test_iterator ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
|
Implementing ConcatIterator
The logic of ConcatIterator
is also very simple: record which iterator is being accessed now, and just return the contents of the child iterator directly.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
|
pub struct ConcatIterator<Iter: KvIterator> {
iters: Vec<Iter>,
current_idx: usize,
}
impl<Iter: KvIterator> ConcatIterator<Iter> {
pub fn new(iters: Vec<Iter>) -> Self {
Self {
iters,
current_idx: 0,
}
}
}
impl<Iter: KvIterator> KvIterator for ConcatIterator<Iter> {
type NextFuture<'a>
where
Self: 'a,
= impl Future<Output = Option<(&'a [u8], &'a [u8])>>;
fn next(&mut self) -> Self::NextFuture<'_> {
async move {
loop {
if self.current_idx >= self.iters.len() {
return None;
}
let iter = &mut self.iters[self.current_idx];
match iter.next().await {
Some(item) => {
return Some(item);
}
None => {
self.current_idx += 1;
}
}
}
}
}
}
|
However, it’s not that simple. The compiler reported an error.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
|
error[E0502]: cannot borrow `self.iters` as immutable because it is also borrowed as mutable
--> src/concat_iterator.rs:28:40
|
28 | if self.current_idx >= self.iters.len() {
| ^^^^^^^^^^^^^^^^ immutable borrow occurs here
...
31 | let iter = &mut self.iters[self.current_idx];
| ---------- mutable borrow occurs here
32 | match iter.next().await {
33 | Some(item) => return Some(item),
| ---------- returning this value requires that `self.iters` is borrowed for `'1`
...
37 | }
| - return type of async block is Option<(&'1 [u8], &[u8])>
error[E0499]: cannot borrow `self.iters` as mutable more than once at a time
--> src/concat_iterator.rs:31:33
|
31 | let iter = &mut self.iters[self.current_idx];
| ^^^^^^^^^^ `self.iters` was mutably borrowed here in the previous iteration of the loop
32 | match iter.next().await {
33 | Some(item) => return Some(item),
| ---------- returning this value requires that `self.iters` is borrowed for `'1`
...
37 | }
| - return type of async block is Option<(&'1 [u8], &[u8])>
|
What’s going on here? Unfortunately, this is a flaw in Rust’s current borrow checker NLL. Even if the code makes logical sense, the borrow checker doesn’t think it does.
What can we do about this? Let’s try to avoid it in two ways.
Option 1: Change the Borrow Checker
Polonius is a brand new borrow checker. Enable it directly with the flag.
1
|
RUSTFLAGS="-Z polonius" cargo build
|
The compilation passes.
Polonius uses a different algorithm than the current Rust borrow checker NLL, and can handle more cases where a race condition does not actually occur, but cannot currently be compiled. The Rust program that Polonius can compile is a superset of NLL.
Option 2: Storing the key value inside the structure
We cache the kv pair returned by the lower level iterator inside ConcatIterator
, which also passes compilation. Unfortunately, this gives us an extra copy of .next()
, which is a bit less “zero overhead”.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
|
pub struct ConcatIterator<Iter: KvIterator> {
iters: Vec<Iter>,
key: Vec<u8>,
value: Vec<u8>,
current_idx: usize,
}
impl<Iter: KvIterator> ConcatIterator<Iter> {
pub fn new(iters: Vec<Iter>) -> Self {
Self {
iters,
current_idx: 0,
key: Vec::new(),
value: Vec::new(),
}
}
}
impl<Iter: KvIterator> KvIterator for ConcatIterator<Iter> {
type NextFuture<'a>
where
Self: 'a,
= impl Future<Output = Option<(&'a [u8], &'a [u8])>>;
fn next(&mut self) -> Self::NextFuture<'_> {
async move {
loop {
if self.current_idx >= self.iters.len() {
return None;
}
let iter = &mut self.iters[self.current_idx];
match iter.next().await {
Some((key, value)) => {
self.key.clear();
self.key.extend_from_slice(key);
self.value.clear();
self.value.extend_from_slice(value);
break Some((self.key.as_slice(), self.value.as_slice()));
}
None => {
self.current_idx += 1;
}
}
}
}
}
}
|
Option 3: Refactor KvIterator
trait
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
pub trait KvIterator {
/// The return type of `next`.
type KvIteratorNextFuture<'a>: Future<Output = ()>
where
Self: 'a;
/// Move the iterator to the position of the next key.
fn next(&mut self) -> Self::KvIteratorNextFuture<'_>;
/// Get the current key.
fn key(&self) -> &[u8];
/// Get the current value.
fn value(&self) -> &[u8];
/// Check if the iterator is exhausted.
fn is_valid(&self) -> bool;
}
|
We don’t use the Rust-style iterator implementation: .next
moves only the iterator position; key
, value
returns the content; is_valid
checks if we’re at the end. This also bypasses the lifecycle issue.
For simplicity of implementation, let’s run a unit test using option 2.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
|
#[tokio::test]
async fn test_iterator() {
let iter1 = TestIterator::new(0, 5);
let iter2 = TestIterator::new(5, 10);
let mut concat_iter = ConcatIterator::new(vec![iter1, iter2]);
while let Some((key, value)) = concat_iter.next().await {
println!(
"{:?} {:?}",
Bytes::copy_from_slice(key),
Bytes::copy_from_slice(value)
);
}
}
|
The result is correct!
1
2
3
4
5
6
7
8
9
10
11
12
|
running 1 test
b"key_00000" b"value_00000"
b"key_00001" b"value_00001"
b"key_00002" b"value_00002"
b"key_00003" b"value_00003"
b"key_00004" b"value_00004"
b"key_00005" b"value_00005"
b"key_00006" b"value_00006"
b"key_00007" b"value_00007"
b"key_00008" b"value_00008"
b"key_00009" b"value_00009"
test concat_iterator::tests::test_iterator ... ok
|
Summary
Using the generic_associated_types and type_alias_impl_trait traits, we can easily implement the async trait manually and avoid the memory allocation and dynamic scheduling overhead of the async-trait
crate. However, there are several problems with this.
- requires nightly Rust
- Cannot be recursive (consider async-recursion crate)
- can’t be made dyn type directly (can be done manually with type gymnastics trick)