Pin and Unpin in Rust

On February 28, 2019, Rust version 1.33 was released, adding new pinning APIs, mainly including these.

std::pin::Pin
std::marker::Unpin
std::marker::PhantomPinned
impl !Unpin for T

When I first encountered these concepts, I found them particularly convoluted and difficult to understand thoroughly. There are many articles on the Internet about Pin and Unpin, but I feel that none of them is particularly clear, and the std::pin module documentation is also hard to digest directly. So in this article I will try to sort out Pin and Unpin from the basics to the details, and hopefully it will help you understand these concepts.

Concept

Let’s look at the definition on the official documentation of Pin.

/// A pinned pointer.
///
/// This is a wrapper around a kind of pointer which makes that pointer "pin" its
/// value in place, preventing the value referenced by that pointer from being moved
/// unless it implements [`Unpin`].
#[stable(feature = "pin", since = "1.33.0")]
#[lang = "pin"]
#[fundamental]
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct Pin<P> {
    pointer: P,
}

#[stable(feature = "pin", since = "1.33.0")]
impl<P: Deref> Deref for Pin<P> {
    type Target = P::Target;
    fn deref(&self) -> &P::Target {
        Pin::get_ref(Pin::as_ref(self))
    }
}

#[stable(feature = "pin", since = "1.33.0")]
impl<P: DerefMut<Target: Unpin>> DerefMut for Pin<P> {
    fn deref_mut(&mut self) -> &mut P::Target {
        Pin::get_mut(Pin::as_mut(self))
    }
}

I’ll start by unpacking it at a high level. Pin is a smart pointer that wraps another pointer P and guarantees that the value P points to (which we call T) will never be moved, as long as T does not implement Unpin. The name is figurative: Pin is like a nail that holds T in place. That is why a pinned pointer is usually written as Pin<P<T>> (P is short for Pointer and T is short for Type). This definition is a bit confusing at first glance, so let’s underline a few points.

Pin itself is a smart pointer. Why? Because it implements Deref and DerefMut.
Pin can only wrap a pointer, not an ordinary type. For example, Pin<u32> would not make sense.
Pin “pins” T so that it cannot move, but whether this takes effect depends on whether T implements Unpin. Simply put, if T implements Unpin, the pinning effect is completely disabled, and Pin<P<T>> is then equivalent to P<T>.
Unpin is an auto trait, so the compiler implements it by default for every type whose fields are all Unpin. There are only a few exceptions, which are !Unpin: PhantomPinned (and any type that contains it), and the anonymous impl Future structures that the compiler generates when it desugars async/await.
So by default Pin<P<T>> has no pinning effect; it only takes effect for the !Unpin cases mentioned above.
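To make the third and fifth points concrete, here is a minimal sketch (the helper names swap_through_pin and demo are my own, not part of std). It assumes T: Unpin, and shows that in this case a Pin<&mut T> hands out &mut T in Safe Rust just like a plain &mut T would, so even std::mem::swap() works through it.

```rust
use std::pin::Pin;

/// Hypothetical helper: swaps two values through `Pin<&mut T>`.
/// It only compiles because of the `T: Unpin` bound.
fn swap_through_pin<T: Unpin>(a: &mut T, b: &mut T) {
    let pa: Pin<&mut T> = Pin::new(a);
    let pb: Pin<&mut T> = Pin::new(b);
    // `get_mut` is a safe method that exists only when `T: Unpin`.
    std::mem::swap(pa.get_mut(), pb.get_mut());
}

fn demo() -> (String, String) {
    let mut x = String::from("xxx");
    let mut y = String::from("yyy");
    swap_through_pin(&mut x, &mut y);

    // `DerefMut` is also available, because `String: Unpin`.
    let mut pinned = Pin::new(&mut x);
    pinned.push_str("!");
    (x, y)
}

fn main() {
    let (x, y) = demo();
    println!("x = {}, y = {}", x, y);
}
```

With a type that is !Unpin, neither Pin::new() nor get_mut() would be accepted by the compiler.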

If you are still a little confused after reading these points, don’t worry: we will go through them one by one. The first thing we need to figure out is what exactly a move is, and why we need to prevent it from happening in some cases.

What exactly is a move?

According to the official definition, a move is the process of transferring ownership. Let’s look at a familiar example.

fn main() {
  let mut s1 = String::from("Hello");
  let s2 = s1; // The ownership of s1 is transferred to s2, and here the move occurs
  // let s3 = s1; // s1's ownership has already been transferred, so it cannot be moved again: error[E0382]: use of moved value: `s1`
}

I won’t explain the basic concept here. What we need to figure out is what happens in the line let s2 = s1;.

[Figure: rust move]

Borrowing this diagram from the TRPL book: both variables s1 and s2 are allocated on the stack, and the string “Hello” is allocated on the heap, where the ptr field is a pointer to that string. When the move occurs, the compiler creates a new piece of memory on the stack for s2, copies the stack contents of s1 into s2 as is, and immediately invalidates the original s1.
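A small sketch can confirm this picture (the function name is made up for illustration). It records the address of the heap buffer before and after the move. Only the ptr/len/capacity triple on the stack is copied, so the heap address should be identical.

```rust
/// Hypothetical check: does the heap buffer keep its address across a move?
fn heap_ptr_survives_move() -> bool {
    let s1 = String::from("Hello");
    let heap_before = s1.as_ptr(); // address of the heap buffer owned by s1
    let s2 = s1; // move: only the stack part (ptr, len, capacity) is copied
    let heap_after = s2.as_ptr(); // s2 now owns the very same buffer
    heap_before == heap_after
}

fn main() {
    println!("same heap buffer after move: {}", heap_ptr_survives_move());
}
```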

Look at another example of a move.

fn main() {
    let mut x = String::from("xxx");
    let mut y = String::from("yyy");

    std::mem::swap(&mut x, &mut y);

    assert_eq!("yyy", &x);
    assert_eq!("xxx", &y);
}

We swap the contents behind two mutable references (&mut) with the std::mem::swap() function, and a move occurs here as well.

These two kinds of moves are very common in Rust and do not cause any problems at all. So what exactly is the situation that needs to prevent a move from happening?

There really is, and that is self-referential structures!

Move of Self-Referential Structs

A self-referential struct is a structure in which one member is a reference to another member. For example, like this.

struct Test<'a> {
    a: String,
    b: &'a String,
}

fn main() {
    let a = String::from("Hello");
    let _test = Test { a, b: &a };
//    let _test = Test { a, b: &a };
//  |                        -     ^^ value borrowed here after move
//  |                        |
//  |                        value moved here
}

But the fact is that there is no way to construct a self-referential structure like Test through Safe Rust, and Rust’s support for self-referential structures is still very imperfect. The only workaround is to use pointers.

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String, // Change to pointer
}

impl Test {
    fn new(txt: &str) -> Self {
        Test {
            a: String::from(txt),
            b: std::ptr::null(),
        }
    }

    fn init(&mut self) {
        let self_ref: *const String = &self.a;
        self.b = self_ref;
    }

    fn a(&self) -> &str {
        &self.a
    }

    fn b(&self) -> &String {
        unsafe {&*(self.b)}
    }
}

Let’s try the move of this self-referencing structure.

fn main() {
    let mut test1 = Test::new("test1");
    test1.init();
    let mut test2 = Test::new("test2");
    test2.init();

    println!("a: {}, b: {}", test1.a(), test1.b());
    // Use the swap() function to swap the two, where the move occurs
    std::mem::swap(&mut test1, &mut test2);
    test1.a = "I've totally changed now!".to_string();
    println!("a: {}, b: {}", test2.a(), test2.b());
}

Here is the printout.

a: test1, b: test1
a: test1, b: I've totally changed now!

Did you notice that something is wrong? Field b in the Test structure is a pointer to field a; it stores the address of field a on the stack. When swap() exchanges the two Test structures, the fields a and b of each are moved into the other’s memory area, but the values of a and b themselves are unchanged. That is, pointer b still holds the original address, and that address now belongs to the other structure! Not only is this no longer a self-referential structure, but, even scarier, this pointer could lead to more dangerous problems, which Rust must never allow! 👇 The following diagram can help you understand.

[Figure: Move of Self-Referential Structs]
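The same observation can be checked without dereferencing anything, by comparing addresses only. The sketch below repeats the Test struct in trimmed form; the helper is_self_referential and the function check are hypothetical names added for this illustration.

```rust
struct Test {
    a: String,
    b: *const String,
}

impl Test {
    fn new(txt: &str) -> Self {
        Test { a: String::from(txt), b: std::ptr::null() }
    }

    fn init(&mut self) {
        self.b = &self.a as *const String;
    }

    /// True while `b` still holds the address of this struct's own `a`.
    fn is_self_referential(&self) -> bool {
        std::ptr::eq(self.b, &self.a)
    }
}

/// Returns (both valid before swap, any valid after swap, test2.b points into test1).
fn check() -> (bool, bool, bool) {
    let mut test1 = Test::new("test1");
    test1.init();
    let mut test2 = Test::new("test2");
    test2.init();

    let before = test1.is_self_referential() && test2.is_self_referential();
    std::mem::swap(&mut test1, &mut test2);
    let after = test1.is_self_referential() || test2.is_self_referential();
    // The pointer now stored in test2 was computed for test1's stack slot.
    let crossed = std::ptr::eq(test2.b, &test1.a);
    (before, after, crossed)
}

fn main() {
    println!("{:?}", check());
}
```

Only raw addresses are compared here, so no unsafe is needed to see that both structs stop being self-referential after the swap.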

More critically, Rust’s Generator and async/await are built on self-referential structs. If this problem is not solved at the root, the foundation of Rust’s claim to memory safety is completely shaken.

For more on the principles of async/await, it is highly recommended to read these two books.

So let’s find the root cause of this problem and figure out how to fix it at the source!

What is the root cause?

We found that the most critical line of code in the above example is std::mem::swap(&mut test1, &mut test2); it is what broke our self-referential structure and caused the memory safety problem. So don’t we just need to avoid applying swap() to our self-referential structures? But how do we avoid it? Let’s look at the definition of the swap() function.

#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn swap<T>(x: &mut T, y: &mut T) {
    // SAFETY: the raw pointers have been created from safe mutable references satisfying all the
    // constraints on `ptr::swap_nonoverlapping_one`
    unsafe {
        ptr::swap_nonoverlapping_one(x, y);
    }
}

Its parameters must be mutable borrows (&mut), so all we need is a way to keep the mutable borrow from being exposed in Safe Rust!

Take Test as an example again. It has no way to stop others from borrowing it mutably, because we can easily get a mutable borrow directly with &mut Test{...}. What about Box<T> from the standard library? Ignoring its performance cost for now, let’s wrap the structure T in a Box and see whether Box can guarantee that &mut T is never exposed. Looking at the API documentation, unfortunately it cannot. Box::leak() returns &mut T, and what’s more, Box implements DerefMut, so even without leak() we can easily get &mut T by dereferencing a &mut Box<T>!

There is no need to look further: before Pin, the standard library really had no API that could keep &mut T from being exposed in Safe Rust.
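As a quick illustration of why Box alone is not enough, here is a minimal sketch (the function name is invented for this example). It reaches the boxed values through DerefMut and swaps them, entirely in Safe Rust.

```rust
/// Hypothetical demo: a plain `Box<T>` freely hands out `&mut T`.
fn swap_boxed_contents() -> (String, String) {
    let mut a: Box<String> = Box::new(String::from("in box a"));
    let mut b: Box<String> = Box::new(String::from("in box b"));
    // `&mut *a` goes through `DerefMut` and yields `&mut String`,
    // which is exactly what `swap` needs to move the contents.
    std::mem::swap(&mut *a, &mut *b);
    (*a, *b)
}

fn main() {
    let (a, b) = swap_boxed_contents();
    println!("a = {}, b = {}", a, b);
}
```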

So, it’s time for Pin to make an appearance!

We have found the root of the problem, and Pin solves that problem at the root. Now that this is clear, can we sum it up in one sentence: Pin is a smart pointer that won’t let you expose the mutable borrow &mut T in Safe Rust?

The answer is: not entirely. This is where the Pin concept initially leaves everyone flabbergasted. Let Pin answer the confusion itself. Pin says: “Don’t you want me to make sure that the T behind the pointer P<T> I wrap is always pinned and never allowed to move? I can promise that, but I have one principle: I never pin a friend who holds a pass, and that pass is Unpin. If you don’t have this pass, rest assured that I will nail you down for good!”

For example, say I am Pin and you are P<T>. If your T implements Unpin, I provide two ways for you to get &mut T in Safe Rust.

The first one: use Pin::get_mut()

impl<'a, T: ?Sized> Pin<&'a mut T> {
    #[stable(feature = "pin", since = "1.33.0")]
    #[inline(always)]
    pub fn get_mut(self) -> &'a mut T where T: Unpin {
        self.pointer
    }
}

The second one: I implement DerefMut, so you can dereference me to get &mut T

#[stable(feature = "pin", since = "1.33.0")]
impl<P: DerefMut<Target: Unpin>> DerefMut for Pin<P> {
    fn deref_mut(&mut self) -> &mut P::Target {
        Pin::get_mut(Pin::as_mut(self))
    }
}

But daddy rustc is too lenient: by default he issues a pass to all of your types (they all implement Unpin)! I’m almost out of a job!

#[lang = "unpin"]
pub auto trait Unpin {}

#[stable(feature = "pin", since = "1.33.0")]
impl<'a, T: ?Sized + 'a> Unpin for &'a T {}

#[stable(feature = "pin", since = "1.33.0")]
impl<'a, T: ?Sized + 'a> Unpin for &'a mut T {}

#[stable(feature = "pin_raw", since = "1.38.0")]
impl<T: ?Sized> Unpin for *const T {}

#[stable(feature = "pin_raw", since = "1.38.0")]
impl<T: ?Sized> Unpin for *mut T {}

The only thing I’m glad about is that he left me a little buddy named PhantomPinned. Never mind his strange name, he is my favorite right-hand man, because he is !Unpin!

#[stable(feature = "pin", since = "1.33.0")]
#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct PhantomPinned;

#[stable(feature = "pin", since = "1.33.0")]
impl !Unpin for PhantomPinned {}

Daddy rustc also said that if you want to “go straight” and give up Unpin, you can do so in two ways.

Use PhantomPinned. With it as a field, daddy rustc won’t implement Unpin for you

use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

Give yourself a manual impl !Unpin. The prerequisite is that you use the nightly toolchain and enable #![feature(negative_impls)]

#![feature(negative_impls)]
#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
}

impl !Unpin for Test {}
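If you want to check which of your types still hold the pass, one possible trick on stable Rust is a small compile-time probe. The sketch below is not part of std: the names NotUnpinFallback, Probe, Plain and Marked are all made up for illustration, and the probe relies on inherent associated constants taking priority over trait ones, so it only works for concrete types, not inside generic code.

```rust
use std::marker::{PhantomData, PhantomPinned};

#[allow(dead_code)]
struct Plain {
    a: String,
    b: *const String,
}

#[allow(dead_code)]
struct Marked {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

/// Fallback answer, available for every type.
trait NotUnpinFallback {
    const IS_UNPIN: bool = false;
}
impl<T: ?Sized> NotUnpinFallback for T {}

/// Probe type: the inherent constant below exists only when `T: Unpin`,
/// and inherent items are preferred over trait items when they apply.
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);

#[allow(dead_code)]
impl<T: ?Sized + Unpin> Probe<T> {
    const IS_UNPIN: bool = true;
}

fn plain_is_unpin() -> bool {
    <Probe<Plain>>::IS_UNPIN
}

fn marked_is_unpin() -> bool {
    <Probe<Marked>>::IS_UNPIN
}

fn main() {
    println!("Plain: Unpin = {}", plain_is_unpin());
    println!("Marked: Unpin = {}", marked_is_unpin());
}
```

Plain keeps the pass even though it contains a raw pointer, while a single PhantomPinned field is enough to take it away from Marked.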

If you meet either of the two conditions above, I’ll make sure you can’t get the mutable borrow &mut T in Safe Rust (go through my API if you don’t believe me). If you can’t get &mut T, you can’t be passed to std::mem::swap(), which means you’re nailed down by me! Do you think daddy rustc has enchanted me? You’re wrong, this is simply how I work! Thanks to the rich and powerful type system of the Rust world, the same goes for my brothers Sync and Send: none of us has any so-called magic!

Of course I still provide an unsafe get_unchecked_mut(). Whether or not you implement Unpin, you can get &mut T by calling this method, but you must abide by Pin’s contract (see below), otherwise you are responsible for whatever goes wrong!

impl<'a, T: ?Sized> Pin<&'a mut T> {
    #[stable(feature = "pin", since = "1.33.0")]
    #[inline(always)]
    pub unsafe fn get_unchecked_mut(self) -> &'a mut T {
        self.pointer
    }
}

Pin’s contract

For Pin<P<T>>,

If T implements Unpin, then Pin<P<T>> has no pinning effect on T at any point from the time it is wrapped by Pin until it is destroyed
If T is !Unpin, then T is guaranteed to stay pinned from the time it is wrapped by Pin until it is destroyed

With Pin’s self-introduction above, let’s summarize in one more sentence: if T implements Unpin, Pin lets you get &mut T in Safe Rust; otherwise it pins you in Safe Rust (that is, you cannot get &mut T).

Next we use Pin to fix the problem with the self-referencing structs above.

How to construct a Pin

First we need to sort out how to wrap P<T> in Pin, that is, how to construct a Pin. A look at the documentation will show that there are several main ways to do this.

Pin::new()

impl<P: Deref<Target: Unpin>> Pin<P> {
    #[stable(feature = "pin", since = "1.33.0")]
    #[inline(always)]
    pub fn new(pointer: P) -> Pin<P> {
        // Safety: the value pointed to is `Unpin`, and so has no requirements
        // around pinning.
        unsafe { Pin::new_unchecked(pointer) }
    }
}

You can safely call Pin::new() to construct a Pin if the T that your P points to is Unpin. You can see that under the hood it calls the unsafe Pin::new_unchecked(). Pin::new() is safe because the pinning effect of Pin does not apply when T is Unpin, so the Pin behaves like a normal pointer.
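A short usage sketch, assuming an Unpin pointee (u32 here; the function name is my own): the value can be wrapped with Pin::new(), modified through DerefMut, and even unwrapped again with Pin::into_inner(), all without unsafe.

```rust
use std::pin::Pin;

/// Hypothetical round trip: Box<u32> -> Pin<Box<u32>> -> Box<u32>.
fn new_and_unwrap() -> u32 {
    let boxed: Box<u32> = Box::new(41);
    // Allowed because the pointee `u32` is `Unpin`.
    let mut pinned: Pin<Box<u32>> = Pin::new(boxed);
    // `DerefMut` is available for the same reason.
    *pinned += 1;
    // `into_inner` also requires the pointee to be `Unpin`.
    let back: Box<u32> = Pin::into_inner(pinned);
    *back
}

fn main() {
    println!("{}", new_and_unwrap());
}
```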

Pin::new_unchecked()

impl<P: Deref> Pin<P> {
    #[stable(feature = "pin", since = "1.33.0")]
    #[inline(always)]
    pub unsafe fn new_unchecked(pointer: P) -> Pin<P> {
        Pin { pointer }
    }
}

This method is simple, but it is unsafe. It is marked unsafe because the compiler cannot guarantee that the user’s subsequent operations comply with Pin’s contract. Wherever the contract might be violated, the function must be marked unsafe, because upholding it is the user’s responsibility and the compiler cannot check it. If the user constructs a Pin<P<T>> with Pin::new_unchecked() and the Pin’s lifetime then ends while the underlying T still exists, later operations may still move T, causing memory unsafety.

use std::mem;
use std::pin::Pin;

fn move_pinned_ref<T>(mut a: T, mut b: T) {
    unsafe {
        let p: Pin<&mut T> = Pin::new_unchecked(&mut a);
        // This should mean the pointee `a` can never move again.
    }
    mem::swap(&mut a, &mut b);
    // The address of `a` changed to `b`'s stack slot, so `a` got moved even
    // though we have previously pinned it! We have violated the pinning API contract.
}

Other

Other ways include Box::pin(), Rc::pin() and Arc::pin(). Under the hood they all call the Pin::new_unchecked() shown above, so I won’t elaborate further.
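For completeness, here is a minimal sketch of these three constructors applied to a !Unpin type. The type Immovable and the helper functions are hypothetical. Note that none of the calls needs unsafe, and that shared access through Deref keeps working on a pinned value.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical `!Unpin` type used only for this illustration.
struct Immovable {
    value: u32,
    _marker: PhantomPinned,
}

fn make(value: u32) -> Immovable {
    Immovable { value, _marker: PhantomPinned }
}

fn pinned_sum() -> u32 {
    // Each constructor allocates on the heap and pins the value there.
    let b: Pin<Box<Immovable>> = Box::pin(make(1));
    let r: Pin<Rc<Immovable>> = Rc::pin(make(2));
    let a: Pin<Arc<Immovable>> = Arc::pin(make(3));
    // Reading through `Deref` is always allowed, pinned or not.
    b.value + r.value + a.value
}

fn main() {
    println!("{}", pinned_sum());
}
```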

Application of Pin

Pinning can be classified as on the stack or on the heap, depending on whether the T behind the pointer P lives on the stack or on the heap. For example, Pin<&mut T> typically pins a value on the stack and Pin<Box<T>> pins a value on the heap.

Pin to the stack

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

impl Test {
    fn new(txt: &str) -> Self {
        Test {
            a: String::from(txt),
            b: std::ptr::null(),
            _marker: PhantomPinned, // This makes our type `!Unpin`
        }
    }

    fn init<'a>(self: Pin<&'a mut Self>) {
        let self_ptr: *const String = &self.a;
        let this = unsafe { self.get_unchecked_mut() };
        this.b = self_ptr;
    }

    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
        &self.get_ref().a
    }

    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
        unsafe { &*(self.b) }
    }
}

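pub fn main() {
    // Shadow each binding with its pinned reference, so the unpinned
    // original can no longer be reached by name.
    let mut test1 = Test::new("test1");
    let mut test1 = unsafe { Pin::new_unchecked(&mut test1) };
    Test::init(test1.as_mut());

    let mut test2 = Test::new("test2");
    let mut test2 = unsafe { Pin::new_unchecked(&mut test2) };
    Test::init(test2.as_mut());

    println!("a: {}, b: {}", Test::a(test1.as_ref()), Test::b(test1.as_ref()));
    // This line does not compile: get_mut() requires `Test: Unpin`.
    std::mem::swap(test1.get_mut(), test2.get_mut());
    println!("a: {}, b: {}", Test::a(test2.as_ref()), Test::b(test2.as_ref()));
}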

We pinned &mut Test on the stack, then tried to call get_mut() to produce the arguments for std::mem::swap(), and found that it does not compile. The Rust compiler stops us from making this mistake at compile time.

  |     std::mem::swap(test1.get_mut(), test2.get_mut());
  |                          ^^^^^^^ within `Test`, the trait `Unpin` is not implemented for `PhantomPinned`
  |

Pin to the heap

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

impl Test {
    fn new(txt: &str) -> Pin<Box<Self>> {
        let t = Test {
            a: String::from(txt),
            b: std::ptr::null(),
            _marker: PhantomPinned,
        };
        let mut boxed = Box::pin(t);
        let self_ptr: *const String = &boxed.as_ref().a;
        unsafe { boxed.as_mut().get_unchecked_mut().b = self_ptr };

        boxed
    }

    fn a<'a>(self: Pin<&'a Self>) -> &'a str {
        &self.get_ref().a
    }

    fn b<'a>(self: Pin<&'a Self>) -> &'a String {
        unsafe { &*(self.b) }
    }
}

pub fn main() {
    let mut test1 = Test::new("test1");
    let mut test2 = Test::new("test2");

    println!("a: {}, b: {}",test1.as_ref().a(), test1.as_ref().b());
    // std::mem::swap(test1.get_mut(), test2.get_mut());
    // std::mem::swap(&mut *test1, &mut *test2);
    println!("a: {}, b: {}",test2.as_ref().a(), test2.as_ref().b());
}

Here Box::pin() is used to pin Test to the heap. Uncommenting either of the commented-out lines makes compilation fail, because Test is !Unpin.
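What you can still do is move the Pin<Box<Test>> handles themselves, because that only moves the pointers, never the Test values on the heap. The following sketch checks this by address; it repeats a trimmed Test, and is_self_referential and moved_handles_stay_valid are hypothetical helpers added for illustration.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Test {
    a: String,
    b: *const String,
    _marker: PhantomPinned,
}

impl Test {
    fn new(txt: &str) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(Test {
            a: String::from(txt),
            b: std::ptr::null(),
            _marker: PhantomPinned,
        });
        let self_ptr: *const String = &boxed.a;
        // SAFETY: we only write a field; the value never leaves its heap slot.
        unsafe { boxed.as_mut().get_unchecked_mut().b = self_ptr };
        boxed
    }

    /// True while `b` still holds the address of this struct's own `a`.
    fn is_self_referential(&self) -> bool {
        std::ptr::eq(self.b, &self.a)
    }
}

fn moved_handles_stay_valid() -> bool {
    let test1 = Test::new("test1");
    let test2 = Test::new("test2");
    // Moving the handles into a Vec and swapping them moves only pointers.
    let mut handles = vec![test1, test2];
    handles.swap(0, 1);
    handles.iter().all(|t| t.is_self_referential())
}

fn main() {
    println!("still self-referential: {}", moved_handles_stay_valid());
}
```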

Future

#[stable(feature = "futures_api", since = "1.36.0")]
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

Next, let’s talk about one of the most important applications of Pin at the moment: Future. When the Pin API was first introduced by the official async working group in 2018, the original intention was to solve the problem of self-references inside a Future. async/await is implemented on top of Generator, and a Generator is implemented as an anonymous struct. If an async function holds a reference across an await point, the underlying Generator holds a reference across a yield point, and the anonymous struct generated for that Generator is a self-referential struct! This self-referential struct then implements Future, and the async runtime needs a mutable borrow (that is, &mut Self) when it calls Future::poll() to query the state. If this &mut Self were not wrapped in Pin, a developer’s own Future implementation could use a function like std::mem::swap() to move the value behind &mut Self! That is why poll() in Future has to take Pin<&mut Self>.

// There is a cross-await borrowing in this async block!
let mut fut = async {
    let to_borrow = String::from("Hello");
    let borrowed = &to_borrow;
    SomeResource::some_task().await;
    println!("{} world!", borrowed);
};
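To see poll() and Pin<&mut Self> working together, here is a self-contained sketch that drives such a future by hand, without any runtime. Everything except the std APIs is an assumption made for illustration: YieldOnce stands in for SomeResource::some_task(), NoopWake is a do-nothing waker, and drive is a toy executor that simply polls twice.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing when woken.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl YieldOnce {
    fn new() -> Self {
        YieldOnce { yielded: false }
    }
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            // Writing through the Pin is fine here: YieldOnce is Unpin.
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Toy executor: returns (was the first poll Pending?, final output).
fn drive() -> (bool, String) {
    // The async block borrows `to_borrow` across an await, so its future is
    // self-referential; Box::pin keeps it at a fixed heap address.
    let mut fut = Box::pin(async {
        let to_borrow = String::from("Hello");
        let borrowed = &to_borrow;
        YieldOnce::new().await;
        format!("{} world!", borrowed)
    });

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // `as_mut()` produces the Pin<&mut _> that poll() requires.
    let first_pending = fut.as_mut().poll(&mut cx).is_pending();
    let out = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(s) => s,
        Poll::Pending => unreachable!("YieldOnce is ready on its second poll"),
    };
    (first_pending, out)
}

fn main() {
    println!("{:?}", drive());
}
```

Between the two polls, the borrowed reference inside the future stays valid precisely because the future cannot be moved out from behind its Pin.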

And of course there is one very important point not to forget! Pin only has a pinning effect on types that are !Unpin. Is this anonymous impl Future struct !Unpin? Of course it is. As mentioned earlier, there are only a few exceptions to the default Unpin, and this anonymous struct is one of them.

pub const fn from_generator<T>(gen: T) -> impl Future<Output = T::Return>
where
    T: Generator<ResumeTy, Yield = ()>,
{
    #[rustc_diagnostic_item = "gen_future"]
    struct GenFuture<T: Generator<ResumeTy, Yield = ()>>(T);

    // We rely on the fact that async/await futures are immovable in order to create
    // self-referential borrows in the underlying generator.
    impl<T: Generator<ResumeTy, Yield = ()>> !Unpin for GenFuture<T> {}

    impl<T: Generator<ResumeTy, Yield = ()>> Future for GenFuture<T> {
        type Output = T::Return;
        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
            // SAFETY: Safe because we're !Unpin + !Drop, and this is just a field projection.
            let gen = unsafe { Pin::map_unchecked_mut(self, |s| &mut s.0) };

            // Resume the generator, turning the `&mut Context` into a `NonNull` raw pointer. The
            // `.await` lowering will safely cast that back to a `&mut Context`.
            match gen.resume(ResumeTy(NonNull::from(cx).cast::<Context<'static>>())) {
                GeneratorState::Yielded(()) => Poll::Pending,
                GeneratorState::Complete(x) => Poll::Ready(x),
            }
        }
    }

    GenFuture(gen)
}

The key line is impl<T: Generator<ResumeTy, Yield = ()>> !Unpin for GenFuture<T> {}. Seeing it with your own eyes is the most convincing proof.

Other

Besides the above, Pin comes with several other concepts, such as pin projection, structural pinning and non-structural pinning, which I do not use much myself.

There are also many APIs related to Pin in futures-rs, so if you use futures-rs in depth, you will inevitably need to deal with Pin frequently.

Summary

The following eight points are excerpted from the official Async Book’s summary of Pin. They cover pretty much everything there is to the Pin API.

If T: Unpin (which is the default), then Pin<'a, T> is entirely equivalent to &'a mut T. In other words: Unpin means it’s OK for this type to be moved even when pinned, so Pin will have no effect on such a type.
Getting a &mut T to a pinned T requires unsafe if T: !Unpin.
Most standard library types implement Unpin. The same goes for most “normal” types you encounter in Rust. A Future generated by async/await is an exception to this rule.
You can add a !Unpin bound on a type on nightly with a feature flag, or by adding std::marker::PhantomPinned to your type on stable.
You can either pin data to the stack or to the heap.
Pinning a !Unpin object to the stack requires unsafe
Pinning a !Unpin object to the heap does not require unsafe. There is a shortcut for doing this using Box::pin.
For pinned data where T: !Unpin you have to maintain the invariant that its memory will not get invalidated or repurposed from the moment it gets pinned until when drop is called. This is an important part of the pin contract.
