Rust - Accessing the Id Manager from multiple threads

Previously published

This article was previously published on len-learns-rust.com. A full index of these articles can be found here.

Since I now understand a little about how to share data between threads, I can try to use my Id Manager from multiple threads.

Following the same pattern as I’ve been using with the other threading code, something like this might work…

    #[test]
    fn test_channel_thread_with_id_manager() {
        let id_manager = Arc::new(IdManager::<u8>::new(ReuseSlow));
        let shared_manager = Arc::clone(&id_manager);

        let data = Arc::new(Mutex::new(HashMap::<String, Id<u8>>::new()));
        let shared_data = Arc::clone(&data);

        let mut thread = ChannelThread::new(move |message| {

            let id = shared_manager.allocate_id();

            println!("got message {} - {}", message, id.value());

            shared_data.lock().expect("failed to lock data").insert(message, id);

            return true;
        });

However, the compiler doesn’t like that, and, for once, the error message doesn’t immediately point me in the direction of a simple fix…

Compiler says "no!"
error[E0521]: borrowed data escapes outside of closure
  --> src/lib.rs:99:13
   |
91 |         let shared_data = Arc::clone(&data);
   |             ----------- `shared_data` declared here, outside of the closure body
...
99 |             shared_data.lock().expect("failed to lock data").insert(message, id);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

During the development of the Id Manager I came up against a couple of issues when I was adding a SmartId. This struct acted as an RAII wrapper around the Id and would manage the lifetime of an Id and deal with returning it to the manager when the SmartId was dropped. I fixed these issues with explicit lifetime parameters and a std::sync::Mutex<> and then realised that I had stumbled upon the “Interior Mutability Pattern”.

In the code above, the std::sync::Mutex<> allows us to access the id manager as mutable in both threads and the std::sync::Arc allows us to manage any potential lifetime issues with the manager itself when used from different threads, but there’s obviously something else wrong here.

It shows how good the compiler errors usually are that this is the first time I’ve come across one that neither provides an immediate solution nor gives enough information in the message to find a fix simply by looking up the error or googling.

This one was hard. I initially spent time trying to work out what was wrong with the HashMap<>, but the issue is more subtle: it’s not the map itself but what’s being stored in it that is causing the problem. Eventually, through trial and error, I switched to storing the u8 value of the id in the map and the problem went away.

The error is actually saying that it’s the lifetime of the Id that is the problem. Looking at the SmartId it soon becomes clear what the issue is. The SmartId that I have here, which is the latest version I had when I started writing this piece, looks like this:

pub struct SmartId<'a, T: IdType> {
    manager: &'a Mutex<IdManager<T>>,
    id: T,
    we_own_id: bool,
}
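To make the lifetime relationship concrete, here is a minimal single-threaded sketch of a borrowed, RAII-style id. It is not the real IdManager code: `TinyManager`, `BorrowedId` and `borrowed_demo` are hypothetical names invented for this illustration, and the "manager" is just a counter plus a list of returned ids.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the article's IdManager: hands out u8 ids
// and records the ids that have been handed back.
struct TinyManager {
    next: u8,
    released: Vec<u8>,
}

impl TinyManager {
    fn new() -> Self {
        TinyManager { next: 0, released: Vec::new() }
    }
}

// Borrowed handle: the 'a lifetime ties it to the manager it came from,
// so it can never outlive that manager.
struct BorrowedId<'a> {
    manager: &'a Mutex<TinyManager>,
    id: u8,
}

impl<'a> BorrowedId<'a> {
    fn allocate(manager: &'a Mutex<TinyManager>) -> Self {
        let mut guard = manager.lock().expect("failed to lock manager");
        let id = guard.next;
        guard.next += 1;
        BorrowedId { manager, id }
    }

    fn value(&self) -> u8 {
        self.id
    }
}

impl Drop for BorrowedId<'_> {
    // RAII: hand the id back to the manager when the handle goes away.
    fn drop(&mut self) {
        let mut guard = self.manager.lock().expect("failed to lock manager");
        guard.released.push(self.id);
    }
}

// Allocates two ids, lets them drop, and returns the list of released ids.
fn borrowed_demo() -> Vec<u8> {
    let manager = Mutex::new(TinyManager::new());
    {
        let first = BorrowedId::allocate(&manager);
        let second = BorrowedId::allocate(&manager);
        assert_eq!(first.value(), 0);
        assert_eq!(second.value(), 1);
        // Both handles are dropped here, in reverse declaration order.
    }
    let released = manager.lock().expect("failed to lock manager").released.clone();
    released
}

fn main() {
    println!("released: {:?}", borrowed_demo());
}
```

Within one scope this works well: the handles cannot escape the block in which the manager lives. The same `'a` parameter is what gets in the way once an id has to be stored somewhere that another thread can reach.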

This struct has the explicit lifetime and the lockable IdManager that solved our problems when accessing the id from a single thread. In C++ I would probably have just had a bare reference to the IdManager and relied on “the programmer being sensible” to ensure that the SmartId never outlived the associated manager. Rust is sensible enough to know that that approach relies on the programmer not being an idiot, now, or any time in the future. It’s fragile and unreliable and a potential source of hard to find bugs. So Rust doesn’t allow it. We had to tell the compiler that the lifetime of the Id depended on the lifetime of the manager and so allow it to make sure that a future version of me didn’t break the code by mistake. This is a very good thing.

It isn’t good enough for multithreaded use though. An Id that borrows its manager can only live as long as that borrow, and a map that is shared with another thread is somewhere the Id could outlive it. We now need to apply the same transformation that we needed to access data from multiple threads: make the manager a reference counted object and have the SmartId hold a counted reference to the manager that has allocated it…

So, we need this change.

pub struct SmartId<T: IdType> {
    manager: Arc<Mutex<IdManager<T>>>,
    id: T,
    we_own_id: bool,
}

The id is now responsible for keeping the manager alive, so the lifetime issue is resolved. Of course, we also have to adjust the rest of the code that was previously using a Mutex<IdManager<T>> to use an Arc<Mutex<IdManager<T>>> for this to compile and that includes changing the ThreadSafeIdManager<> to this:

pub struct ThreadSafeIdManager<T: IdType> {
    manager: Arc<Mutex<IdManager<T>>>,
}
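As a rough illustration of why holding an Arc removes the lifetime problem, the sketch below allocates ids on a worker thread, stores them in a shared map, and releases them on the calling thread. It uses hypothetical stand-ins (`TinyManager`, `OwnedId`, `owned_demo`) and a plain `std::thread::spawn` rather than the real IdManager and ChannelThread, so treat it as a sketch of the idea and not as the actual implementation.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for the article's IdManager.
struct TinyManager {
    next: u8,
    released: Vec<u8>,
}

// Owning handle: there is no lifetime parameter, so the id keeps its
// manager alive and can be stored in a map shared with another thread.
struct OwnedId {
    manager: Arc<Mutex<TinyManager>>,
    id: u8,
}

impl OwnedId {
    fn allocate(manager: &Arc<Mutex<TinyManager>>) -> Self {
        let mut guard = manager.lock().expect("failed to lock manager");
        let id = guard.next;
        guard.next += 1;
        OwnedId { manager: Arc::clone(manager), id }
    }
}

impl Drop for OwnedId {
    // Return the id to the manager, whichever thread drops the handle.
    fn drop(&mut self) {
        let mut guard = self.manager.lock().expect("failed to lock manager");
        guard.released.push(self.id);
    }
}

// Allocates ids on a worker thread, then releases them on the calling thread.
fn owned_demo() -> Vec<u8> {
    let manager = Arc::new(Mutex::new(TinyManager { next: 0, released: Vec::new() }));
    let data = Arc::new(Mutex::new(HashMap::<String, OwnedId>::new()));

    let worker_manager = Arc::clone(&manager);
    let worker_data = Arc::clone(&data);
    let worker = thread::spawn(move || {
        for name in ["a", "b", "c"] {
            let id = OwnedId::allocate(&worker_manager);
            worker_data
                .lock()
                .expect("failed to lock data")
                .insert(name.to_string(), id);
        }
    });
    worker.join().expect("worker thread panicked");

    // Emptying the map on this thread drops every id, which returns it.
    data.lock().expect("failed to lock data").clear();

    // HashMap drop order is unspecified, so sort before returning.
    let mut released = manager.lock().expect("failed to lock manager").released.clone();
    released.sort();
    released
}

fn main() {
    println!("released: {:?}", owned_demo());
}
```

The trade-off is that every id now carries a reference count bump and every allocate or release takes the lock, which is the runtime cost of moving the check out of the borrow checker.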

With the new IdManager code from here we can successfully compile and run our multithreaded code using the IdManager and we can safely pass Ids from one thread to another. In the code below we allocate the Ids in the thread and then store them in a shared map and deallocate them in the main thread.

    #[test]
    fn test_channel_thread_with_id_manager() {
        let id_manager = Arc::new(IdManager::<u8>::new(ReuseSlow));
        let shared_manager = Arc::clone(&id_manager);

        let data = Arc::new(Mutex::new(HashMap::<String, Id<u8>>::new()));
        let shared_data = Arc::clone(&data);

        let mut thread = ChannelThread::new(move |message| {

            let id = shared_manager.allocate_id();

            println!("got message {} - {}", message, id.value());

            shared_data.lock().expect("failed to lock data").insert(message, id);

            return true;
        });

        for i in 1..15 {
            println!("sending {} to thread", i);
            thread.send(i.to_string());
        }

        println!("ids: {}", id_manager.dump());

        println!("close channel, signal thread we're done");

        thread.shutdown();

        println!("wait for thread to end");

        thread.join();

        println!("ids: {}", id_manager.dump());

        {
            let data = data.lock().expect("failed to lock data");

            for named_id in data.iter() {
                println!("id: {} - {}", named_id.0, named_id.1.value());
            }
        }

        {
            let mut data = data.lock().expect("failed to lock data");

            let keys: Vec<String> = data.keys().cloned().collect();

            for key in keys {
                if let Some(id) = data.remove(&key) {
                    println!("id: {} - {}", key, id.value());

                    println!("ids: {}", id_manager.dump());
                }

            }
        }

        println!("ids: {}", id_manager.dump());

        println!("all done...");
    }

Since we now have the IdManager using an Arc internally there’s no need for the additional Arc in the code that uses it, so we can change the code to this:

    #[test]
    fn test_channel_thread_with_id_manager() {
        let id_manager = IdManager::<u8>::new(ReuseSlow);
        let shared_manager = id_manager.clone();

        let data = Arc::new(Mutex::new(HashMap::<String, Id<u8>>::new()));
        let shared_data = Arc::clone(&data);

        let mut thread = ChannelThread::new(move |message| {

            let id = shared_manager.allocate_id();

            println!("got message {} - {}", message, id.value());

            shared_data.lock().expect("failed to lock data").insert(message, id);

            return true;
        });

But to do that we need to implement the Clone trait for the Id Manager.

#[derive(Clone)]
pub struct ThreadSafeIdManager<T: IdType> {
    manager: Arc<Mutex<IdManager<T>>>,
}
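It may help to see what a derived Clone does on a wrapper of this shape. In the sketch below, `SharedCounter` and `clone_demo` are hypothetical names and a plain counter stands in for the real IdManager; the point is only that cloning the wrapper clones the Arc, so both handles talk to the same underlying state.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical wrapper with the same shape as ThreadSafeIdManager,
// holding a plain counter instead of the real IdManager.
#[derive(Clone)]
struct SharedCounter {
    inner: Arc<Mutex<u8>>,
}

impl SharedCounter {
    fn new() -> Self {
        SharedCounter { inner: Arc::new(Mutex::new(0)) }
    }

    // Returns the current value and then increments the shared counter.
    fn next(&self) -> u8 {
        let mut guard = self.inner.lock().expect("failed to lock counter");
        let value = *guard;
        *guard += 1;
        value
    }

    // Number of wrapper handles currently sharing the counter.
    fn handles(&self) -> usize {
        Arc::strong_count(&self.inner)
    }
}

// Returns the value seen through each handle and the handle count.
fn clone_demo() -> (u8, u8, usize) {
    let original = SharedCounter::new();
    let copy = original.clone(); // clones the Arc, not the counter
    let first = original.next();
    let second = copy.next(); // sees the increment made through `original`
    (first, second, original.handles())
}

fn main() {
    println!("{:?}", clone_demo());
}
```

One detail worth knowing: on a generic struct, `#[derive(Clone)]` adds a `T: Clone` bound to the generated impl even though only the Arc is cloned. For integer id types such as u8 that bound is satisfied anyway.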

When I first approached Rust I found the whole concept of lifetimes and mutability a little complex. I immediately appreciated the safety, but I couldn’t quite see how it could work in practice. These experiments with threading have helped me to see that, whilst the borrow checker and lifetimes can provide very strong compile-time guarantees about the code that you write, it’s sometimes necessary to move the checking and control to runtime: Arc to manage a lifetime explicitly, or Mutex to control explicitly how mutable access is granted. Both of these come with a runtime cost and should only be used when necessary.

Join in

The code can be found here on GitHub. Each step on the journey will have one or more separate directories of code, so this article’s code is here and the updated IdManager is here; this allows for easy comparison of changes at each stage.

Of course, there may be a better way; leave comments if you’d like to help me learn.