Mar 15, 2026 · 4 min read

Managing Concurrency in AI Workflows: Parallel Branches and Shared State

How we handle concurrent workflow execution in Rust using tokio::spawn and RwLock, the limitations we've hit, and where we're going next.

Concurrency in Rust

When building an AI orchestration platform, you eventually hit the concurrency problem. If a workflow needs to search the web, query a database, and summarize a document, doing those sequentially wastes time. You need parallel branches.

But how do you manage the state of a complex, long-running AI workflow across multiple parallel executions?

In our system, we rely on a deterministic workflow engine built in Rust. Here is how we actually handle concurrency in production today—and why we are planning to change it.

The Current Reality: Shared Mutable State

If you look at the core of our workflow engine, you won't find an Actor Model (yet). You'll find standard Rust concurrency primitives.

The entire execution state is wrapped in an Arc<RwLock<WorkflowExecution>>.

// The execution state is shared across parallel branches
let execution = Arc::new(RwLock::new(initial_state));

When a workflow hits a ParallelNode, the engine fans out. We use tokio::spawn to kick off concurrent branches.

let mut handles = vec![];

for branch in parallel_branches {
    let state_clone = Arc::clone(&execution);

    handles.push(tokio::spawn(async move {
        // Execute branch logic
        let result = execute_branch(branch).await;

        // Acquire write lock to update the shared execution state
        let mut state = state_clone.write().await;
        state.apply_result(result)?;

        Ok::<_, anyhow::Error>(())
    }));
}

// Collect results from all branches
for handle in handles {
    match handle.await {
        Ok(Ok(_)) => { /* branch succeeded */ }
        Ok(Err(e)) => { /* branch returned an error */ }
        Err(e) => { /* task panicked */ }
    }
}

Why This Works (For Now)

  1. Simplicity: The mental model is straightforward. Fan out, do work, acquire lock, write result, fan in.
  2. Speed: tokio::spawn is incredibly lightweight. We can spawn parallel branches with minimal overhead.
  3. Rust's Type System Guarantees: Rust's ownership model and Send/Sync traits force us to use a synchronization primitive like RwLock to share mutable state across tasks—the code simply won't compile without it. The lock itself then enforces exclusive access at runtime.

The Limitations We Hit

This shared-state model got us to production, but it has cracks showing under the weight of AI workloads.

1. Lock Contention

As workflows get more complex, branches spend more time fighting for the RwLock. If one branch is trying to read the state to format a prompt, and another branch is writing a massive tool result, the reader must yield and wait for the lock to be released.

2. No Nested Agent Invocations

Currently, we explicitly stub out nested agent invocations inside parallel branches—they are simply not supported. Why? Because managing the lifecycle, memory isolation, and state mutation of an entirely separate agent within a shared-state parallel branch is a recipe for deadlock.

3. No Built-In Supervision

Tokio isolates panics per task, so one branch crashing won't bring down others. But without a dedicated supervisor, recovery is manual. If a branch panics or hangs indefinitely (a common issue when dealing with third-party LLM APIs), we don't have a clean way to automatically restart just that branch. The workflow engine waits, and eventually, the whole execution times out.

The Road Ahead: The Actor Model

We've reached the limits of Arc<RwLock<T>> for AI orchestration. Our next major architectural shift is moving to the Actor Model.

Instead of sharing state, each agent will become an isolated actor with its own private state and a mailbox (tokio::mpsc channels).

When we implement this:

  • No Shared Mutability: Agents will process state updates sequentially via messages.
  • Supervision: A supervisor will track JoinHandles and automatically restart unresponsive agents with exponential backoff.
  • Backpressure: Bounded channels will replace fire-and-forget spawns, giving us proper backpressure when the system is under heavy load.

For now, the RwLock and tokio::spawn pattern holds the line. It's not perfect, but it's a pragmatic reality of building fast. We didn't over-engineer an actor system before we needed it.

But the time for actors is coming.
