Every concurrent program must answer a fundamental question: how should simultaneous computations coordinate access to shared resources and manage their lifetimes? Over six decades, programming language designers have proposed radically different answers, each embodied in a distinct concurrency model. These models—shared-memory concurrency with locks and monitors, message-passing concurrency (CSP and the Actor Model), data parallelism, transactional memory, the fork-join model, and structured concurrency—represent competing technical agendas that have shaped how programmers think about parallelism. The history of these models is not a smooth progression but a series of debates about whether shared state can be tamed, whether communication should be synchronous or asynchronous, and how to structure task lifetimes.
The earliest systematic framework for concurrency in programming languages was shared-memory concurrency, built on locks and monitors. Emerging in the 1960s with systems like the Burroughs B5000 and later formalized in languages such as Concurrent Pascal and Java, this model treats threads as lightweight processes that share an address space. Synchronization is achieved through mutual exclusion: locks guard critical sections, and monitors (a higher-level abstraction) bundle shared data with the procedures that access it. The core commitment of this framework is that shared mutable state is the natural mapping to hardware, and that programmers can manage it with explicit locking. However, this commitment comes with severe trade-offs. Locks do not compose: combining two lock-protected operations often introduces deadlock or requires exposing internal locks. Reasoning about interleavings becomes exponentially harder as the number of threads grows. Despite these problems, shared-memory concurrency remains dominant in systems programming because it offers fine-grained control and maps directly to the hardware's cache-coherent memory model.
In direct contrast to shared-memory concurrency, message-passing models eliminate shared mutable state entirely. Two influential frameworks emerged in the 1970s: Communicating Sequential Processes (CSP) and the Actor Model. CSP, introduced by Tony Hoare, treats concurrent processes as independent entities that communicate through synchronous channels—a process sending a message blocks until the receiver is ready. The Actor Model, pioneered by Carl Hewitt, replaces channels with asynchronous message passing: each actor has a mailbox and processes messages one at a time, creating new actors and sending messages without blocking. Both frameworks reject the idea that threads should share memory, instead isolating state within processes or actors. This isolation makes reasoning about concurrency easier: there are no race conditions because there is no shared state. The two frameworks later diverged in influence: CSP inspired languages like Go (with goroutines and channels), while the Actor Model shaped Erlang and Akka. Message-passing remains the dominant model for distributed systems, where shared memory is physically impossible.
While shared-memory and message-passing models focus on coordinating threads, data parallelism takes a different approach: it asks how to apply the same operation to many data elements simultaneously. Originating in the 1970s with SIMD machines and array languages like APL, data parallelism treats collections as first-class values and applies operations in bulk. The programmer specifies what to compute, not how to parallelize it. This model differs from both shared-memory and message-passing by abstracting away threads entirely; the runtime or hardware decides how to distribute work across processing units. Data parallelism excels for uniform operations on large arrays (e.g., image processing, matrix multiplication) but struggles with irregular or task-dependent parallelism. Its influence persists in modern frameworks like MapReduce, CUDA, and the parallel streams in Java and .NET.
The fork-join model, formalized in the 1990s with languages like Cilk and later adopted in Java's ForkJoinPool, structures parallelism as recursive divide-and-conquer. A task forks subtasks that run in parallel, then joins them to combine results. This model refines the flat thread pool by introducing a hierarchical task tree and work-stealing: idle threads steal tasks from busy threads to balance load. Fork-join differs from data parallelism in that it handles irregular, recursive workloads (e.g., quicksort, merge sort) rather than uniform data operations. It also contrasts with shared-memory concurrency by providing a structured pattern for parallelism, reducing the need for ad-hoc synchronization. The fork-join model remains a cornerstone of parallel algorithm design in languages like Java, C++, and Rust.
Transactional memory, proposed in 1993 by Maurice Herlihy and J. Eliot B. Moss, directly addresses the composability failures of locks. Borrowing from database transactions, it allows programmers to mark blocks of code as atomic: the runtime optimistically executes the block, rolling back if a conflict occurs. This model promises to eliminate deadlock and make concurrent code composable—two transactions can be combined without exposing internal locks. However, transactional memory has seen limited adoption in mainstream languages. Hardware support (e.g., Intel TSX) proved fragile, and software implementations incurred high overhead. The framework remains an active research area, with some influence on languages like Clojure (software transactional memory) and Haskell (STM). It coexists with locks rather than replacing them, offering an alternative for specific use cases where composability is critical.
The most recent framework, structured concurrency (2015–present), synthesizes lessons from earlier models. Proposed by Martin Sústrik and later popularized in languages like Kotlin, Swift, and Java (Project Loom), it enforces that every task has a well-defined scope: a parent task cannot complete until all its child tasks finish, and cancellation propagates automatically through the hierarchy. This solves the resource leak and cancellation problems that plague fork-join and message-passing models, where tasks can outlive their parent or be left orphaned. Structured concurrency extends fork-join's hierarchy by adding lifetime guarantees, and it narrows the flat task model of earlier frameworks by requiring structured scopes. It does not reject shared state or message-passing but provides a disciplined framework for managing task lifetimes, making concurrent code easier to reason about and debug.
Today, no single concurrency model dominates. Shared-memory concurrency remains essential for low-level systems programming, where performance and hardware control are paramount. Message-passing is the default for distributed systems and microservices, with CSP and Actor Model implementations thriving in Go, Erlang, and Elixir. Data parallelism powers machine learning frameworks and GPU computing. The fork-join model is the standard for parallel algorithms in mainstream languages. Structured concurrency is gaining traction in newer languages and runtimes, promising safer and more composable task management. The leading frameworks agree on several principles: concurrency should be composable, resource lifetimes should be managed automatically where possible, and shared mutable state should be minimized or isolated. They disagree on the fundamental unit of abstraction—threads, tasks, processes, or actors—and on whether shared state can ever be safe enough for general use. This pluralism reflects the diversity of hardware and application domains, and the field continues to evolve as new challenges emerge.
The trajectory of concurrency models reveals a steady shift from manual, error-prone synchronization toward automatic, composable, and scoped management. Early models like locks and monitors embraced shared state but paid a heavy price in complexity. Message-passing offered an escape by eliminating sharing, while data parallelism and fork-join provided structured patterns for specific kinds of parallelism. Transactional memory attempted to fix locks without abandoning shared state, and structured concurrency now provides a disciplined framework for task lifetimes. Each model accepted trade-offs, and the field's vitality comes from the ongoing debate about which trade-offs are worth making. Understanding this history equips programmers to choose the right model for their problem and to anticipate the next wave of innovation.