The history of virtual reality in computer graphics is driven by a persistent tension: should mediated experience transport a user away from the physical world, or should it enhance the world already around them? Five major frameworks have emerged as competing answers to this question, each defining a distinct relationship between the user, the computer, and reality. Their story is not a simple succession of technologies but a branching network of commitments, disagreements, and partial convergences that continues to shape the field today.
The earliest framework, Immersive Simulation (1962–1990), took the most radical position: replace the user's sensory experience of the real world entirely with a synthetic one. Its defining ambition was to create a self-contained illusion—a multisensory environment delivered through head-mounted displays, motion platforms, and spatial audio. The Sensorama Simulator, patented in 1962, embodied this vision by combining stereoscopic 3D film, stereo sound, wind, and even smells into a cabinet-sized booth. The goal was total immersion, achieved by blocking out the physical environment and substituting a pre-recorded or computer-generated alternative.
Immersive Simulation was a hardware-centric program. Its practitioners focused on engineering the display and tracking systems necessary to sustain the illusion: wide-field optics, low-latency head tracking, and tactile feedback devices. By the late 1980s, however, the framework's limitations became clear. The hardware required to produce convincing sensory replacement was expensive, bulky, and prone to causing motion sickness. More fundamentally, the framework offered no systematic way to model the content of the simulated world—it treated the synthetic environment as a fixed scene rather than an interactive, software-defined space. As a dominant research program, Immersive Simulation declined, but it left behind crucial infrastructure: the head-mounted display, the motion tracker, and the very concept of immersion as a design goal.
Telepresence (1980–Present) emerged from a different pressure: the desire to extend human manipulation and perception to remote physical locations. Marvin Minsky, who coined the term in 1980, envisioned a system in which a human operator controls a robot that sees, hears, and touches a distant environment, effectively projecting the operator's presence across space. Unlike Immersive Simulation, Telepresence does not aim to replace reality with a synthetic one. Instead, it seeks to make the real world accessible at a distance, preserving the full richness of physical interaction—including haptic feedback, dexterous manipulation, and natural visual cues.
This framework's distinctive commitment is to agency rather than immersion. The measure of success is not how convincingly the system blocks out the user's immediate surroundings, but how effectively it enables the user to perform tasks in a remote environment. Telepresence therefore prioritizes low-latency video, force-feedback manipulators, and stereo vision over the wide-field displays and pre-rendered scenes favored by Immersive Simulation. It coexists with the earlier framework by inheriting its display hardware while rejecting its goal of synthetic world-building. Telepresence remains active today in domains such as telesurgery, deep-sea exploration, and space robotics, where the real world is too dangerous or distant for direct human presence.
Virtual Environments (1985–Present) shifted the emphasis from hardware to software. Emerging from the VIEW (Virtual Interface Environment Workstation) project at NASA's Ames Research Center, this framework defined the virtual world as a computer-generated, interactive space that could be programmed and reconfigured. Stephen Ellis's landmark 1991 bibliographical essay "Nature and Origins of Virtual Environments" codified this view, describing the virtual environment as a new medium for communication, one in which the user is surrounded by a dynamic, responsive digital space.
Virtual Environments differs from Immersive Simulation in a crucial way: it treats the synthetic world as a programmable artifact, not a fixed sensory illusion. The framework's core research questions concern how to model, render, and interact with 3D digital spaces in real time. This includes techniques for collision detection, physics simulation, and gesture-based input—problems that Immersive Simulation had largely ignored. Compared to Telepresence, Virtual Environments is synthetic rather than real-world: it builds worlds from scratch rather than transmitting images of existing places. The two frameworks share an interest in head-mounted displays and hand tracking, but they diverge on whether the user's experience should be anchored to physical reality or to an invented one. Virtual Environments remains a living tradition, now deeply integrated with game engines, social VR platforms, and scientific visualization tools.
Augmented Reality (1992–Present) broke decisively with the immersive tradition. Instead of replacing the user's view of the physical world, it overlays computer-generated information onto that view. The first systematic demonstration came from Boeing researchers in 1992, who used a see-through head-mounted display to project wireframe diagrams onto physical aircraft assemblies, helping workers locate wiring harnesses. The core principle is registration: digital content must be aligned with real-world objects in three dimensions. To maintain the illusion of co-location, the system has to track the position and orientation of the user's head continuously and redraw the overlay so that it stays locked to its physical target as the viewpoint moves.
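The geometry behind registration can be illustrated with a minimal sketch. It assumes a simple pinhole display model and a head that only turns about the vertical axis. The function names, the focal length, and the 1280×720 image centre are illustrative assumptions, not details of the Boeing system or of any particular headset.

```python
import math

def yaw_rotation(yaw_rad):
    """3x3 rotation for a head turn about the vertical (y) axis.

    The matrix maps head-frame vectors to world-frame vectors.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def world_to_head(point, head_pos, head_rot):
    """Express a world-space point in the head frame.

    Subtract the head position, then apply the transpose of head_rot
    (the inverse of a rotation) to go from world axes to head axes.
    """
    d = [point[i] - head_pos[i] for i in range(3)]
    return [sum(head_rot[i][j] * d[i] for i in range(3)) for j in range(3)]

def project(point_head, focal_px, cx, cy):
    """Pinhole projection; the viewer looks down +z. None if behind the viewer."""
    x, y, z = point_head
    if z <= 0:
        return None
    # Image y grows downward, so the vertical term is subtracted.
    return (cx + focal_px * x / z, cy - focal_px * y / z)

def register(point_world, head_pos, yaw_rad,
             focal_px=800.0, cx=640.0, cy=360.0):
    """Pixel at which an annotation must be drawn to sit on point_world."""
    head_rot = yaw_rotation(yaw_rad)
    return project(world_to_head(point_world, head_pos, head_rot),
                   focal_px, cx, cy)

# A physical target 2 m ahead and 0.5 m to the right of the user.
target = (0.5, 0.0, 2.0)
looking_straight = register(target, (0.0, 0.0, 0.0), 0.0)
looking_at_target = register(target, (0.0, 0.0, 0.0), math.atan2(0.5, 2.0))
```

With the head facing straight ahead, the annotation is drawn to the right of the image centre. Once the head turns toward the target, the same world point lands at the centre. Redrawing the overlay for every new head pose is what keeps it attached to the physical object.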
Augmented Reality rejects the full-world replacement of both Immersive Simulation and Virtual Environments. Its goal is not to transport the user elsewhere but to augment the here and now. This commitment brings a distinct set of technical challenges: accurate 6-degree-of-freedom tracking, real-time scene understanding, and low-latency rendering that does not break the user's connection to the physical environment. The framework also introduces a new interaction model: instead of manipulating a purely digital world, the user interacts with a hybrid space where physical objects and digital annotations coexist. Augmented Reality has proven especially valuable in industrial maintenance, medical visualization, and navigation, where context-sensitive information can be delivered without pulling the user's attention away from the task at hand.
Mixed Reality (1994–Present) emerged as a conceptual framework that subsumes both Augmented Reality and Virtual Environments under a single organizing idea: the reality-virtuality continuum. Introduced by Paul Milgram and colleagues in 1994, this continuum places the real environment at one end and a fully virtual environment at the other, with Augmented Reality and Augmented Virtuality occupying intermediate positions. The key insight is that there is no sharp boundary between augmenting reality and immersing in a virtual world—rather, there is a continuous spectrum of possible blends.
Mixed Reality differs from Augmented Reality by treating the blending of real and virtual as a design space rather than a fixed overlay. Where Augmented Reality typically assumes that the real world remains primary and digital content is added on top, Mixed Reality allows for more fluid relationships: virtual objects can be occluded by real ones, cast shadows on physical surfaces, or respond to real-world physics. This framework has driven the development of spatial computing—the ability to map and understand the physical environment in real time, enabling digital content to behave as if it truly inhabits the user's space. Mixed Reality does not replace Augmented Reality or Virtual Environments; it provides a theoretical language for describing their relationship and a research agenda for building systems that move along the continuum dynamically.
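Occlusion of virtual objects by real ones is often explained as a per-pixel depth comparison. The sketch below assumes the system already has a depth map of the real scene, for example from a depth sensor, and a depth value for each rendered virtual pixel. The function name and the tiny one-row "image" are hypothetical and serve only as illustration.

```python
def composite_with_occlusion(real_rgb, real_depth, virtual_rgb, virtual_depth):
    """Blend a virtual layer into a real image using a per-pixel depth test.

    All arguments are equally sized 2D lists. virtual_depth holds None
    where no virtual content was rendered. A virtual pixel is shown only
    when it is closer to the viewer than the real surface behind it.
    """
    out = []
    for real_row, rdepth_row, virt_row, vdepth_row in zip(
            real_rgb, real_depth, virtual_rgb, virtual_depth):
        out_row = []
        for real_px, rdepth, virt_px, vdepth in zip(
                real_row, rdepth_row, virt_row, vdepth_row):
            virtual_visible = vdepth is not None and vdepth < rdepth
            out_row.append(virt_px if virtual_visible else real_px)
        out.append(out_row)
    return out

# One row of three pixels: a wall at 3 m with a desk at 1 m in the middle,
# and a virtual cube at 2 m covering the middle and right pixels.
real_rgb = [["wall", "desk", "wall"]]
real_depth = [[3.0, 1.0, 3.0]]
virtual_rgb = [[None, "cube", "cube"]]
virtual_depth = [[None, 2.0, 2.0]]

frame = composite_with_occlusion(real_rgb, real_depth, virtual_rgb, virtual_depth)
```

In this example the desk hides the cube in the middle pixel because it is nearer to the viewer, while the cube covers the more distant wall on the right. A plain overlay with no depth test would draw the cube over both.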
Today, Augmented Reality and Mixed Reality are the most active frameworks, with Virtual Environments and Telepresence continuing as specialized traditions. The leading frameworks agree on several technical priorities: the need for lightweight, high-resolution see-through displays; robust inside-out tracking; and real-time scene understanding through computer vision. They also share a growing emphasis on social interaction—enabling multiple users to share a mediated space, whether that space is fully virtual, augmented, or somewhere in between.
Where they disagree is on the primacy of the real world. Augmented Reality insists that the physical environment should remain the user's primary reference frame, with digital content serving as a supplementary layer. Mixed Reality, by contrast, treats the real and virtual as co-equal partners, allowing the system to decide how much of each to present based on the task and context. A deeper debate concerns embodiment: Telepresence argues that mediated experience must include haptic feedback and physical manipulation to feel authentic, while Virtual Environments and Mixed Reality have increasingly explored purely visual and auditory presence as sufficient for many applications. These disagreements are not signs of fragmentation but of a healthy field exploring the full range of possibilities opened by the five frameworks. The central tension between escaping reality and enhancing it remains unresolved—and that is precisely what keeps the field moving forward.