Is perception a direct encounter with the world, or is it always mediated by mental representations? This question has driven the philosophy of perception for over two millennia, dividing theorists into camps that often talk past each other. The history of the subfield is not a linear march toward consensus but a series of live disagreements, with each framework arising from a specific pressure point in its predecessor. By tracing these debates—from ancient claims of direct realism to contemporary predictive processing models—we can see why the tension between directness and mediation remains the field's central organizing problem.
Direct Realism is the intuitive starting point: when you see a tree, you perceive the tree itself, not an image or representation. This view was dominant in ancient philosophy, particularly in Aristotle's account of perception as the reception of forms without matter, on which the tree's form is directly present to the mind. But direct realism faces a serious challenge: if perception is direct, how do we explain illusions, hallucinations, and the fact that the same object can appear differently under different conditions?
Representationalism emerged in the 17th century as a systematic answer to these problems. Theorists like Descartes and Locke argued that we perceive not external objects directly, but 'ideas' or representations that are caused by objects. Perception becomes a two-step process: the object causes a mental representation, and we perceive the representation. This solved the puzzle of illusion (a straight stick half-submerged in water produces a bent representation, so the error lies in the representation rather than in the stick), but it introduced a new problem: a 'veil of perception' that seems to separate us from the world.
Idealism took representationalism to its radical conclusion: if all we ever perceive are our own ideas, then we cannot know that an external world exists. Berkeley argued that esse est percipi: to be is to be perceived. Objects are collections of ideas, sustained by God. Idealism thus replaced the external world with a world of perceptions, collapsing the distinction between appearance and reality. It flourished in the 18th and 19th centuries but receded as a live option in the 20th, though it left a lasting mark on debates about the nature of perception.
Transcendental Idealism, Kant's response, tried to preserve both directness and mediation. Kant agreed with representationalists that we do not perceive things-in-themselves (noumena), but he argued that the objects of perception (phenomena) are not mere subjective ideas: they are structured by our own cognitive faculties (space, time, categories). Phenomena are empirical realities, not private mental images. This framework absorbed the representationalist insight that experience is shaped by the mind while rejecting the idealist claim that there is no world at all outside perception. Transcendental idealism remains active today as a resource for those who want to avoid both naïve realism and full-blown idealism.
Phenomenology, launched by Husserl around 1900, redirected attention from the epistemology of perception to the structure of perceptual experience. Rather than asking whether we perceive objects directly or indirectly, phenomenologists describe the intentional character of perception: the way consciousness is always directed at something. For Husserl, perception involves a horizon of anticipations; for Merleau-Ponty, the body is the perceiving subject. Phenomenology coexists with the earlier frameworks by bracketing the existence of the external world and focusing on lived experience. It offers a descriptive richness that analytic theories often lack, and it continues to inform contemporary debates about embodiment.
In the analytic tradition, Sense-Data Theory (roughly 1900–1970) gave a precise formulation of indirect realism. Sense-data are private, mind-dependent entities that we directly perceive; material objects are inferred or constructed from them. This view, championed by Russell, Moore, and Price, provided a clear account of illusion and hallucination (the sense-data are real even when the object is not), but it faced severe criticism: if sense-data are private, how can we compare them across perceivers? And if we only perceive sense-data, can we even know that material objects exist?
The Causal Theory of Perception, articulated by Grice and others in the 1960s, offered a different way to handle these problems. Instead of positing intermediaries, the causal theory argued that perception is distinguished from hallucination by its causal origin: an experience counts as a perception of an object only if it is caused by that object in the right way. This preserved a form of direct realism (we perceive objects, not representations) while acknowledging that the causal chain can go wrong. The causal theory coexists with representationalism; it does not require mental representations but does not exclude them either.
Disjunctivism, emerging in the late 20th century, is a direct challenge to the representationalist assumption that veridical perception and hallucination share a 'common factor' (e.g., the same type of mental state). Disjunctivists like Hinton and McDowell argue that the nature of a perceptual experience is partly determined by its object: a case of really seeing a tree is fundamentally different from an indistinguishable hallucination. There is no single mental state common to both. This view directly attacks the representationalist idea that the mind 'adds' a representation to the causal process. Disjunctivism remains a live, minority position, celebrated for its fidelity to the phenomenology of direct realism but criticized for making it hard to explain what hallucination and perception have in common at the neural level.
A major shift came with the Ecological Approach to Perception, pioneered by J. J. Gibson from the 1950s. Gibson rejected the entire representationalist framework. Perception is not a process of constructing internal models; it is the direct pickup of information available in the environment. 'Affordances' are properties of the environment that relate to an animal's possibilities for action—a chair affords sitting. Perception is not a mental representation but an active exploration. Gibson's approach replaced the representationalist idea of internal processing with a focus on the organism–environment coupling.
Enactivism, emerging in the 1990s, extended Gibson's insights while adding a phenomenological dimension. Introduced by Varela, Thompson, and Rosch, enactivism argues that perception is not a passive reception of information but a form of action: we perceive by interacting with the world, and the world we perceive is shaped by our sensorimotor skills. Enactivism absorbs the ecological emphasis on action but also emphasizes the role of the body and the lived experience of perception. It preserves the directness claim of the ecological approach while rejecting the need for any internal representation. Enactivism remains a major force in contemporary philosophy of mind, especially in debates about consciousness.
Predictive Processing, which rose to prominence in the 2000s and 2010s, offers a powerful new synthesis. The brain is constantly generating predictions about sensory input; perception occurs as these predictions are matched against actual sensory data, with prediction errors driving updates to the brain's internal model. This framework revives representationalism in a probabilistic, dynamic form: the brain does not passively register static images but actively infers the causes of its sensations. Predictive processing explains a wide range of phenomena, from perceptual illusions to the role of attention, and has been embraced in cognitive science and philosophy. It coexists with enactivism in a productive tension: both agree that perception is active and anticipatory, but they disagree about whether it requires internal representations. Standard formulations of predictive processing commit to hierarchical representations, while enactivists maintain that representation can be avoided. This is one of the most vibrant debates in the field today.
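The loop of predicting, comparing, and updating described above can be made concrete with a toy sketch. The Python below is a deliberately minimal, single-level illustration, not a model taken from the predictive processing literature: the function names `update_estimate` and `perceive`, the `learning_rate` parameter, and the noiseless scalar signal are all assumptions introduced here for illustration. Actual proposals involve hierarchical, precision-weighted probabilistic inference, which this sketch leaves out.

```python
def update_estimate(estimate, observation, learning_rate=0.1):
    """Nudge the current estimate toward the observation (hypothetical helper)."""
    # Prediction error: the mismatch between sensory input and the prediction.
    prediction_error = observation - estimate
    # The error drives the update; learning_rate sets how strongly it counts.
    return estimate + learning_rate * prediction_error


def perceive(observations, initial_estimate=0.0, learning_rate=0.1):
    """Run the predict-compare-update loop over a sequence of observations.

    Returns the final estimate and the absolute prediction error recorded
    before each update.
    """
    estimate = initial_estimate
    errors = []
    for observation in observations:
        errors.append(abs(observation - estimate))
        estimate = update_estimate(estimate, observation, learning_rate)
    return estimate, errors


if __name__ == "__main__":
    # Assumed toy input: a constant signal of 10.0 presented 50 times.
    final_estimate, errors = perceive([10.0] * 50, learning_rate=0.2)
    print(round(final_estimate, 2))  # prints 10.0
```

With a constant input, the estimate converges toward the signal and the recorded error shrinks at every step, which is one simple sense in which errors can be said to drive updates to the model. Whether the running estimate deserves to be called a "representation" is the very point on which predictive processing theorists and enactivists disagree, and the sketch does not settle it.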
Today, the leading frameworks are Predictive Processing, Enactivism, and Disjunctivism, each occupying a different position on the directness–mediation spectrum. They agree that perception is an active, constructive process rather than passive registration. But they disagree on what that construction involves: representations (Predictive Processing), sensorimotor engagement (Enactivism), or object-involving states (Disjunctivism). Direct Realism remains the default philosophical intuition, but most theorists acknowledge that it must accommodate illusions and hallucinations. Representationalism, in its new predictive form, is arguably the most empirically fruitful framework. Enactivism offers a compelling alternative that resonates with embodied cognition. Disjunctivism provides a sharp challenge to any view that posits a common factor. The field is thus in a state of pluralism: no single framework has won, and the core tension between directness and mediation continues to drive inquiry.