Classical game theory, built on assumptions of perfect rationality and pure self-interest, offered elegant predictions for strategic interaction. Yet laboratory experiments, most prominently the ultimatum game studies that began in the 1980s, showed human behavior diverging from those predictions. Participants in ultimatum games rejected unfair offers even when rejection cost them money. Prisoner's Dilemma players cooperated in one-shot encounters where defection was the dominant strategy. These anomalies were not random noise; they formed systematic patterns that demanded new theoretical frameworks. Behavioral game theory emerged to build models that explain how real people actually play games, preserving the strategic logic of game theory while replacing its unrealistic psychological assumptions.
The first wave of behavioral game theory focused on what people want. Classical theory assumed players cared only about their own material payoffs. Social Preference Models replaced that assumption with richer motivational structures. The most influential early model, inequity aversion (Fehr and Schmidt, 1999), proposed that players dislike unequal outcomes: they will sacrifice personal gain to reduce disparities, whether the disparity favors them or not, although being behind is assumed to hurt at least as much as being ahead. A complementary framework, reciprocity and intention-based fairness (Rabin, 1993; Dufwenberg and Kirchsteiger, 2004), argued that players care not just about final distributions but about the intentions behind others' actions: kindness is rewarded and unkindness punished, even at a cost.
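The inequity-aversion idea can be made concrete with a minimal sketch of the two-player Fehr-Schmidt utility applied to an ultimatum-game responder. The function names, the pie size, and the parameter values (`alpha` for disadvantageous and `beta` for advantageous inequity) are illustrative assumptions, not estimates from any study.

```python
def fehr_schmidt_utility(own, other, alpha, beta):
    """Two-player Fehr-Schmidt utility.

    alpha weights disadvantageous inequity (the other player has more);
    beta weights advantageous inequity (this player has more).
    """
    envy = max(other - own, 0.0)
    guilt = max(own - other, 0.0)
    return own - alpha * envy - beta * guilt


def responder_accepts(offer, pie, alpha, beta):
    """Ultimatum-game responder: accept if utility is at least that of rejecting.

    Rejection gives both players zero, so its utility is zero.
    """
    u_accept = fehr_schmidt_utility(offer, pie - offer, alpha, beta)
    u_reject = 0.0
    return u_accept >= u_reject


# Hypothetical parameters chosen only for illustration.
for offer in (1, 2, 3, 4, 5):
    print(offer, responder_accepts(offer, pie=10, alpha=1.0, beta=0.25))
```

Under these assumed parameters the responder turns down low offers that a purely self-interested player (`alpha = 0`, `beta = 0`) would accept, which is the rejection pattern the model was built to capture.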
These models coexisted in a productive tension. Inequity aversion was mathematically simpler and fit a wide range of experimental data, from ultimatum games to public goods games. Reciprocity models captured richer psychological nuance but were harder to apply. Both frameworks transformed the field by showing that departures from self-interest were not irrational mistakes but reflections of different preferences. They did not, however, address whether players could actually compute the optimal strategies in complex games, or how they learned from experience. Those questions opened the door to the next frameworks.
Even when players' preferences are fully specified, they may lack the cognitive capacity to reason through multi-step strategic interactions. Bounded Rationality in Games addressed this limitation by modeling how players with limited foresight actually think. The most widely used approach, level-k reasoning (Nagel, 1995; Stahl and Wilson, 1995), assumes a hierarchy of strategic depth. A level-0 player chooses randomly or by a simple rule. A level-1 player best-responds to level-0. A level-2 player best-responds to level-1, and so on. Cognitive hierarchy models (Camerer, Ho, and Chong, 2004) extended this idea by letting each player best-respond to a mixture of all lower levels, rather than only the level immediately below, with level frequencies typically modeled as Poisson-distributed.
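The level hierarchy can be illustrated in a p-beauty contest, where each player tries to guess p times the average guess. The sketch below assumes p = 2/3, a level-0 guess of 50 (the midpoint of a 0 to 100 range), a Poisson parameter `tau = 1.5`, and players who ignore their own influence on the average. All of these are illustrative choices, and the function names are hypothetical.

```python
import math


def level_k_guess(k, p=2/3, level0_guess=50.0):
    """Guess of a level-k player who best-responds to level-(k-1).

    Assumption: a player ignores their own effect on the average, so the
    best response to a population guessing g is simply p * g.
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess
    return guess


def cognitive_hierarchy_guesses(max_level, tau=1.5, p=2/3, level0_guess=50.0):
    """Guesses for levels 0..max_level in a Poisson cognitive hierarchy.

    A level-k player believes opponents are spread over levels 0..k-1 with
    relative frequencies given by a Poisson(tau) distribution truncated at k-1.
    """
    guesses = [level0_guess]
    for k in range(1, max_level + 1):
        # Poisson weights over lower levels; exp(-tau) cancels on normalizing.
        weights = [tau ** h / math.factorial(h) for h in range(k)]
        expected = sum(w * g for w, g in zip(weights, guesses)) / sum(weights)
        guesses.append(p * expected)
    return guesses


print([round(level_k_guess(k), 2) for k in range(4)])
print([round(g, 2) for g in cognitive_hierarchy_guesses(3)])
```

With these defaults, a level-2 player in the cognitive hierarchy guesses higher than a pure level-2 player, because the hierarchy version still places some weight on level-0 opponents.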
This framework narrowed the scope of explanation compared with Social Preference Models: it did not try to explain what players wanted, only how they thought. But it absorbed a key insight from the earlier work, namely that social motives and limited reasoning could both matter in the same game. A player might be inequity-averse yet only capable of level-1 reasoning. The two frameworks thus complemented each other, with Bounded Rationality in Games providing the cognitive infrastructure that Social Preference Models had left unspecified.
Social Preference Models and Bounded Rationality in Games both treated behavior as static: players had fixed preferences and fixed reasoning depths. But many strategic interactions are repeated, and players adapt. Learning Models in Games emerged to explain how behavior changes over time. Reinforcement learning (Erev and Roth, 1998) assumed that players repeat actions that yielded high payoffs in the past, without forming explicit beliefs about others. Belief-based learning (Fudenberg and Levine, 1998) assumed that players update their predictions about others' strategies and then choose best responses. Experience-weighted attraction (EWA; Camerer and Ho, 1999) unified both approaches into a single framework that could flexibly capture different mixtures of reinforcement and belief-based updating.
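How EWA nests the two learning rules can be sketched with a single update step. The code below follows the usual presentation of the EWA attraction update with decay `phi`, foregone-payoff weight `delta`, and experience decay `rho`, paired with a logit choice rule of sensitivity `lam`. The function names, argument layout, and example payoffs are hypothetical and meant only as an illustration.

```python
import math


def ewa_update(attractions, experience, chosen, payoffs, phi, delta, rho):
    """One experience-weighted attraction update for a single player.

    payoffs[j] is what action j earned, or would have earned, against the
    opponent's realized action this round. The chosen action is reinforced
    with weight 1; unchosen actions with weight delta.
    """
    new_experience = rho * experience + 1.0
    new_attractions = []
    for j, (attraction, payoff) in enumerate(zip(attractions, payoffs)):
        weight = 1.0 if j == chosen else delta
        updated = (phi * experience * attraction + weight * payoff) / new_experience
        new_attractions.append(updated)
    return new_attractions, new_experience


def logit_choice(attractions, lam):
    """Map attractions to choice probabilities with a logit (softmax) rule."""
    top = max(attractions)  # subtract the max for numerical stability
    exps = [math.exp(lam * (a - top)) for a in attractions]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative round: the player chose action 0 and earned 3; action 1 would have earned 5.
reinforcement_like, _ = ewa_update([0.0, 0.0], 1.0, 0, [3.0, 5.0], phi=1.0, delta=0.0, rho=0.0)
belief_like, _ = ewa_update([0.0, 0.0], 1.0, 0, [3.0, 5.0], phi=1.0, delta=1.0, rho=1.0)
print(reinforcement_like, belief_like, logit_choice(belief_like, lam=1.0))
```

Setting `delta = 0` makes only realized payoffs count, as in reinforcement learning, while `delta = 1` credits foregone payoffs in full, which is what a belief-based learner who evaluates every action against the opponent's observed play would do.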
These models did not replace Social Preference Models or Bounded Rationality in Games; they operated at a different level of analysis. A learning model could be combined with any preference specification or any reasoning rule. The key contribution was to transform behavioral game theory from a static to a dynamic science. Researchers could now ask not just why people play a certain way, but how they get there and whether they converge to equilibrium over time.
The fourth major framework, Quantal Response Equilibrium (QRE; McKelvey and Palfrey, 1995), addressed a different kind of limitation: decision noise. Even when players have clear preferences and full reasoning ability, their choices are not perfectly deterministic. QRE replaced the classical assumption of exact best response with a probabilistic choice rule: actions with higher expected payoffs are chosen more often, but not always. The equilibrium is a fixed point where each player's noisy best responses are consistent with the distribution of others' noisy choices.
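The fixed-point idea can be sketched for the common logit form of QRE in a 2x2 game. The code below uses damped fixed-point iteration, a simple numerical choice made here for illustration that is not guaranteed to converge for every game or every precision parameter `lam`. The payoff matrices describe a hypothetical asymmetric matching-pennies game, and all function names are assumptions.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_responses(A, B, p, q, lam):
    """Logit responses in a 2x2 game.

    A[i][j] and B[i][j] are the row and column payoffs when row plays i and
    column plays j. p and q are the probabilities that row and column play
    action 0. Returns the logit response probabilities for action 0.
    """
    row_diff = (A[0][0] - A[1][0]) * q + (A[0][1] - A[1][1]) * (1 - q)
    col_diff = (B[0][0] - B[0][1]) * p + (B[1][0] - B[1][1]) * (1 - p)
    return logistic(lam * row_diff), logistic(lam * col_diff)


def logit_qre(A, B, lam, damping=0.5, tol=1e-10, max_iter=10000):
    """Search for a logit QRE by damped iteration; returns (p, q, converged)."""
    p, q = 0.5, 0.5
    for _ in range(max_iter):
        target_p, target_q = logit_responses(A, B, p, q, lam)
        if max(abs(target_p - p), abs(target_q - q)) < tol:
            return p, q, True
        p = (1 - damping) * p + damping * target_p
        q = (1 - damping) * q + damping * target_q
    return p, q, False


# Hypothetical game: row wants to match, column wants to mismatch,
# and row earns an unusually high payoff from matching on action 0.
A = [[4, 0], [0, 1]]
B = [[0, 1], [1, 0]]
print(logit_qre(A, B, lam=1.0))
```

In this assumed game the mixed Nash equilibrium has the row player mixing evenly, whereas the logit QRE at `lam = 1` tilts the row player toward the action with the high payoff. At `lam = 0` choices are uniform, and as `lam` grows the responses approach exact best responses.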
QRE coexists with the other frameworks as a general statistical infrastructure. It can be layered on top of Social Preference Models (by adding noise to inequity-averse choices) or Bounded Rationality in Games (by allowing level-k players to tremble). Its relationship to Learning Models is more complex: QRE is a static equilibrium concept, while learning models describe dynamics. But QRE often predicts the same long-run outcomes as learning models with noise, creating a productive tension between equilibrium and process explanations.
Today, behavioral game theory treats these four frameworks as complementary tools rather than competing paradigms. Social Preference Models remain the standard way to explain other-regarding behavior in one-shot games. Bounded Rationality in Games is the default framework for games requiring multi-step reasoning, such as beauty contest games or auctions. Learning Models are essential for repeated interactions, from bargaining to market experiments. QRE provides a statistical backbone for estimating these models from data, helping researchers quantify how much of observed behavior is due to preferences, cognition, learning, or noise.
Yet important disagreements persist. One debate concerns the hierarchy of explanations: should social preferences be treated as fundamental, with bounded rationality and noise as secondary complications, or should cognitive limits be the starting point, with social motives emerging from repeated interaction? Another debate centers on unification: some researchers advocate for a single grand model that integrates all four mechanisms, while others argue that strategic contexts are so diverse that researchers should select the simplest framework that fits the data. A third disagreement concerns the role of equilibrium: QRE assumes that players' noisy choices are mutually consistent, but learning models show that real players may never reach equilibrium, raising questions about whether equilibrium concepts should be the goal of behavioral game theory at all.
Despite these debates, the field has converged on a core insight: strategic behavior is shaped by multiple mechanisms that interact in context-dependent ways. The leading frameworks agree that classical game theory's assumptions of perfect rationality and pure self-interest are inadequate, but they disagree on which psychological mechanisms are most fundamental and how they should be modeled. This pluralism is a sign of maturity, not fragmentation. Each framework has its domain of best application, and the frontier of research lies in understanding how they combine: for example, how social preferences interact with limited reasoning in auction design, or how learning dynamics interact with decision noise in policy evaluation.
Behavioral game theory has thus moved from a critique of classical theory to a constructive research program. Its frameworks are now routinely used to design better markets, predict behavior in strategic environments, and inform policy. The progression from social motives to cognitive limits to dynamic adaptation to stochastic choice reflects not a linear replacement of old ideas by new ones, but a steady accumulation of explanatory resources. The field's current strength lies in its ability to draw on all four frameworks, selecting and combining them as the strategic context demands.