A researcher studying the effect of minimum wage increases on employment across U.S. states over time faces a persistent challenge: each state has unobserved, time-invariant characteristics—political culture, industrial composition, labor market institutions—that influence both the policy and the outcome. Panel data, which track the same units across multiple periods, offer a way to control for such unit-specific heterogeneity. But the choice of how to do so has evolved into a rich methodological toolkit, with each generation of methods addressing limitations of its predecessors while opening new questions.
The earliest systematic approach to panel data in econometrics, the Random Effects and Fixed Effects Approach (1966), established the core insight: if unobserved unit-specific effects are constant over time, they can be either modeled as random draws from a distribution (random effects) or removed by differencing or demeaning (fixed effects). The fixed effects estimator is consistent even when the unit effects are correlated with the regressors, but it discards between-unit variation and can be inefficient. Random effects is more efficient when the effects are uncorrelated with the regressors—an assumption testable by the Hausman test. This framework gave applied researchers a clear rule: use fixed effects to avoid bias from omitted time-invariant confounders, at the cost of losing explanatory power from slowly changing variables. Yet its strict exogeneity assumption—that past, present, and future errors are uncorrelated with all regressors—proved restrictive as empirical work moved toward models with lagged dependent variables and categorical outcomes.
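The contrast between ignoring and removing the unit effects can be illustrated with a small simulation. The sketch below is a minimal example under assumed values (500 units, 5 periods, a true slope of 2, and unit effects built to be correlated with the regressor); none of these numbers come from the literature discussed here, and the helper `slope` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods, beta = 500, 5, 2.0  # illustrative values, not from any study

# Unit effects that are correlated with the regressor: the case in which
# pooled OLS is biased and random effects would be inconsistent.
alpha = rng.normal(size=n_units)
x = alpha[:, None] + rng.normal(size=(n_units, n_periods))
y = beta * x + alpha[:, None] + rng.normal(size=(n_units, n_periods))


def slope(x, y):
    """OLS slope of y on x (with an intercept), pooling all observations."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


# Pooled OLS leaves alpha in the error term, so its correlation with x
# loads onto the slope.
beta_pooled = slope(x, y)

# Within transformation: subtracting each unit's time mean removes alpha.
x_within = x - x.mean(axis=1, keepdims=True)
y_within = y - y.mean(axis=1, keepdims=True)
beta_fe = slope(x_within, y_within)

print(f"pooled OLS: {beta_pooled:.3f}   fixed effects: {beta_fe:.3f}   truth: {beta}")
```

In this design the pooled slope converges to 2.5 rather than 2, because the covariance between the regressor and the unit effect is 1 and the variance of the regressor is 2, while the within estimate should land close to the true value.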
Two related frameworks emerged to handle settings where the linear fixed effects model was inadequate. The Non-Linear Panel Data Framework (1980) addressed the fact that many economic outcomes (employment status, firm entry, union membership) are binary, ordered, or counts. Fitting a nonlinear model (e.g., logit or probit) with unit fixed effects encounters the incidental parameters problem: in short panels, each unit's effect is estimated from only a handful of observations, so its estimation noise does not vanish as the number of units grows, and that noise contaminates the maximum likelihood estimates of the common slope parameters. Solutions included conditional logit (which eliminates the fixed effects by conditioning on sufficient statistics) and correlated random effects (which parameterizes the relationship between effects and regressors). This framework extended the original fixed and random effects approach by relaxing the linearity assumption while keeping its focus on unit-specific heterogeneity; it coexists with the linear framework, each suited to different outcome types.
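For a two-period binary panel, the conditioning idea behind conditional logit takes a particularly simple form: among units whose outcome switches, the probability that the switch is from 0 to 1 depends only on the change in the regressor, and the unit effect drops out. The following is a minimal sketch under assumed simulation settings (20,000 units, a true coefficient of 1, logistic errors); the Newton-Raphson loop is a hand-rolled illustration, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_units, beta = 20000, 1.0  # illustrative values

# Two-period binary panel with unit effects correlated with the regressor.
alpha = rng.normal(size=n_units)
x = alpha[:, None] + rng.normal(size=(n_units, 2))
latent = beta * x + alpha[:, None] + rng.logistic(size=(n_units, 2))
y = (latent > 0).astype(int)

# Once we condition on y_1 + y_2 (the sufficient statistic for alpha),
# only units whose outcome changes carry information about beta.
switchers = y.sum(axis=1) == 1
dx = x[switchers, 1] - x[switchers, 0]
d = y[switchers, 1]  # 1 if the unit moved 0 -> 1, 0 if it moved 1 -> 0

# Conditional likelihood for T = 2: P(d = 1 | switcher) = logistic(beta * dx).
# Newton-Raphson on this scalar, concave log-likelihood.
beta_hat = 0.0
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-beta_hat * dx))
    score = ((d - p) * dx).sum()
    information = (p * (1.0 - p) * dx ** 2).sum()
    beta_hat += score / information

print(f"switchers used: {switchers.sum()} of {n_units}")
print(f"conditional logit estimate: {beta_hat:.3f} (true value {beta})")
```

The price of eliminating the unit effects is visible in the first printed line: units that never switch contribute nothing to the estimate.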
The Dynamic Panel Data Framework (1981) confronted a different limitation: when a lagged dependent variable appears as a regressor, the standard fixed effects estimator is biased in short panels (the Nickell bias). This bias arises because the demeaned lagged dependent variable is correlated with the demeaned error. The solution, proposed in instrumental-variables form by Anderson and Hsiao and developed into a GMM estimator by Arellano and Bond, was to use internal instruments: lagged levels instrument for first differences, or lagged differences instrument for levels. This framework transformed the fixed effects approach by explicitly modeling dynamics and providing consistent estimators when the number of time periods is small. It coexists with the linear fixed effects model: for long panels the bias is negligible, but for typical microeconomic panels with few periods the GMM approach is preferred. The dynamic framework also raised new questions about instrument validity and weak instruments, which later frameworks would address.
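Both the bias and the internal-instrument remedy can be seen in a short simulation. The sketch below assumes an AR(1) panel with a true coefficient of 0.5, six observed periods, and 5,000 units (all illustrative choices). It compares the within estimator with the simplest version of the internal-instrument idea, the Anderson-Hsiao estimator, which uses the level lagged twice as an instrument for the lagged first difference. It is a just-identified stand-in for the fuller GMM estimators, not an implementation of them.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, burn_in, rho = 5000, 6, 50, 0.5  # illustrative values

# Simulate y_it = rho * y_i,t-1 + alpha_i + eps_it, discarding a burn-in.
alpha = rng.normal(size=n_units)
y = np.zeros((n_units, burn_in + n_periods))
for t in range(1, burn_in + n_periods):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_units)
y = y[:, burn_in:]  # keep the last n_periods columns

# Within (fixed effects) estimator: demean y_t and y_{t-1} by unit.
y_lag, y_cur = y[:, :-1], y[:, 1:]
y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
y_cur_dm = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_fe = float((y_lag_dm * y_cur_dm).sum() / (y_lag_dm ** 2).sum())

# Anderson-Hsiao: first-difference to remove alpha, then instrument the
# lagged difference with the level dated t-2, which is uncorrelated with
# the differenced error when eps is serially uncorrelated.
dy = np.diff(y, axis=1)   # dy[:, k] = y[:, k+1] - y[:, k]
dy_cur = dy[:, 1:]        # change from t-1 to t
dy_lag = dy[:, :-1]       # change from t-2 to t-1
z = y[:, :-2]             # level at t-2 (the instrument)
rho_iv = float((z * dy_cur).sum() / (z * dy_lag).sum())

print(f"within estimate: {rho_fe:.3f}   Anderson-Hsiao: {rho_iv:.3f}   truth: {rho}")
```

With so few periods the within estimate should fall well below 0.5, while the instrumented estimate should be roughly centered on it, at the cost of noticeably more sampling noise.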
By the early 2000s, a broad shift in empirical economics toward design-based identification began to reshape panel methods. The Design-Based Panel Data Framework (2000) did not reject the earlier toolkits but repositioned them: rather than assuming that fixed effects alone purge all confounding, it emphasized that causal claims require a credible source of exogenous variation, such as a natural experiment, a policy discontinuity, or a comparison group. Difference-in-differences (DiD), event studies, and synthetic control methods came to dominate the applied landscape. This framework challenged the automatic causal interpretation of a within-group estimate, pointing out that fixed effects only control for time-invariant confounders, not for time-varying shocks that differ across units. The two-way fixed effects estimator was shown to recover a weighted average of many underlying treatment effects, with weights that can be negative when adoption is staggered and effects are heterogeneous, and new estimators (Callaway-Sant'Anna, Sun-Abraham) were developed to handle staggered adoption. The design-based framework effectively subordinated the fixed effects approach for causal questions: fixed effects became a tool within a larger design, not a substitute for it. It coexists with the older frameworks, and many applied studies still use fixed effects as a baseline, but the conversation has shifted from modeling assumptions to the validity of the research design.
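The logic of the simplest design in this family, the two-group, two-period difference-in-differences comparison, can be sketched numerically. The example below assumes a true treatment effect of 1.5, a common time shock of 0.8, and treated units that start from a higher level; parallel trends holds by construction, which is exactly the assumption a real application would have to defend.

```python
import numpy as np

rng = np.random.default_rng(2)
n_units, tau, time_shock = 4000, 1.5, 0.8  # illustrative values

treated = rng.random(n_units) < 0.5
# Treated units differ in level (a time-invariant confounder).
unit_effect = rng.normal(size=n_units) + 2.0 * treated

y_pre = unit_effect + rng.normal(size=n_units)
y_post = unit_effect + time_shock + tau * treated + rng.normal(size=n_units)

# Post-period comparison: contaminated by the level difference between groups.
naive_post = float(y_post[treated].mean() - y_post[~treated].mean())

# Before-after change among the treated: contaminated by the common time shock.
change = y_post - y_pre
before_after = float(change[treated].mean())

# Difference-in-differences: the control group's change removes the time
# shock, and differencing within unit removes the level difference.
did = float(change[treated].mean() - change[~treated].mean())

print(f"naive post comparison: {naive_post:.3f}")
print(f"treated before-after:  {before_after:.3f}")
print(f"difference-in-differences: {did:.3f} (true effect {tau})")
```

Each of the two simpler comparisons absorbs one source of confounding that the design is built to difference away; neither the unit effects nor the time shock needs to be modeled explicitly.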
Two recent frameworks respond to demands that earlier methods could not meet. The Bayesian Panel Data Framework (2003) treats unit-specific heterogeneity not as a nuisance parameter to be eliminated but as a feature to be learned through hierarchical priors. Unlike the frequentist fixed/random effects dichotomy, which forces a binary choice between robustness to correlated effects and efficiency, Bayesian shrinkage estimators smoothly pool information across units. This approach is particularly valuable when the number of time periods is very small or when the researcher wants to model complex random coefficient structures. It coexists with the design-based framework: Bayesian methods are sometimes used within a design (e.g., Bayesian synthetic control) but remain a minority tradition because of computational demands and the dominance of frequentist inference in economics.
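The pooling behavior of a hierarchical prior is easiest to see in the normal-normal case, where the posterior mean of each unit effect is a weighted average of that unit's own mean and the overall mean. The sketch below is a deliberately simplified illustration: it treats the variance components `tau` and `sigma` as known and plugs in the grand mean for the prior mean, which is an empirical-Bayes shortcut rather than a full posterior computation. All numerical values are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n_units, n_periods = 1000, 3
tau, sigma = 0.5, 1.0  # sd of unit effects and of noise, assumed known here

alpha = rng.normal(0.0, tau, size=n_units)
y = alpha[:, None] + rng.normal(0.0, sigma, size=(n_units, n_periods))

unit_means = y.mean(axis=1)   # no pooling: each unit on its own
grand_mean = float(y.mean())  # complete pooling: one value for all units

# Posterior-mean weight on a unit's own data under a N(mu, tau^2) prior:
# it rises toward 1 as n_periods grows or as tau grows relative to sigma.
weight = tau ** 2 / (tau ** 2 + sigma ** 2 / n_periods)
alpha_shrunk = grand_mean + weight * (unit_means - grand_mean)

mse_unit_means = float(np.mean((unit_means - alpha) ** 2))
mse_shrunk = float(np.mean((alpha_shrunk - alpha) ** 2))

print(f"weight on own data: {weight:.3f}")
print(f"MSE, unit means: {mse_unit_means:.3f}   MSE, shrunk: {mse_shrunk:.3f}")
```

With only three periods per unit, the weight on each unit's own data is well below one, and the shrunk estimates should track the true effects more closely on average than the raw unit means do.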
The High-Dimensional Panel Data Framework (2006) tackles a problem that earlier frameworks largely ignored: cross-sectional dependence. When many units are observed over time, they may be subject to common shocks (e.g., macroeconomic fluctuations, global trends) that violate the assumption that errors are independent across units. Conventional fixed effects or dynamic estimators can yield misleading inference in this setting, and biased estimates when the common shocks are also correlated with the regressors. Solutions such as common correlated effects (Pesaran) and interactive fixed effects (Bai) model the dependence through latent factors. This framework complements earlier ones: for macro panels (countries, industries) it is now standard, while for micro panels with many randomly sampled units it is less critical. It also connects to time-series econometrics and factor models.
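The common correlated effects idea can be sketched in a few lines: cross-sectional averages of the outcome and the regressor act as observable proxies for the latent factor, and projecting them out unit by unit removes the factor even when loadings differ across units. The example below is a minimal pooled version under assumed settings (200 units, 30 periods, one latent factor, a true slope of 1, homogeneous slopes); it is an illustration of the mechanism, not a full implementation with inference.

```python
import numpy as np

rng = np.random.default_rng(5)
n_units, n_periods, beta = 200, 30, 1.0  # illustrative values

# One latent common factor with unit-specific loadings in both x and y.
factor = rng.normal(size=n_periods)
load_y = rng.normal(1.0, 0.5, size=n_units)
load_x = rng.normal(1.0, 0.5, size=n_units)
x = load_x[:, None] * factor + rng.normal(size=(n_units, n_periods))
y = beta * x + load_y[:, None] * factor + rng.normal(size=(n_units, n_periods))

# Unit fixed effects remove only time-invariant heterogeneity, so the
# factor stays in the error and remains correlated with x.
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
beta_fe = float((x_dm * y_dm).sum() / (x_dm ** 2).sum())

# Pooled CCE: project each unit's series off an intercept and the
# cross-sectional averages of y and x, then pool the residualized data.
H = np.column_stack([np.ones(n_periods), y.mean(axis=0), x.mean(axis=0)])
M = np.eye(n_periods) - H @ np.linalg.pinv(H)  # annihilator of span(H)
x_res, y_res = x @ M, y @ M                    # M is symmetric
beta_cce = float((x_res * y_res).sum() / (x_res ** 2).sum())

print(f"unit fixed effects: {beta_fe:.3f}   pooled CCE: {beta_cce:.3f}   truth: {beta}")
```

Because the projection is applied separately to each unit's time series, every unit effectively receives its own coefficients on the averages, which is what allows heterogeneous loadings to be absorbed.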
Today, applied researchers choose among these frameworks based on the structure of their data and the nature of their research question. There is broad agreement that unobserved heterogeneity must be addressed, that dynamics require special care, and that cross-sectional dependence can be dangerous. But disagreements persist. The design-based camp argues that even the best model cannot substitute for a plausible experiment; the modeling camp cautions that design-based methods often rely on strong parallel trends assumptions that are themselves untestable. The Bayesian and frequentist traditions maintain their long-standing divergence over the role of prior information and the interpretation of uncertainty. Within dynamic panels, the choice between system GMM and long-difference estimators remains unsettled. And the high-dimensional framework, while accepted for macro data, has not displaced the standard fixed effects estimator in microeconomic applications where time periods are few and units are many. The result is a pluralistic field: researchers select methods not because one is uniformly better, but because each framework illuminates a different angle of the data, and the best practice is to check robustness across several.