Machine learning econometrics is the subfield that adapts statistical learning methods for economic data analysis, balancing predictive flexibility with inferential rigor. Its history is shorter than that of traditional econometric schools but already features distinct rival frameworks. Early work in the 1990s applied neural networks and tree-based methods primarily for prediction, but faced criticism for lacking formal inference guarantees. The field crystallized around three major schools that each propose different answers to how machine learning should be integrated into econometric practice.
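The first major school, High-Dimensional Econometrics, emerged in the late 2000s and early 2010s. Associated with Belloni, Chernozhukov, Hansen, and others, this school developed penalized regression methods, most prominently LASSO and its variants, for selecting among many covariates. Ridge regression belongs to the same penalized family, but it shrinks coefficients without setting any exactly to zero, so it regularizes rather than selects. A key contribution was post-selection inference, including the post-double-selection procedure, which enables valid hypothesis testing after data-driven model selection. This school treats high dimensionality as the core challenge and aims for estimation and inference procedures that remain valid uniformly over a broad class of data-generating processes, typically under sparsity assumptions.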
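The post-double-selection idea can be sketched in a few lines. The example below is a minimal illustration on simulated data, not the authors' implementation: the helper names (`lasso_cd`, `post_double_selection`), the simulated design, and the hand-picked penalty `lam=0.1` are all assumptions made for illustration, and the literature uses a data-driven penalty choice instead. LASSO selects controls that predict the outcome, then controls that predict the treatment, and the treatment effect is estimated by ordinary least squares on the union of both sets.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                      # remove coordinate j's contribution
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft-threshold
            r -= X[:, j] * b[j]                      # add the updated contribution back
    return b

def post_double_selection(y, d, X, lam):
    """Select controls for y and for d with LASSO, then run OLS on the union."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize controls
    sel_y = np.flatnonzero(lasso_cd(Xs, y - y.mean(), lam))
    sel_d = np.flatnonzero(lasso_cd(Xs, d - d.mean(), lam))
    keep = np.union1d(sel_y, sel_d).astype(int)
    Z = np.column_stack([np.ones(len(y)), d, X[:, keep]])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1], keep                             # coefficient on d, selected controls

# Simulated sparse design (illustrative assumption): the true effect of d is 1.0.
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.standard_normal((n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)
y = 1.0 * d + 2.0 * X[:, 0] + X[:, 2] + rng.standard_normal(n)

theta_hat, selected = post_double_selection(y, d, X, lam=0.1)
print("estimated effect:", theta_hat)
print("selected controls:", selected)
```

Selecting on the treatment equation as well as the outcome equation is the design choice that matters here: a control that strongly predicts the treatment but only weakly predicts the outcome can be missed by a single selection step, and omitting it would bias the estimated effect.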
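A second school, Causal Machine Learning, took shape in the mid-2010s, led by Athey, Imbens, Chernozhukov, and their coauthors. It integrates machine learning with causal inference, focusing on estimating average and heterogeneous treatment effects in high-dimensional or otherwise complex settings. Key innovations include causal forests and double/debiased machine learning. The latter combines cross-fitting, in which nuisance functions are estimated on one part of the sample and evaluated on another, with Neyman orthogonality, which means using moment conditions that are locally insensitive to errors in those nuisance estimates. Together these allow flexible ML methods to be used for the nuisance components while preserving valid inference on the target parameter. This school prioritizes causal identification over pure prediction.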
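The cross-fitting and orthogonalization steps can be illustrated for a partially linear model, y = theta * d + g(X) + e. The sketch below is a simplified example under stated assumptions, not a reference implementation: the function names (`ridge_fit_predict`, `dml_plr`), the simulated data, and the use of ridge regression on quadratic features as the nuisance learner are all illustrative choices, standing in for whatever flexible learner an application would use.

```python
import numpy as np

def ridge_fit_predict(X_tr, y_tr, X_te, alpha=1.0):
    """Stand-in nuisance learner: ridge regression on linear and squared features."""
    def feats(X):
        return np.column_stack([np.ones(len(X)), X, X ** 2])
    F_tr, F_te = feats(X_tr), feats(X_te)
    A = F_tr.T @ F_tr + alpha * np.eye(F_tr.shape[1])
    beta = np.linalg.solve(A, F_tr.T @ y_tr)
    return F_te @ beta

def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimator for the partially linear model."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisances are fit on the other folds and evaluated on the held-out fold.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train], d[train], X[test_idx])
    # Residual-on-residual regression solves a Neyman-orthogonal moment condition.
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    se = np.sqrt(np.mean(d_res ** 2 * eps ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta, se

# Simulated data (illustrative assumption): the true effect of d is 0.5.
rng = np.random.default_rng(1)
n = 2000
X = rng.standard_normal((n, 5))
d = X[:, 0] ** 2 - 1.0 + 0.5 * X[:, 1] + rng.standard_normal(n)
y = 0.5 * d + X[:, 0] + X[:, 2] ** 2 + rng.standard_normal(n)

theta_hat, se_hat = dml_plr(y, d, X)
print("estimated effect:", theta_hat, "standard error:", se_hat)
```

Because both the outcome and the treatment are residualized, small errors in either nuisance fit enter the estimate only through their product, which is what lets slower-converging learners be used without invalidating the standard error.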
More recently, Deep Learning Econometrics has emerged as a third school, applying neural networks, recurrent architectures, and transformers to economic time series, structural estimation, and causal modeling. Bayesian machine learning methods, including Gaussian processes and Bayesian additive regression trees, form a further distinct approach that connects to Bayesian econometrics. These schools continue to evolve, with ongoing debates about interpretability, computational feasibility, and the trade-off between flexibility and inferential credibility. Machine learning econometrics thus represents a shift from tightly specified parametric models toward algorithmic learning, while maintaining the discipline's commitment to rigorous empirical analysis.