Health policy evaluation has always been a contest over whose evidence counts. A government deciding whether to fund a new cancer drug, a health system choosing between two diabetes programs, or a legislature setting quality standards for hospitals—each decision requires a framework that defines what counts as a valid outcome, what methods produce trustworthy evidence, and whose perspective should carry weight. Over the past six decades, five major frameworks have shaped how evaluators answer those questions. Each emerged in response to the limitations of its predecessors, and each remains active today, creating a pluralistic landscape where the choice of framework is itself a policy decision.
The first systematic framework for health policy evaluation was built on the logic of economics. In the 1960s and 1970s, as health care costs rose and governments began to take on greater responsibility for funding care, policymakers needed a way to compare the value of very different interventions—a heart transplant versus a vaccination program, for instance. The Economic Evaluation Paradigm answered that need by importing tools from welfare economics: cost-effectiveness analysis, cost-utility analysis, and cost-benefit analysis. Its central metric became the quality-adjusted life year (QALY), which combined length of life with a preference-based measure of health-related quality of life into a single number. By calculating the cost per QALY gained, evaluators could rank interventions on a common scale and identify which ones offered the greatest health return per dollar spent.
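The arithmetic behind a cost-per-QALY comparison can be sketched in a few lines of Python. This is a minimal illustration, not a full economic model: the names `HealthState`, `qalys`, and `cost_per_qaly_gained` are hypothetical, the costs and utility weights are invented for the example, and the sketch assumes no discounting of future costs or health gains, which real analyses normally apply.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """A period of life spent in one health state (hypothetical structure)."""
    years: float    # time spent in this state
    utility: float  # preference weight: 0.0 = dead, 1.0 = full health


def qalys(states):
    """Sum years lived, each weighted by its quality-of-life utility."""
    return sum(s.years * s.utility for s in states)


def cost_per_qaly_gained(cost_new, qalys_new, cost_old, qalys_old):
    """Incremental cost divided by incremental QALYs (often called the ICER)."""
    qalys_gained = qalys_new - qalys_old
    if qalys_gained <= 0:
        # The ratio is only meaningful when the new option adds health.
        raise ValueError("new intervention yields no QALY gain over comparator")
    return (cost_new - cost_old) / qalys_gained


# Illustrative, made-up numbers for one patient under each option.
standard_care = [HealthState(years=4, utility=0.5)]   # 2.0 QALYs
new_treatment = [HealthState(years=6, utility=0.75)]  # 4.5 QALYs

ratio = cost_per_qaly_gained(
    cost_new=70_000, qalys_new=qalys(new_treatment),
    cost_old=20_000, qalys_old=qalys(standard_care),
)
print(f"Cost per QALY gained: {ratio:,.0f}")  # prints "Cost per QALY gained: 20,000"
```

Here the new treatment costs 50,000 more and yields 2.5 additional QALYs, so each QALY gained costs 20,000. Computing the same ratio for interventions in unrelated disease areas is what allows them to be placed on a single scale.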
This framework was a genuine breakthrough. For the first time, health policy evaluation had a transparent, quantitative method that could be applied across diseases, treatments, and populations. Its influence spread rapidly through academic health economics and into government agencies, particularly in the United Kingdom, Canada, and Australia. Yet the Economic Evaluation Paradigm also carried a built-in limitation: it defined value almost entirely in terms of aggregate health gain, measured by a single metric. It had little room for how patients themselves experienced care, for equity across population groups, or for outcomes that could not be captured in a QALY. Those gaps would drive the next wave of frameworks.
Health Technology Assessment (HTA) emerged in the 1980s as a direct expansion of the economic paradigm. Where the Economic Evaluation Paradigm had focused narrowly on costs and health outcomes, HTA insisted that a full evaluation must also consider clinical effectiveness, safety, ethical implications, social consequences, and organizational feasibility. A new drug might be cost-effective by QALY standards yet raise serious equity concerns if its total cost was too high for public systems to offer it fairly to everyone who could benefit; HTA made those dimensions visible.
HTA did not replace economic evaluation so much as absorb it into a broader framework. The cost-per-QALY analysis remained a core component, but it was now embedded in a multi-criteria decision process that required evaluators to weigh evidence from clinical trials, epidemiological studies, patient registries, and sometimes qualitative research. HTA also became deeply institutionalized. From the late 1980s onward, national agencies such as the UK's National Institute for Health and Care Excellence (NICE), Germany's Institute for Quality and Efficiency in Health Care (IQWiG), and the Canadian Agency for Drugs and Technologies in Health (CADTH) turned HTA into the official language of coverage and reimbursement decisions. By the 1990s, HTA had become the dominant framework for health policy evaluation in many high-income countries, and it remains a major force today, especially in regulatory and reimbursement contexts.
Just as HTA was becoming institutionalized, a different movement gathered force under the banner of Evidence-Based Policy (EBP). Drawing on the principles of evidence-based medicine, EBP argued that policy decisions should be grounded in the highest-quality research evidence, defined by a strict hierarchy: systematic reviews of randomized controlled trials (RCTs) at the top, followed by individual RCTs, then observational studies, and finally expert opinion. For EBP advocates, the problem with HTA was that it was too permissive. HTA's willingness to incorporate diverse evidence types—including modeling studies, expert panels, and observational data—meant that weaker evidence could be used to justify decisions that an RCT-based standard would not support.
This created a lasting tension. HTA practitioners countered that policy decisions could not wait for the perfect RCT; they had to be made with the best available evidence, even if that evidence was imperfect. Moreover, many policy-relevant questions—such as the long-term effects of a population-level screening program—could not be answered by RCTs at all. The debate between EBP and HTA was not a simple disagreement about methods; it reflected deeper differences about the role of certainty in policy. EBP demanded a high bar for evidence before adopting an intervention, while HTA accepted a wider range of evidence in order to make timely decisions. Both frameworks remain active, and their coexistence has pushed evaluators to become more explicit about the quality and limitations of the evidence they use.
By the early 2000s, a different kind of challenge was emerging. Both the Economic Evaluation Paradigm and HTA had defined outcomes from the perspective of the health system or the payer. Even when they considered patient experience, it was typically captured through standardized instruments like the EQ-5D or the SF-36, which might not reflect what patients themselves considered most important. Patient-Centered Outcomes Research (PCOR) turned this logic upside down. It argued that the outcomes that matter most are the ones that patients identify, not the ones that researchers or policymakers assume. This meant engaging patients directly in setting research priorities, selecting outcome measures, and interpreting results.
PCOR's rise was fueled by a combination of grassroots patient advocacy, academic critique, and institutional support—most notably the creation of the Patient-Centered Outcomes Research Institute (PCORI) in the United States in 2010. PCORI explicitly funded comparative effectiveness research that involved patients as partners, not just as subjects. The framework also challenged the dominance of the QALY. If a treatment improved a patient's ability to work or reduced anxiety about a chronic condition, those benefits might not show up in a QALY calculation, but they were real and meaningful. PCOR thus broadened the scope of evaluation in a different direction than HTA had: HTA had added more dimensions of evidence (ethical, social, organizational), while PCOR changed who gets to define what counts as a relevant dimension in the first place.
The most recent framework, the Learning Health System (LHS), represents a structural transformation of the evaluation enterprise itself. Rather than treating evaluation as a separate activity that happens before a policy is adopted or after it is implemented, LHS envisions a system in which every clinical encounter generates data that can be analyzed to improve care in real time. Evaluation becomes continuous, embedded, and iterative. The LHS framework draws on the infrastructure of electronic health records, data analytics, and pragmatic clinical trials to create a feedback loop: data from routine care is used to generate evidence, which is then applied to change practice, and the effects of those changes are measured and fed back into the system.
LHS does not replace the earlier frameworks so much as integrate them into a new operational model. Economic evaluation, HTA, and PCOR all contribute to the learning cycle. A health system using an LHS approach might conduct a rapid cost-effectiveness analysis using its own patient data, incorporate patient-reported outcomes collected through a mobile app, and adjust clinical guidelines based on the results—all within months rather than years. The framework also inherits the tensions of its predecessors. The evidence standards of EBP can be difficult to maintain in a system that prioritizes speed and real-world data, and the patient engagement principles of PCOR require sustained investment in infrastructure and trust-building. LHS remains the youngest of the five frameworks, and its full implications for policy evaluation are still being worked out.
Today, all five frameworks are active, and no single one has achieved overall dominance. Their coexistence creates a complex division of labor. The Economic Evaluation Paradigm remains the default tool for cost-effectiveness analyses in academic research and for many reimbursement decisions, especially when a single metric is needed for comparison across interventions. HTA is the institutional backbone of coverage and formulary decisions in countries with centralized health technology agencies; it is best suited for comprehensive, multi-dimensional assessments of new technologies. EBP continues to shape the design of systematic reviews and clinical guidelines, particularly in contexts where the strength of evidence is the primary concern. PCOR has become the leading framework for research that aims to inform patient decision-making and for studies funded by patient-centered organizations. LHS is increasingly influential in large integrated health systems and in discussions about how to use real-world data for continuous improvement.
Despite their differences, the frameworks share a growing consensus on several points. All agree that evaluation must be transparent about its methods and assumptions. All recognize that patient perspectives matter, even if they disagree about how to incorporate them. And all acknowledge that no single type of evidence is sufficient for policy decisions—economic, clinical, and experiential data all have a role. The major disagreements center on the hierarchy of evidence (EBP's insistence on RCT primacy versus HTA's pragmatic pluralism), the role of the QALY (still central for economic evaluation but contested by PCOR), and the speed of evaluation (LHS's push for real-time learning versus the slower, more cautious pace of traditional HTA and EBP). These are not signs of a fragmented field; they are the productive tensions of a mature discipline that has learned to ask harder questions about whose evidence counts and why.