A radiologist looks at a CT scan and sees a lung nodule. But how does that visual finding become a structured report, a prediction of malignancy, a recommendation for follow-up, and eventually a data point that improves care for the next patient? That chain—from pixel to decision to learning—is the core problem of imaging informatics. The subfield asks a deceptively simple question: how should medical image data be structured, interpreted, and fed back into clinical practice? Over the past four decades, four major frameworks have shaped the answer, each building on, reacting to, or coexisting with the others.
The first systematic attempt to turn image findings into actionable knowledge was rule-based decision support. In the 1980s, researchers began encoding radiological expertise as explicit if–then rules. A system might contain rules such as “if a mammographic mass has spiculated margins and is high density, then suggest biopsy.” These systems were designed to assist radiologists by reducing oversight errors and standardizing interpretations. The most visible application was computer-aided detection (CAD) for mammography, which used hand-crafted features and decision thresholds to mark suspicious regions.
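The if-then structure described above can be sketched in a few lines. This is a minimal illustration, not a reconstruction of any historical CAD system: the `Finding` and `Rule` types, the descriptor values, and the `evaluate` function are hypothetical names introduced here, and the single rule is the mammography example from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    # Hypothetical, simplified descriptors for a mammographic mass.
    margin: str   # e.g. "spiculated", "circumscribed"
    density: str  # e.g. "high", "equal", "low"


@dataclass
class Rule:
    name: str
    condition: Callable[[Finding], bool]
    recommendation: str


# The one rule quoted in the text; a real rule base would hold many more.
RULES: Tuple[Rule, ...] = (
    Rule(
        name="spiculated_high_density",
        condition=lambda f: f.margin == "spiculated" and f.density == "high",
        recommendation="suggest biopsy",
    ),
)


def evaluate(finding: Finding, rules: Tuple[Rule, ...] = RULES) -> List[str]:
    """Return 'rule name: recommendation' for every rule that fires."""
    return [f"{r.name}: {r.recommendation}" for r in rules if r.condition(finding)]


print(evaluate(Finding(margin="spiculated", density="high")))
# ['spiculated_high_density: suggest biopsy']
print(evaluate(Finding(margin="circumscribed", density="low")))
# []
```

Because each recommendation carries the name of the rule that produced it, the reasoning stays inspectable, which is the property that made these systems attractive to clinicians.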
Rule-based systems had a clear strength: their reasoning was transparent. A clinician could see exactly why a system flagged a finding. But they were also brittle. Rules that worked in one institution often failed in another because imaging protocols, patient populations, and equipment varied. Maintaining and updating the rule sets was labor-intensive, and the systems could not easily adapt to new knowledge or rare patterns. By the late 1990s, these limitations of hand-crafted logic were pushing the field toward a different approach.
Rather than prescribing rules, the data-driven framework lets algorithms learn patterns directly from large collections of images and their associated outcomes. It began in the 1990s with statistical classifiers and expanded rapidly after 2012 with deep learning, shifting the focus from expert knowledge to empirical correlation. A convolutional neural network trained on thousands of labeled chest X-rays can detect pneumonia with accuracy that, in some published evaluations, rivals that of radiologists, without ever being told what pneumonia “looks like” in anatomical terms.
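The contrast with hand-written rules can be illustrated with a deliberately tiny stand-in. The sketch below assumes a logistic regression trained by stochastic gradient descent on synthetic two-number feature vectors; it is not a convolutional network and uses no real image data, and every name in it (`make_example`, `train`, `predict_proba`) is hypothetical. The point is only that the decision boundary is learned from labeled examples rather than written down by an expert.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def make_example(label):
    # Synthetic features, NOT derived from images: the two classes are
    # drawn from Gaussians centered at (2, 2) and (-2, -2).
    center = 2.0 if label == 1 else -2.0
    return [random.gauss(center, 1.0), random.gauss(center, 1.0)], label


data = [make_example(i % 2) for i in range(200)]


def predict_proba(w, b, x):
    """Logistic model: probability that x belongs to class 1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=50, lr=0.1):
    """Fit weights by stochastic gradient descent on the log loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in examples:
            err = predict_proba(w, b, x) - y  # gradient of log loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


w, b = train(data)
accuracy = sum(
    (predict_proba(w, b, x) >= 0.5) == (y == 1) for x, y in data
) / len(data)
print(f"weights={w}, bias={b:.2f}, training accuracy={accuracy:.2f}")
```

Nothing in the code states what separates the two classes; the weights encode whatever correlation the training data happens to contain, which is both the strength of the approach and the source of its sensitivity to biased data.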
Data-driven prediction largely replaced rule-based decision support for tasks like lesion detection, segmentation, and triage. But it did not eliminate the need for structure. Training requires massive, well-labeled datasets, and the models themselves are often black boxes, making it difficult to trust or debug their decisions. Moreover, data-driven methods are only as good as the data they are trained on; biases in the training set can lead to systematic errors. These limitations kept the door open for a third framework that could provide the semantic scaffolding that pure pattern recognition lacks.
At roughly the same time that data-driven methods were gaining traction, a parallel effort emerged to standardize the language of radiology. Knowledge representation frameworks—ontologies, controlled vocabularies, and structured reporting templates—aim to capture the meaning of image findings in a machine-readable form. RadLex, the Radiology Lexicon, provides a unified terminology for radiology reports. The Annotation and Image Markup (AIM) model allows image observations to be linked to ontology terms. These resources enable different systems to exchange and reason about image data without ambiguity.
Knowledge representation is not a competitor to data-driven prediction; it is an infrastructure that both rule-based and data-driven systems rely on. A deep learning model that outputs “nodule” is more useful if that output is mapped to a standard term like RadLex RID3875. A rule-based system that triggers an alert for “spiculated mass” depends on a consistent definition of that concept. Knowledge representation also supports interoperability across institutions, which is essential for the large-scale datasets that data-driven methods require. In this sense, it acts as a bridge between the raw pixel world and the clinical decision world.
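Mapping a model's free-text label to a standard term can be sketched as a simple lookup. This is an assumption-laden illustration: the table holds only the one code quoted above (RID3875 for "nodule"), whereas a real deployment would load the full lexicon from its official distribution, and the function name `to_coded_finding` and the output field names are hypothetical.

```python
from typing import Dict, Optional

# Only the code quoted in the text is listed; no other RadLex IDs are assumed.
LABEL_TO_RADLEX: Dict[str, str] = {"nodule": "RID3875"}


def to_coded_finding(model_label: str, score: float) -> dict:
    """Normalize a model's label and attach a standard term ID if one is known."""
    key = model_label.strip().lower()
    code: Optional[str] = LABEL_TO_RADLEX.get(key)
    return {
        "label": key,
        "radlex_id": code,           # None when the label has no mapping
        "score": score,
        "mapped": code is not None,  # unmapped outputs can be routed for review
    }


print(to_coded_finding(" Nodule ", 0.91))
print(to_coded_finding("opacity", 0.40))
```

Returning an explicit `mapped` flag, rather than silently dropping unknown labels, keeps gaps in the vocabulary visible to whoever maintains the mapping.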
The learning health system (LHS) framework, articulated prominently by the Institute of Medicine in the 2000s, takes a system-level view. It proposes that every clinical encounter should generate data that is captured, analyzed, and fed back to improve future care. In imaging informatics, this means linking image interpretations, patient outcomes, and decision support into a continuous cycle. A learning health system might use data-driven prediction to recommend follow-up intervals, knowledge representation to ensure that findings are coded consistently, and then track whether those recommendations actually improve survival or reduce unnecessary biopsies.
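The feedback step of that cycle, checking recommendations against later outcomes, can be sketched as follows. The `Encounter` record, its field values, and the `benign_biopsy_rate` metric are hypothetical simplifications introduced for illustration, and the example records are invented, not patient data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Encounter:
    finding_code: str              # coded finding, e.g. a RadLex ID
    recommendation: str            # e.g. "biopsy" or "follow-up"
    outcome: Optional[str] = None  # "benign" / "malignant", filled in later


def benign_biopsy_rate(encounters: Iterable[Encounter]) -> Dict[str, float]:
    """Per finding code: share of biopsies with a known outcome that were benign."""
    totals: Dict[str, int] = defaultdict(int)
    benign: Dict[str, int] = defaultdict(int)
    for e in encounters:
        if e.recommendation == "biopsy" and e.outcome is not None:
            totals[e.finding_code] += 1
            if e.outcome == "benign":
                benign[e.finding_code] += 1
    return {code: benign[code] / totals[code] for code in totals}


# Invented example records for the sketch.
log = [
    Encounter("RID3875", "biopsy", "benign"),
    Encounter("RID3875", "biopsy", "benign"),
    Encounter("RID3875", "biopsy", "malignant"),
    Encounter("RID3875", "biopsy"),        # outcome not yet known: skipped
    Encounter("RID3875", "follow-up"),     # not a biopsy: skipped
]
rates = benign_biopsy_rate(log)
print(rates)
```

A rate like this only means something if findings are coded consistently across encounters, which is where the knowledge representation layer enters the loop.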
The LHS framework does not replace the earlier approaches; it absorbs and integrates them. Rule-based and data-driven tools become components within a larger feedback loop. Knowledge representation provides the semantic glue that allows data from different sources to be combined. The LHS also introduces a new emphasis on governance, privacy, and the ethical use of data, because the system learns from every patient’s information. It transforms imaging informatics from a collection of isolated algorithms into an organizational capability.
Today, three of the four frameworks remain central to the field. Rule-based decision support has largely been superseded for primary interpretation, though it persists in some regulator-cleared CAD systems and in clinical decision support rules embedded in electronic health records. Data-driven prediction, knowledge representation, and the learning health system coexist, each with a distinct role.
Data-driven prediction is the engine of innovation: it powers the most impressive demonstrations of AI in radiology, from fracture detection on X-rays to tumor segmentation on MRI. Knowledge representation is the infrastructure that makes those predictions actionable and shareable: without standard terminologies, a model trained at one hospital cannot communicate its findings to another hospital’s system. The learning health system is the organizational vision that ties everything together: it asks not just “can we predict?” but “are we improving care?”
These frameworks agree on several fundamentals: that image data must be structured to be useful, that feedback from outcomes is essential, and that no single method is sufficient. But they also disagree on priorities. Proponents of data-driven methods often argue that with enough data, explicit knowledge representation becomes unnecessary—deep learning can learn its own features and even its own “concepts.” Knowledge representation advocates counter that without ontologies, models remain opaque and cannot be trusted in high-stakes decisions. The learning health system perspective insists that both are needed, but that the real challenge is organizational: building the infrastructure and culture to close the loop.
Imaging informatics has not settled on a single framework because the problem it addresses is multifaceted. Rule-based systems taught the field the value of transparency and the cost of brittleness. Data-driven methods showed the power of learning from data but also the dangers of bias and opacity. Knowledge representation provided the semantic backbone that makes integration possible. The learning health system gave the field a purpose: not just to interpret images, but to improve health. The tension between these frameworks is not a sign of immaturity; it is the engine that drives the subfield forward, forcing researchers and clinicians to ask harder questions about what it means to turn a pixel into a better outcome.