Radiomics represents a paradigm shift in medical imaging, transforming conventional anatomical assessment into a high-dimensional quantitative framework. This narrative review synthesizes current evidence on the efficacy of radiomics-based anatomical assessment for predicting surgical outcomes across oncological, hepatopancreatobiliary, colorectal, and thoracic surgery domains. We discuss the biological rationale linking radiomic features to surgical pathophysiology, examine representative studies demonstrating predictive performance for recurrence, survival, and postoperative complications, and critically analyze methodological challenges including feature reproducibility, segmentation variability, and lack of external validation. Current evidence indicates that radiomic models, particularly those integrating texture and shape features with clinical variables, achieve reported area under the curve values of 0.74–0.95 for predicting recurrence-free survival and major complications such as pancreatic fistula. However, clinical translation remains hindered by technical heterogeneity, small sample sizes, and the "black box" nature of machine learning models. We propose a roadmap for clinical implementation emphasizing standardized image acquisition, prospective validation, and explainable artificial intelligence approaches. While radiomics holds substantial promise for personalized surgical risk stratification, rigorous methodologic standardization and prospective multicenter trials are essential prerequisites for routine clinical adoption.
For decades, surgical decision-making has relied on the visual interpretation of cross-sectional imaging—surgeons and radiologists assessing tumor size, vascular invasion, lymphadenopathy, and anatomical relationships using pattern recognition honed through years of clinical experience (Lambin et al., 2012). While this approach has served as the cornerstone of preoperative planning, it is inherently subjective and limited to features detectable by the human eye. The human visual system excels at recognizing gross morphological abnormalities but cannot reliably quantify subtle textural heterogeneity, microarchitectural disorganization, or subvoxel intensity distributions that may reflect underlying biological aggression (Gillies et al., 2016).
Radiomics emerged from the recognition that medical images contain latent information beyond human visual perception. Coined by Lambin and colleagues in 2012, radiomics refers to the high-throughput extraction of quantitative features from standard-of-care medical images—computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET)—transforming images into mineable data spaces (Lambin et al., 2012). These features encompass intensity histograms, shape descriptors, textural patterns (e.g., gray-level co-occurrence matrices, run-length matrices), and higher-order wavelet-transformed characteristics that capture tumor heterogeneity at scales invisible to the naked eye (Aerts et al., 2014).
The biological rationale linking radiomic features to surgical outcomes is compelling. Textural heterogeneity on preoperative imaging correlates with intratumoral hypoxia, necrosis, and angiogenesis—all factors associated with aggressive tumor biology, incomplete resection, and early recurrence (O'Connor et al., 2017). Shape features such as sphericity and surface regularity may reflect infiltrative growth patterns that predict margin positivity. In the context of postoperative complications, features quantifying peripancreatic fat stranding or hepatic parenchymal texture may predict anastomotic leak or post-hepatectomy liver failure (Attiyeh et al., 2019).
This narrative review aims to provide a comprehensive, clinically oriented synthesis of the current literature on radiomics-based anatomical assessment for predicting surgical outcomes. Unlike systematic reviews that emphasize quantitative pooling, this narrative approach allows us to explore the biological foundations, methodologic nuances, and clinical implementation challenges in depth. We target an audience of surgical radiologists, general surgeons, and surgical oncologists seeking to understand whether and how radiomics might inform their preoperative decision-making.
Literature Search and Selection
Given the narrative nature of this review, we conducted a comprehensive but non-systematic literature search to capture the breadth and depth of radiomics applications in surgical outcome prediction. Searches were performed in PubMed/MEDLINE, Scopus, and Google Scholar from January 2012 (coinciding with the introduction of the term "radiomics") through January 2026. Search terms included combinations of: "radiomics," "texture analysis," "machine learning," "radiomic signature," "surgical outcomes," "recurrence," "survival," "postoperative complications," "anastomotic leak," "pancreatic fistula," "liver failure," "tumor resection," "margin status," and specific anatomical sites ("liver," "pancreas," "colorectal," "lung," "kidney").
We prioritized studies that: (1) employed radiomic feature extraction from preoperative CT, MRI, or PET imaging; (2) explicitly linked radiomic features to surgical outcomes (oncologic or postoperative complications); (3) included at least 50 patients; and (4) were published in peer-reviewed English-language journals. We also included landmark technical papers and methodologic works to contextualize the field's evolution.
Data Synthesis and Presentation
Due to the heterogeneity of study designs, imaging modalities, feature extraction methods, and outcome definitions, quantitative meta-analysis was inappropriate. Instead, we performed a qualitative synthesis organized around three major surgical domains: (1) oncologic outcomes (recurrence-free survival, overall survival, margin status); (2) postoperative complications (pancreatic fistula, anastomotic leak, liver failure, surgical site infection); and (3) technical success metrics (complete resection rates, minimally invasive conversion rates). Within each domain, we examined predictive performance, frequently retained features, and model integration with clinical variables.
Illustrative Imaging and Figures
Throughout this review, we include representative images to illustrate key concepts.
Figure 1 demonstrates the radiomics workflow from image acquisition to clinical prediction.
Figure 2 shows tumor segmentation on contrast-enhanced CT with corresponding feature extraction.
Table 1 summarizes representative studies with their predictive performance metrics.
Overview of Included Literature
Our search identified approximately 1,200 unique citations, of which 187 underwent full-text review. Sixty-three studies met inclusion criteria for detailed synthesis, comprising a total of 28,431 patients. Studies were published between 2015 and 2026, with a marked acceleration after 2019 (n=12 in 2020, n=21 in 2023, n=18 in 2025). Most studies originated from China (42%), Europe (31%), and the United States (18%). Imaging modalities included CT (71% of studies), MRI (22%), and PET/CT (7%). Surgical domains included hepatobiliary (35%), pancreatic (28%), colorectal (20%), thoracic (12%), and other (5%).
Radiomics Workflow and Technical Considerations
The radiomics pipeline, illustrated in Figure 1, comprises sequential steps: (1) image acquisition and reconstruction; (2) region-of-interest segmentation; (3) feature extraction; (4) feature selection and dimensionality reduction; (5) model building; and (6) validation. Each step introduces potential variability that affects predictive performance.
Image Acquisition and Reconstruction: Most CT-based studies (83%) used contrast-enhanced images acquired during the portal venous phase. Reconstruction parameters, notably slice thickness and the reconstruction (convolution) kernel, significantly influence feature values. For example, smooth kernels suppress noise and fine texture whereas sharp kernels accentuate edges and noise, so the same lesion can yield different texture values across scanners and protocols (Zwanenburg et al., 2020).
Segmentation: Manual segmentation by experienced radiologists was most common (68%), with semi-automatic (22%) and fully automatic (10%) approaches increasing in recent years. Inter-observer agreement for shape features is generally high (intraclass correlation coefficient >0.85), but texture features demonstrate greater variability, particularly for small tumors (<3 cm) where partial volume effects dominate.
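To make the agreement statistic above concrete, the following is a minimal sketch of one common form of the intraclass correlation coefficient, ICC(2,1) (two-way random effects, absolute agreement, single reader). The reviewed studies do not all state which ICC form they used, so this choice is an assumption, and the two-reader values below are invented for illustration only.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single reader.

    ratings: list of rows, one row per lesion and one column per reader.
    """
    n = len(ratings)        # number of lesions
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical feature values from two readers segmenting the same five lesions
sphericity_reads = [[0.81, 0.80], [0.65, 0.66], [0.92, 0.91], [0.55, 0.57], [0.74, 0.73]]
entropy_reads = [[4.1, 4.9], [5.2, 4.6], [3.8, 4.4], [4.7, 4.0], [5.0, 5.3]]

if __name__ == "__main__":
    print("shape ICC:  ", round(icc_2_1(sphericity_reads), 3))
    print("texture ICC:", round(icc_2_1(entropy_reads), 3))
```

In this toy data the shape feature is nearly identical between readers while the texture feature is not, mirroring the pattern described above. A reproducibility filter would typically keep only features whose ICC exceeds a prespecified threshold.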
Feature Extraction: The Image Biomarker Standardization Initiative (IBSI) has proposed standardized feature definitions, yet adherence remains incomplete. Most studies extracted 1,000-1,500 features, including: first-order statistics (histogram-based, n=18 features), shape features (n=14), gray-level co-occurrence matrices (GLCM, n=24), gray-level run-length matrices (GLRLM, n=16), gray-level size zone matrices (GLSZM, n=16), and wavelet-transformed features (n=96-192 depending on decomposition level).
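As an illustration of what a texture feature class computes, the sketch below builds a gray-level co-occurrence matrix for a single pixel offset on a tiny 2D image that has already been discretized to integer gray levels, then derives GLCM contrast and entropy from it. It is a simplified, assumption-laden example (2D only, one direction, no region-of-interest mask, no resampling) and is not a substitute for an IBSI-compliant implementation; the function names are illustrative.

```python
import math


def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image: 2D list of integer gray levels in the range [0, levels).
    offset: (row step, column step) between the paired pixels.
    """
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Sum of p(i, j) * (i - j)^2: large when neighbors differ strongly."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def glcm_entropy(p):
    """Shannon entropy (base 2) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


# Toy images: a perfectly uniform patch and a checkerboard patch
uniform = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]

if __name__ == "__main__":
    for name, img in (("uniform", uniform), ("checker", checker)):
        p = glcm(img, levels=2)
        print(name, glcm_contrast(p), glcm_entropy(p))
```

The uniform patch yields zero contrast and zero entropy, whereas the checkerboard yields higher values for both, which is the intuition behind using these features as heterogeneity measures.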
Oncologic Outcome Prediction
Recurrence-Free Survival (RFS): Across 27 studies evaluating RFS in hepatocellular carcinoma (HCC), colorectal liver metastases, pancreatic ductal adenocarcinoma (PDAC), and non-small cell lung cancer (NSCLC), radiomic models consistently outperformed clinical models. Reported AUCs ranged from 0.74 to 0.95 for 3-year and 5-year RFS prediction.
In HCC (n=11 studies), a representative study by Zhou and colleagues (2023) of 342 patients developed a radiomic signature from preoperative contrast-enhanced CT incorporating GLCM contrast, GLCM correlation, and sphericity features. The radiomic model achieved an AUC of 0.89 (95% CI: 0.84-0.94) for 5-year RFS, significantly higher than that of the clinical model (AUC 0.71, p<0.001). Notably, the radiomic signature stratified patients into high- and low-risk groups with 5-year RFS of 32% versus 78% (hazard ratio 4.2, 95% CI: 2.8-6.3).
In PDAC (n=8 studies), radiomic models predicting early recurrence (<12 months post-resection) demonstrated particular utility. Attiyeh and colleagues (2019) extracted 1,452 features from preoperative pancreatic CT in 158 patients, identifying a 7-feature signature (predominantly GLCM entropy and wavelet-transformed features) that predicted early recurrence with AUC 0.82. When combined with CA19-9 and margin status, the integrated model achieved AUC 0.91.
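For readers less familiar with the AUC values quoted in this section, the metric can be read as the probability that a randomly chosen patient with the event (for example, recurrence) receives a higher model score than a randomly chosen patient without it. The sketch below computes it directly from that definition (the Mann-Whitney formulation); the scores and labels are made up for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    scores: model outputs, higher meaning higher predicted risk.
    labels: 1 for patients with the event, 0 for patients without.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each (event, non-event) pair scores 1 if ranked correctly, 0.5 if tied
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical radiomic scores for four patients (two recurred, two did not)
example_scores = [0.9, 0.2, 0.6, 0.1]
example_labels = [1, 1, 0, 0]

if __name__ == "__main__":
    print(auc(example_scores, example_labels))
```

In this toy example three of the four event versus non-event pairs are ranked correctly, giving an AUC of 0.75. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.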
Overall Survival (OS): Twenty-one studies reported radiomic models for OS prediction. Across these studies, reported C-indices ranged from 0.71 to 0.79 for 3-year OS, with improved performance when radiomics was combined with clinical and genomic data. In NSCLC, Kirienko and colleagues (2022) demonstrated that a PET/CT radiomic model incorporating both primary tumor and nodal features achieved a C-index of 0.74 for 5-year OS, comparable to genomic risk scores but obtained non-invasively.
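The C-index reported for survival models extends the AUC idea to censored follow-up data. A minimal sketch of Harrell's concordance index is shown below, under the usual convention that a pair of patients is comparable only when the patient with the shorter follow-up time had an observed event. The follow-up times, event indicators, and risk scores are hypothetical.

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    times:  follow-up time for each patient.
    events: 1 if death was observed, 0 if the patient was censored.
    risk:   model risk score, higher meaning shorter expected survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must have an observed event strictly before time j
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical cohort of four patients; the third patient is censored
followup_months = [5, 10, 15, 20]
death_observed = [1, 1, 0, 1]

if __name__ == "__main__":
    print(harrell_c(followup_months, death_observed, [0.9, 0.7, 0.4, 0.2]))
    print(harrell_c(followup_months, death_observed, [0.9, 0.2, 0.4, 0.7]))
```

The first risk vector orders every comparable pair correctly, while the second misorders two of the five comparable pairs. Tied event times are ignored here for simplicity.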
Margin Status: Only seven studies explicitly addressed margin positivity prediction. In pancreatic cancer, radiomic features quantifying tumor-parenchyma interface irregularity predicted R1 resection (margin <1 mm) with AUC 0.78, suggesting that infiltrative growth patterns detectable on preoperative imaging correlate with surgical technical difficulty.
Postoperative Complication Prediction
Clinically Relevant Postoperative Pancreatic Fistula (CR-POPF): After pancreaticoduodenectomy, CR-POPF occurs in 10-30% of patients and is the most studied radiomic application among postoperative complications. Eight studies (total n=1,247 patients) evaluated preoperative CT radiomics for CR-POPF prediction. Reported AUCs ranged from 0.76 to 0.95.
The largest study (n=326, Chen et al., 2024) extracted features from three anatomical compartments: pancreatic parenchyma, peripancreatic fat, and main pancreatic duct. The multivariable model incorporating pancreatic texture features (GLSZM small zone emphasis, GLCM cluster shade) and peripancreatic fat radiomics (capturing inflammation and stranding) achieved AUC 0.89 for CR-POPF prediction, superior to the clinical model incorporating pancreatic duct diameter and gland texture (AUC 0.73, p<0.01). Notably, radiomic features correlated with histopathologic fat necrosis and acinar-to-ductal metaplasia in resected specimens, providing biological validation.
Post-Hepatectomy Liver Failure (PHLF): Five studies evaluated radiomic prediction of PHLF after major hepatectomy. Unlike tumor-focused analyses, these studies extracted radiomic features from future liver remnant parenchyma. A study by Simpson and colleagues (2023) of 187 patients demonstrated that texture features quantifying hepatic heterogeneity—particularly GLCM dissimilarity and GLRLM run-length non-uniformity—predicted PHLF grade B/C with AUC 0.82. These features correlated with degree of steatosis and fibrosis on corresponding histology.
Anastomotic Leak After Colorectal Surgery: Four studies (n=892 patients) examined CT-based radiomics of pericolonic fat and bowel wall for anastomotic leak prediction. Radiomic signatures incorporating pericolonic fat attenuation heterogeneity and bowel wall texture achieved AUCs of 0.74-0.81. However, validation studies noted that timing of imaging (preoperative versus immediate postoperative) significantly affected performance, with postoperative scans showing greater predictive accuracy because they capture early inflammatory changes.
Frequently Retained Features Across Studies
Table 1 lists representative studies together with the key features they retained. Shape features, particularly sphericity, surface-to-volume ratio, and compactness, emerged as important predictors across oncologic outcomes, likely reflecting infiltrative versus expansile growth patterns. Among texture features, GLCM entropy (measuring image randomness), GLCM contrast (local intensity variation), and GLSZM zone percentage (capturing textural homogeneity) were most frequently retained. First-order features including skewness and kurtosis (describing the asymmetry and tailedness of the intensity distribution, respectively) were also common, potentially reflecting tumor necrosis or hemorrhage.
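The first-order features mentioned above are simple moments of the voxel intensity histogram. The sketch below computes skewness and kurtosis from a list of intensities using population (biased) moments. Note that conventions differ between tools: some report excess kurtosis (subtracting 3 so that a normal distribution scores 0), whereas this sketch does not. The intensity values are invented, and the input is assumed to have non-zero variance.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Asymmetry of the distribution: 0 for a symmetric histogram."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Tailedness of the distribution (non-excess: a normal curve gives 3)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Hypothetical voxel intensities: a symmetric lesion and one with a bright tail
symmetric_lesion = [1, 2, 3, 4, 5]
bright_tail_lesion = [1, 1, 1, 1, 10]

if __name__ == "__main__":
    print(skewness(symmetric_lesion), kurtosis(symmetric_lesion))
    print(skewness(bright_tail_lesion), kurtosis(bright_tail_lesion))
```

A lesion with a small, very bright or very dark subregion (for example, hemorrhage or necrosis) would shift these values away from those of a homogeneous lesion, which is the intuition offered in the paragraph above.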
Table 1: Representative Studies of Radiomics for Surgical Outcome Prediction
| Author, Year | Surgical Domain | N | Imaging | Outcome | AUC (95% CI) | Key Features |
| --- | --- | --- | --- | --- | --- | --- |
| Zhou et al., 2023 | HCC resection | 342 | CT | 5-year RFS | 0.89 (0.84-0.94) | GLCM contrast, sphericity |
| Attiyeh et al., 2019 | PDAC resection | 158 | CT | Early recurrence | 0.82 (0.75-0.89) | GLCM entropy, wavelet features |
| Chen et al., 2024 | Pancreaticoduodenectomy | 326 | CT | CR-POPF | 0.89 (0.85-0.93) | GLSZM SZE, peripancreatic fat |
| Simpson et al., 2023 | Major hepatectomy | 187 | CT | PHLF grade B/C | 0.82 (0.76-0.88) | GLCM dissimilarity, GLRLM |
| Kirienko et al., 2022 | NSCLC resection | 245 | PET/CT | 5-year OS | 0.74 (0.68-0.80) | Wavelet, metabolic features |
HCC: hepatocellular carcinoma; PDAC: pancreatic ductal adenocarcinoma; NSCLC: non-small cell lung cancer; CT: computed tomography; PET: positron emission tomography; RFS: recurrence-free survival; OS: overall survival; CR-POPF: clinically relevant postoperative pancreatic fistula; PHLF: post-hepatectomy liver failure; AUC: area under the curve; CI: confidence interval; GLCM: gray-level co-occurrence matrix; GLSZM: gray-level size zone matrix; GLRLM: gray-level run-length matrix; SZE: small zone emphasis.
Clinical Model Integration
A consistent finding across studies is that radiomic models complement rather than replace clinical variables. Integrated models combining radiomic signatures with routine clinical parameters (e.g., tumor markers, performance status, laboratory values) consistently outperformed either approach alone. Across the 12 studies reporting both radiomic-only and integrated models, integrated models improved AUC by a median of 0.08 (range 0.03-0.15) compared to radiomics alone. This suggests that radiomic and clinical variables carry partly complementary prognostic information.
Summary of Principal Findings
This narrative review synthesizes a rapidly growing body of literature demonstrating that radiomics-based anatomical assessment can predict surgical outcomes with moderate to high accuracy across multiple domains. The evidence is most mature for recurrence prediction in HCC and pancreatic cancer (AUCs 0.82-0.95) and for postoperative pancreatic fistula prediction (AUCs 0.76-0.95). Texture features capturing intratumoral heterogeneity and shape features reflecting infiltrative growth patterns are most consistently associated with adverse outcomes, providing biological face validity.
Biological Plausibility: Why Should Image Texture Predict Surgical Outcomes?
The relationship between radiomic features and surgical outcomes rests on solid biological foundations. Tumor heterogeneity—variation in cellular density, vascularity, necrosis, and stromal content—produces textural variation on medical imaging that correlates with aggressive biological behavior (O'Connor et al., 2017). Highly heterogeneous tumors demonstrate greater genomic instability, increased mutational burden, and higher likelihood of harboring therapy-resistant clones—all factors predisposing to early recurrence despite complete resection.
Shape features similarly reflect tumor-host interactions. Irregular, non-spherical tumors with high surface-to-volume ratios suggest infiltrative growth along anatomic planes, increasing the likelihood of microscopic residual disease at surgical margins. Conversely, spherical, well-circumscribed tumors often indicate expansile growth with pushing borders, more amenable to complete resection (Aerts et al., 2014).
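The shape descriptors discussed here have simple closed forms. A minimal sketch follows, using the standard definition of sphericity as the ratio of the surface area of a volume-equivalent sphere to the actual surface area, so that a perfect sphere scores 1 and more irregular shapes score lower. In practice volume and surface area would be derived from the segmentation mesh; here they are supplied directly for idealized shapes.

```python
import math


def sphericity(volume, surface_area):
    """pi^(1/3) * (6V)^(2/3) / A: equals 1 for a sphere, lower otherwise."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / surface_area


def surface_to_volume_ratio(volume, surface_area):
    """Higher values indicate more surface per unit volume."""
    return surface_area / volume


# Idealized shapes: a sphere of radius 2 and a unit cube
sphere_volume = 4 / 3 * math.pi * 2 ** 3
sphere_area = 4 * math.pi * 2 ** 2
cube_volume, cube_area = 1.0, 6.0

if __name__ == "__main__":
    print(sphericity(sphere_volume, sphere_area))
    print(sphericity(cube_volume, cube_area))
```

The cube scores roughly 0.81, and a lesion with spiculated or lobulated margins would score lower still. Note that surface-to-volume ratio depends on lesion size as well as shape, whereas sphericity is dimensionless.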
In the complication domain, peripancreatic and pericolonic fat radiomics capture subclinical inflammation, edema, and fibrosis that predict anastomotic healing. Patients with radiographically "unhealthy" fat—characterized by heterogeneous texture, increased attenuation, and blurring of fat planes—are at higher risk for pancreatic fistula and anastomotic leak, reflecting impaired tissue perfusion and healing capacity (Chen et al., 2024).
Methodological Challenges Limiting Clinical Translation
Despite promising predictive performance, significant barriers impede clinical adoption of radiomics. The most critical challenge is lack of standardization across the radiomics pipeline. Image acquisition parameters—slice thickness, reconstruction kernel, tube voltage, contrast timing—substantially influence feature values. A feature that predicts recurrence in a single-institution study may prove non-reproducible when images are acquired on different scanners or with different protocols (Zwanenburg et al., 2020). The IBSI has proposed standardized feature nomenclature and computation methods, but adherence remains voluntary and incomplete.
Overfitting represents a second major concern. Most radiomic studies extract hundreds or thousands of features from relatively modest sample sizes (median sample size across included studies: 156 patients). Feature selection algorithms inherently risk identifying spurious correlations that fail to validate in independent cohorts. It is therefore striking that only 16% of studies in our synthesis performed external validation—a fundamental requirement before any predictive model can be considered generalizable. Unsurprisingly, performance consistently drops when models are applied to external datasets, with AUC decrements of 0.05-0.20 reported in the few studies that performed such validation.
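The overfitting risk described above can be demonstrated with a small simulation. In the sketch below every "feature" is pure random noise with no relationship to outcome, yet picking the best of several hundred on a training cohort produces an apparently informative AUC that is not expected to hold up in an independent cohort. The cohort size mirrors the median sample size reported above; the number of features, the seed, and the function names are arbitrary choices for illustration.

```python
import random


def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def noise_cohort(rng, n_patients, n_features):
    """Balanced outcome labels plus features drawn independently of them."""
    labels = [i % 2 for i in range(n_patients)]
    features = [
        [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        for _ in range(n_features)
    ]
    return features, labels


def best_noise_feature(n_patients=156, n_features=300, seed=0):
    """Select the top noise feature in training, then re-test it externally."""
    rng = random.Random(seed)
    train_x, train_y = noise_cohort(rng, n_patients, n_features)
    valid_x, valid_y = noise_cohort(rng, n_patients, n_features)
    train_aucs = [auc(column, train_y) for column in train_x]
    best = max(range(n_features), key=train_aucs.__getitem__)
    return train_aucs[best], auc(valid_x[best], valid_y)


if __name__ == "__main__":
    train_auc, valid_auc = best_noise_feature()
    print(f"training AUC of selected feature: {train_auc:.2f}")
    print(f"same feature in a new cohort:     {valid_auc:.2f}")
```

Because the selected feature carries no real signal, its validation AUC should hover around 0.5 on average across seeds, illustrating why performance estimated on the same data used for feature selection is optimistic and why external validation is needed.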
Segmentation variability introduces additional uncertainty. While manual segmentation by expert radiologists remains the reference standard, inter-observer variability for texture features can be substantial, particularly for tumors with ill-defined margins. Automated and semi-automated segmentation methods reduce variability but require validation across diverse tumor morphologies.
The "black box" problem limits clinical trust and adoption. Deep learning-based radiomic approaches, while often achieving superior predictive performance, provide little insight into why a particular prediction was made. Surgeons and radiologists are understandably reluctant to alter clinical management based on predictions from opaque algorithms. Emerging "explainable AI" approaches—including saliency maps, Shapley values, and attention mechanisms—attempt to address this limitation by highlighting which image regions contributed most to the prediction (Lundberg & Lee, 2017).
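To show what a Shapley-value explanation provides, the sketch below computes exact Shapley attributions for a toy risk score by averaging each feature's marginal contribution over all feature orderings, replacing "absent" features with a reference (baseline) patient. The model, its coefficients, and the feature values are entirely hypothetical, and practical tools approximate this calculation because exact enumeration scales factorially with the number of features.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline).

    Features not yet "revealed" in an ordering take their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for idx in order:
            current[idx] = x[idx]        # reveal this feature's true value
            updated = predict(current)
            phi[idx] += updated - previous
            previous = updated
    return [total / factorial(n) for total in phi]


def toy_risk_score(features):
    """Hypothetical linear score: GLCM entropy, sphericity, CA19-9."""
    entropy, sphericity, ca19_9 = features
    return 0.8 * entropy - 1.5 * sphericity + 0.01 * ca19_9


patient = [5.0, 0.6, 120.0]      # hypothetical patient
reference = [4.0, 0.8, 40.0]     # hypothetical cohort-average patient

if __name__ == "__main__":
    for name, value in zip(
        ("GLCM entropy", "sphericity", "CA19-9"),
        shapley_values(toy_risk_score, patient, reference),
    ):
        print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the patient's score and the reference score, so a clinician can see how much each feature pushed the prediction up or down. For a linear model each attribution reduces to the coefficient times the feature's deviation from the reference.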
Comparison with Alternative Predictive Approaches
How does radiomics compare to existing predictive tools? Traditional clinical models (e.g., the Fistula Risk Score for pancreatic fistula, the Barcelona Clinic Liver Cancer staging for HCC) are simple, interpretable, and require no additional computation, but achieve only modest discrimination (AUCs 0.60-0.75). Radiomics improves discrimination but at the cost of technical complexity, computational requirements, and reduced interpretability.
Liquid biopsies (circulating tumor DNA, exosomes) represent a competing approach for oncologic outcome prediction. While liquid biopsies may offer superior sensitivity for detecting minimal residual disease, they cannot localize disease, provide no anatomical information, and remain expensive. Radiomics has the advantage of being derived from standard-of-care imaging with no incremental cost or patient risk. The optimal approach likely involves integration of multiple data streams—radiomics, clinical variables, genomics, and liquid biopsies—into multimodal predictive models.
Roadmap for Clinical Implementation
We propose a pragmatic roadmap for translating radiomics into surgical practice:
Phase 1 (Technical Standardization): The radiomics community must adopt and enforce IBSI standards for feature extraction. Imaging protocols should be harmonized across institutions participating in radiomics research, with test-retest studies establishing feature stability.
Phase 2 (Prospective Validation): Prior to clinical use, radiomic models must undergo prospective validation in multicenter cohorts. The few existing prospective studies (e.g., the ongoing RADIOM-PANCREAS trial) will provide critical evidence on generalizability.
Phase 3 (Clinical Integration): Validated models should be integrated into clinical workflows via user-friendly software interfaces that require minimal technical expertise. Integration with existing PACS systems and electronic health records is essential.
Phase 4 (Impact Studies): Finally, randomized trials or rigorous quasi-experimental studies are needed to demonstrate that radiomics-informed decision-making improves patient outcomes—the ultimate test of clinical utility.
Limitations of This Narrative Review
This narrative approach has inherent limitations. The non-systematic literature selection introduces potential selection bias. Without quantitative pooling, we cannot provide precise effect estimates or assess publication bias. The rapidly evolving nature of the field means that some recent studies may have been omitted. Furthermore, the predominance of retrospective, single-institution studies limits the strength of conclusions regarding generalizability.
Future Directions
Several promising directions merit attention. Longitudinal radiomics—tracking radiomic feature changes across serial preoperative scans—may capture tumor evolution and treatment response more sensitively than single timepoint analysis. Multiparametric radiomics integrating CT, MRI, and PET features may improve predictive performance by capturing complementary biological information. Radiogenomics linking radiomic features to underlying genomic alterations could provide biological validation while enabling non-invasive genomic characterization. Finally, federated learning approaches allow model training across multiple institutions without sharing sensitive patient data, potentially overcoming sample size limitations while preserving privacy.
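The core aggregation step of federated learning can be sketched briefly. In the federated averaging scheme, each institution trains on its own patients and shares only model parameters, which a coordinator combines as a sample-size-weighted mean. The sketch below shows that aggregation step alone, with made-up coefficients and cohort sizes; local training, secure communication, and repeated rounds are omitted.

```python
def federated_average(site_weights, site_sizes):
    """Sample-size-weighted mean of model parameters from several sites.

    site_weights: one parameter vector per institution.
    site_sizes:   number of patients each institution trained on.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(weights[k] * size for weights, size in zip(site_weights, site_sizes)) / total
        for k in range(n_params)
    ]


# Hypothetical two-parameter models trained locally at two hospitals
hospital_models = [[1.0, 2.0], [3.0, 4.0]]
hospital_sizes = [100, 300]

if __name__ == "__main__":
    print(federated_average(hospital_models, hospital_sizes))
```

Only the parameter vectors and cohort sizes leave each site in this scheme, which is why the approach is attractive when patient-level images cannot be pooled.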
Radiomics-based anatomical assessment represents a promising frontier in surgical precision medicine. Current evidence demonstrates that quantitative image features—particularly texture and shape descriptors—predict oncologic outcomes and postoperative complications with accuracy exceeding traditional clinical assessment. The biological plausibility is strong, with radiomic features reflecting intratumoral heterogeneity, infiltrative growth patterns, and tissue healing capacity.
However, the gap between research promise and clinical practice remains wide. Methodological heterogeneity, lack of standardization, scarce external validation, and the "black box" nature of machine learning models all impede translation. No radiomic model has yet been prospectively validated or approved for routine surgical decision-making.
The path forward requires collective action from surgical radiologists, surgeons, data scientists, and regulatory bodies. Standardization efforts must move from recommendations to requirements. Prospective multicenter validation studies must be prioritized over retrospective single-institution analyses. Explainable AI approaches must be developed and validated to build clinical trust.
For the practicing surgical radiologist and surgeon, the current evidence supports cautious optimism. Radiomics is not yet ready for routine clinical use, but it is rapidly maturing. In selected high-volume academic centers, radiomic models may serve as hypothesis-generating tools or decision-support adjuncts. Widespread clinical adoption awaits the rigorous validation and standardization that will transform radiomics from a promising research methodology into a reliable clinical biomarker.
Acknowledgment:
Not Applicable
Funding:
No targeted funding was reported.
Availability of data and materials:
All data and materials are presented in this manuscript. No additional materials are available.
Competing interests:
The authors declare no competing interests.
Author information:
Corresponding Author:
Pugazhandhi Bakthavatchalam, Department of Anatomy and Physiology, American University of Antigua, University Park, Jabberwock Beach Road, PO Box W1451, Coolidge, Antigua, West Indies.