UWF-Based Diabetic Retinopathy (DR) Prescreening Using a Novel Patch-Based AI Model

Nancy Barrett, Varsha Satish, Robert Slater, Thomas Saunders, Rachel Linderman, Barbara Blodi, Amitha Domalpally
Abstract
Purpose: Ultra-widefield (UWF) color photographs are widely used in clinical practice and clinical trials because they capture a wide area of the retina with minimal patient burden. Trial eligibility criteria include DR severity, which requires distinguishing moderately severe NPDR or worse (ETDRS ≥47) from lower severity levels. However, UWF images are large, and naive down-sampling can impair detection of small lesions such as microaneurysms and intraretinal microvascular abnormalities (IRMA), which define DR severity levels. We evaluated a patch-based, weakly supervised Multiple Instance Learning (MIL) architecture that preserves resolution while classifying DR severity level.
Methods: 835 UWF color images were included from multiple DR studies: 663 eyes were used for model development with 5-fold cross-validation, and 172 eyes were held out for independent testing. Eyes were categorized as ETDRS <47 or ≥47 by adjudicated double reading, which served as ground truth; the development set contained 345 eyes <47 and 318 eyes ≥47. Images were preprocessed in two steps: cropping to the mid-periphery, which removes lash artifacts and background, and division into 224×224-pixel patches, which maintains resolution.
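The two preprocessing steps can be illustrated with a minimal NumPy sketch. The function names, the `keep_fraction` parameter, and the choice to drop incomplete edge patches are assumptions made for illustration; the abstract does not specify how the mid-periphery crop was defined or how image borders were handled.

```python
import numpy as np

def center_crop(image, keep_fraction=0.8):
    """Keep the central part of an H x W (x C) image.

    keep_fraction is a hypothetical parameter: the source does not state
    how the mid-periphery crop was defined.
    """
    h, w = image.shape[:2]
    new_h, new_w = int(h * keep_fraction), int(w * keep_fraction)
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return image[top:top + new_h, left:left + new_w]

def tile_into_patches(image, patch=224):
    """Split an image into non-overlapping patch x patch tiles at full resolution.

    Pixels that do not fill a complete patch at the right and bottom edges
    are dropped (an assumption; padding would be an alternative).
    Returns an array of shape (n_patches, patch, patch, channels),
    ordered row by row.
    """
    h, w = image.shape[:2]
    rows, cols = h // patch, w // patch
    if rows == 0 or cols == 0:
        raise ValueError("image is smaller than one patch")
    cropped = image[:rows * patch, :cols * patch]
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    tiles = cropped.reshape(rows, patch, cols, patch, -1).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch, patch, -1)
```

Because no resizing takes place, each patch retains the pixel density of the original photograph; only the field of view per model input changes.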
An attention-based MIL classifier was trained using only image-level labels. Each patch was encoded by an EfficientNet-B0 feature extractor to produce a 1280-dimensional feature vector. Patch features from each image were combined by an attention-based pooling module that assigns higher weights to more informative regions. The resulting image-level feature representation was passed to a linear classifier to predict <47 versus ≥47. Saliency/attention maps were generated to visualize model focus and assess biological plausibility.
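The pooling and classification steps can be sketched as a forward pass in NumPy. This uses one common form of attention pooling, a tanh-scored softmax over patches; whether the authors used this form or a gated variant is not stated. The hidden size of 128, the parameter names, and the random weights are assumptions for illustration, and the patch features stand in for EfficientNet-B0 outputs.

```python
import numpy as np

def attention_mil_forward(patch_features, V, w, clf_w, clf_b):
    """Pool patch features with attention, then apply a linear classifier.

    patch_features: (n_patches, 1280) feature vectors, one per patch
    V: (1280, hidden) and w: (hidden,) are the attention parameters
    clf_w: (1280,) and clf_b: scalar are the linear classifier parameters
    Returns the predicted probability of the positive class (>=47 here)
    and the per-patch attention weights.
    """
    scores = np.tanh(patch_features @ V) @ w        # one score per patch
    scores = scores - scores.max()                  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum() # softmax over patches
    bag = weights @ patch_features                  # image-level feature, (1280,)
    logit = bag @ clf_w + clf_b
    prob = 1.0 / (1.0 + np.exp(-logit))
    return prob, weights

# Demo with random stand-in values (not trained weights or real features).
rng = np.random.default_rng(0)
features = rng.normal(size=(12, 1280))
V = rng.normal(scale=0.01, size=(1280, 128))
w = rng.normal(scale=0.01, size=128)
clf_w = rng.normal(scale=0.01, size=1280)
prob, weights = attention_mil_forward(features, V, w, clf_w, 0.0)
```

The attention weights sum to 1 across patches, so they can be mapped back to patch positions to produce the kind of attention map used to inspect model focus.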
Results: Across 5-fold cross-validation, mean out-of-fold performance was accuracy 0.69, AUROC 0.74, sensitivity 0.52, specificity 0.82, and F1 0.59. On the independent test set, the corresponding values were accuracy 0.61, AUROC 0.76, sensitivity 0.87, specificity 0.40, and F1 0.65; that is, the model correctly identified 87% of eligible eyes (ETDRS ≥47; 65/75) and 40% of ineligible eyes (ETDRS <47; 39/97). Attention maps indicated reasonable focus on patch regions containing abnormalities.
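As a worked check, the test-set sensitivity and specificity follow directly from the reported counts, taking ETDRS ≥47 (eligible) as the positive class. The function name is illustrative only.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Test-set counts from the text: 65 of 75 eligible eyes and
# 39 of 97 ineligible eyes were classified correctly.
sens, spec = sensitivity_specificity(tp=65, fn=75 - 65, tn=39, fp=97 - 39)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.87, specificity=0.40
```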
Conclusions: A weakly supervised, patch-based MIL approach enables UWF image classification while preserving the local lesion detail that is lost with global down-sampling. The method demonstrates the feasibility of detecting fine lesions on large images. Further refinement, such as hybrid global-context models, may improve sensitivity at DR severity thresholds.