Accuracy of Longitudinal AI-Based EZ Loss Quantification on OCT in Geographic Atrophy

April 13, 2026

http://report.ophth.wisc.edu/wp-content/uploads/sites/16/2026/04/Storm_ARVO_2026.png

Shelby Storm, Madeline Pflasterer-Jennerjohn, Robert Slater, Rachel Linderman, Jeong W. Pak, David Lopez, Barbara A. Blodi, Amitha Domalpally

Abstract

Purpose: Ellipsoid zone (EZ) loss is increasingly used as an endpoint in geographic atrophy (GA) trials, and AI algorithms are usually applied for automated EZ-loss area measurement. Most AI validations use cross-sectional datasets; however, therapeutic endpoints require accurate measurement of change. This study evaluates the performance of an AI model in quantifying longitudinal EZ-loss progression compared with human-graded change.

Methods: We conducted a retrospective analysis of longitudinal OCT imaging in patients with GA previously evaluated at the Wisconsin Reading Center (GSK 341). A total of 201 eyes (45 participants) with baseline, 6-month (79 eyes), and 12-month (42 eyes) follow-up OCTs were included. A previously trained and validated WRC AI model was applied to all scans. AI-predicted EZ-loss area was compared with human-graded measurements; both were based on edge-detection methodology. Agreement was evaluated using mean EZ-loss area at each visit, longitudinal change from baseline, and the Dice coefficient to assess spatial overlap.
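The three agreement measures named above can be illustrated with a minimal Python sketch. It is not the WRC pipeline. It assumes each segmentation is available as an en face binary mask (1 = EZ loss), and the helper names, the toy masks, and the pixel area of 0.01 mm2 are all hypothetical values chosen only for illustration.

```python
from typing import Sequence

# Assumption: an en face binary map, 1 = EZ loss, 0 = intact EZ.
Mask = Sequence[Sequence[int]]


def count_loss_pixels(mask: Mask) -> int:
    """Number of pixels labelled as EZ loss."""
    return sum(1 for row in mask for value in row if value)


def dice_coefficient(a: Mask, b: Mask) -> float:
    """Dice = 2|A intersect B| / (|A| + |B|) for two equally sized masks."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same shape")
    overlap = sum(1 for ra, rb in zip(a, b) for x, y in zip(ra, rb) if x and y)
    total = count_loss_pixels(a) + count_loss_pixels(b)
    # Convention used here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


def area_mm2(mask: Mask, pixel_area_mm2: float) -> float:
    """EZ-loss area as pixel count times the area of one pixel."""
    return count_loss_pixels(mask) * pixel_area_mm2


def change_from_baseline(baseline_area: float, followup_area: float) -> float:
    """Longitudinal change: follow-up area minus baseline area."""
    return followup_area - baseline_area


# Toy masks (hypothetical): the AI mask marks one extra pixel.
human_mask = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
ai_mask = [[0, 1, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
PIXEL_AREA_MM2 = 0.01  # hypothetical scan geometry

dice = dice_coefficient(human_mask, ai_mask)
human_area = area_mm2(human_mask, PIXEL_AREA_MM2)
ai_area = area_mm2(ai_mask, PIXEL_AREA_MM2)
```

In this toy case the masks share 4 of their 4 and 5 loss pixels, so the Dice coefficient is 8/9. A real comparison would apply the same functions per eye and visit and then summarize across eyes.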

Results: Across all visits, mean EZ-loss area was 8.52 mm² (SD 5.12) by human grading and 9.09 mm² (SD 5.02) by AI (p<0.01). At baseline, means were 7.66 mm² (SD 4.62) vs 8.21 mm² (SD 4.56); at 6 months, 9.02 mm² (SD 5.59) vs 9.50 mm² (SD 5.26); and at 12 months, 9.20 mm² (SD 4.98) vs 9.98 mm² (SD 5.25), respectively.
Mean EZ-loss progression from baseline to 6 months was 1.27 mm² (SD 2.57) vs 1.19 mm² (SD 2.42) (p=0.66), and from baseline to 12 months 1.47 mm² (SD 2.38) vs 1.56 mm² (SD 2.68) (p=0.63). The average Dice coefficient between human and AI segmentations across all visits was 0.83.
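As a consistency check, the pooled means across all visits can be recomputed from the per-visit means as a count-weighted average. The sketch below assumes that the reported total of 201 counts eye-visits, which would leave 201 - 79 - 42 = 80 baseline scans. That baseline count is an inference and is not stated in the abstract.

```python
def pooled_mean(means, counts):
    """Count-weighted average of per-visit means."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


# Assumption: 201 is the total number of eye-visits, so the baseline
# count is the remainder after the 6-month and 12-month visits.
counts = [201 - 79 - 42, 79, 42]  # baseline, 6 months, 12 months

human_pooled = pooled_mean([7.66, 9.02, 9.20], counts)
ai_pooled = pooled_mean([8.21, 9.50, 9.98], counts)

print(round(human_pooled, 2), round(ai_pooled, 2))  # 8.52 9.09
```

Under that assumption the recomputed values match the reported pooled means for both human grading and AI.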

Conclusions: An AI model can feasibly quantify EZ loss with results comparable to human-graded EZ loss. The mean change in EZ loss at 6 months and 12 months showed no significant difference between AI and human segmentation. Future work will examine AI segmentation errors in detail to guide targeted refinement and to develop deployment pipelines.