AI-based Multilayer OCT Segmentation Model For Macular Edema

Justin Bitner, Rachel Linderman, Aadhi Balasubramanian, Madeline Pflasterer-Jennerjohn, Jeong W. Pak, Robert Slater, Roomasa Channa, Lucas Maakested, Barbara A. Blodi, Amitha Domalpally
Abstract
Purpose: To train and assess an artificial intelligence (AI)-based optical coherence tomography (OCT) segmentation model that delineates six retinal layer boundaries, yielding the five retinal thickness layers commonly requested in macular edema (ME) clinical trials.
Methods: Using volumetric macular OCT scans from the WRC training library, a 2D nnSAM model was trained to segment the inner limiting membrane (ILM), the bottom of the retinal nerve fiber layer (RNFL), the top of the outer plexiform layer (OPL), the ellipsoid zone (EZ), the retinal pigment epithelium (RPE), and Bruch’s membrane (BM). The training set comprised 970 B-scans from 10 OCT scans without retinal disease and 873 B-scans from 9 OCT scans with ME. All B-scans were segmented in the Heidelberg Spectralis software, and the segmentations were reviewed for accuracy by trained reading center graders. A custom application was created to display the grader and AI annotations together so that accuracy could be assessed at the B-scan level. Thirty-nine volumetric OCT scans from the Month 1 visit of the Study of Comparative Treatments for Retinal Vein Occlusion 2 (SCORE2) clinical trial (NCT01969708) served as the external test set. Agreement between the AI and grader annotations was assessed using the average retinal thickness in each ETDRS sector and the Dice coefficient for each retinal layer.
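As an illustration of the two agreement metrics named in the Methods, the following is a minimal sketch and not the study's actual pipeline. It assumes each boundary is stored as one row index per A-scan; the function names, the toy boundary values, and the axial spacing of 4.0 µm per pixel are hypothetical choices made only for this example.

```python
import numpy as np

def layer_mask(top, bottom, height):
    """Boolean mask (height x width) of the pixels lying between two
    boundaries, each given as one row index per A-scan (column)."""
    rows = np.arange(height)[:, None]
    return (rows >= top[None, :]) & (rows < bottom[None, :])

def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(a, b).sum() / denom)

def mean_thickness_um(top, bottom, axial_um_per_px):
    """Mean layer thickness in micrometres across all A-scans."""
    return float(np.mean(bottom - top) * axial_um_per_px)

# Toy B-scan: 20 rows, 4 A-scans (hypothetical values, not study data).
HEIGHT = 20
AXIAL_UM_PER_PX = 4.0  # assumed spacing, for illustration only

grader_top = np.array([5, 5, 5, 5])
grader_bottom = np.array([15, 15, 15, 15])
ai_top = np.array([5, 5, 5, 5])
ai_bottom = np.array([15, 15, 17, 17])

grader_layer = layer_mask(grader_top, grader_bottom, HEIGHT)
ai_layer = layer_mask(ai_top, ai_bottom, HEIGHT)

layer_dice = dice(grader_layer, ai_layer)
grader_um = mean_thickness_um(grader_top, grader_bottom, AXIAL_UM_PER_PX)
ai_um = mean_thickness_um(ai_top, ai_bottom, AXIAL_UM_PER_PX)
print(layer_dice, ai_um - grader_um)
```

In this toy case the AI boundary sits two pixels lower in half of the A-scans, so the overlap stays high while the mean thickness difference is several micrometres. This shows why a layer can score well on Dice and still show a systematic thickness offset.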
Results: Mean central subfield thickness (CST) was 344 µm (59.1) with the AI and 343 µm (58.1) with the grader, measured from ILM to BM. Across ETDRS subfields, the difference in ILM–BM thickness between the AI and the grader was minimal (–5 to +6 µm). Among the additional thickness measurements, the nerve fiber layer (ILM–RNFL) showed close agreement (–3 to +2 µm), inner retinal thickness (RNFL–OPL) was consistently underestimated by the AI (by up to 25 µm), and outer retinal thickness (OPL–EZ) was overestimated (by 12–33 µm). Photoreceptor thickness (EZ–RPE) showed near-perfect correspondence. Dice scores were high for all layers (0.90–0.95).
Conclusions: Without manual correction, this AI model performed similarly to segmentation by reading-center-trained graders for six retinal layer boundaries. Thickness measurements for each layer showed excellent agreement, and the differences between the AI and grader segmentations were considered to be within the reproducibility range for a reading center. Expanding this model to be device agnostic could provide a universal and efficient method for segmenting multiple retinal layers with minimal human oversight or intervention.