Challenges for Artificial Intelligence-based Model Development in Neovascular Age-related Macular Degeneration
Our literature search identified no studies that evaluated whether artificial intelligence-based models could predict the treatment regimen required for an 'optimal' visual response in an individual patient, which highlights the challenges in the field to date. Thus far, studies have largely explored anti-VEGF treatment response either indirectly, by studying associations between OCT parameters and vision outcomes, or directly, by asking whether treatment response could be predicted from retinal images.
A well-known issue in machine learning is that artificial intelligence-based models reflect the biases inherent in the datasets used to develop them. Unfortunately, in nAMD, few large datasets with high-quality spectral-domain OCT data are available, and these same datasets have been used by multiple groups for training, tuning, and testing of artificial intelligence-based models. This reuse has also limited the range of nAMD population characteristics represented in the models to date. Clinical trial populations, defined by specific inclusion and exclusion criteria, are generally more homogeneous and less demographically diverse than real-world populations. In contrast, real-world patient populations, such as those captured in the Moorfields AMD database, show greater variability in demographics, disease state and severity, treatment approaches, and OCT imaging schedules and protocols.
Lack of counterfactual data is another significant limitation, both for developing a model and for judging its ability to predict treatment needs. In nAMD, each patient is unique in their disease, baseline clinical presentation, and treatment response, so it may be argued that a specific pretreatment state cannot be recreated. Therefore, it may not be possible to assess how that patient would have fared with an alternative treatment strategy or to ascertain their best-achievable vision outcome. A potential strategy to mitigate this limitation is to ensure that large, diverse patient populations with accurate, thorough data are used for model development. Without such data, artificial intelligence-based models may not apply accurately to individual patients and will carry forward the biases of the datasets used to build them.
Finally, the absence of OCT data standards affects both the availability of high-quality datasets for artificial intelligence model development and the generalizability of the resulting models. As a result, models created to date are generally device specific, which impairs their broader application in clinical practice, where different OCT devices are in use. To be useful in clinical practice, artificial intelligence-based models will need to be designed for functionality and interpretability outside of controlled research settings.
Curr Opin Ophthalmol. 2021;32(5):389-396. © 2021 Lippincott Williams & Wilkins