The Here and Now for AI in Ophthalmology, In Focus at ARVO 2024

The opening of the artificial intelligence floodgates has inundated eye care with possibilities both practical and far-flung. Experts at an ARVO 2024 Day Three artificial intelligence symposium eschewed flights of futuristic fancy to talk about the current state of practicable AI solutions in ophthalmology.

Artificial intelligence in ophthalmology has hit its stride, but is burnout setting in? Reports on the next supposedly transformative AI-enabled innovation arrive daily, and fatigue over the ratio of fluff to genuinely deployable AI solutions is real.

Cutting through fluff was high on the agenda at an artificial intelligence symposium on Day Three of the Annual Meeting of the Association for Research in Vision and Ophthalmology (ARVO 2024). Lecturers took a decidedly practical approach to where we are in ophthalmology with artificial intelligence—and what eye care practitioners need to know about how this technology is transforming the way we think—and write—about vision science.

The new norm in ophthalmic publishing

Perhaps the most immediately recognizable impact that AI and large language models (LLMs) have had on ophthalmology is in the realm of published work.

One of ChatGPT’s most (in)famous uses has been to assist busy students and academics in every part of the writing process, from outlining to peer review. But this powerful tool for academic writing comes with some serious caveats, and Drs. Neil Bressler (editor-in-chief of JAMA Ophthalmology) and Roy Chuck (editor-in-chief of ARVO’s Translational Vision Science and Technology) discussed how ophthalmic publishers and writers must adapt to this new normal.

Both Dr. Bressler and Dr. Chuck believe that AI’s Wild West period is over, and the time for regulating its use is now. “Artificial intelligence is a tool to be embraced and to be learned from,” said Dr. Bressler. “But if you’re using it for writing or reporting, guidelines must be put together.”

Chatbots cannot author a paper, argued Dr. Bressler. He raised arguments from computer ethicists that machines cannot be held accountable for all aspects of a given piece of work and do not have integrity—both of which are required for full authorship.

Guidelines for writing with AI chatbots, then, must be put into use, and organizations like ARVO have already done so for prospective study authors. Under these guidelines, authors using artificial intelligence are fully responsible for anything they submit, including intentional and unintentional plagiarism.

Most guidelines also require full disclosure in a paper’s acknowledgements section of the exact portions generated by artificial intelligence and information about the model and version used.

Using AI chatbots for peer review is another thorny issue, but one on which Dr. Bressler and Dr. Chuck were unequivocal. Because of the murky ways in which AI models collect, train on and use the data entered into prompts, entering any paper into a chatbot is considered a violation of the strict confidentiality agreements that bind peer reviewers.

In the end, the two doctors were both cautious and optimistic about the way artificial intelligence is changing ophthalmic research reporting and writing. Dr. Chuck ended his talk by quoting a seminal paper by Ophthalmology editor-in-chief Dr. Russell Van Gelder.

“The real Turing test is whether a 7th-grade teacher can tell the difference between a ChatGPT-derived essay and the human-generated equivalent,” Dr. Van Gelder wrote, referencing a test in which a machine intelligence is challenged to exhibit behavior to a human interlocutor indistinguishable from another human. “I suspect this test is occurring hourly.”¹

Under the hood of AI in ophthalmology

Generative AI models like ChatGPT and Gemini have made headlines recently with their ability to perform on exams. But it’s what lies beneath that is most important to Dr. Zhiyong Lu, a senior investigator at the United States National Institutes of Health (NIH) and the National Library of Medicine (NLM).

Correct answers on exams are one thing, but demonstrating the rationale behind them is what matters most, according to Dr. Lu. “If you’re a physician, you’re not going to just trust ChatGPT and use this answer for your clinical decision,” he explained. “You want to know how ChatGPT got the answer.”

To investigate this, he took the multimodal GPT-4V model, asked it ophthalmology exam questions, and compared its performance with that of medical students and physicians in both closed- and open-book (i.e., Google search) environments.

Dr. Lu’s group found that in almost a third of the cases where the model answered correctly, its rationale was flawed. Figuring out why, and how to avoid such faulty reasoning, has become one of the major avenues of his lab’s research.

The final talk of the day echoed this theme. Google researcher Dr. Yun Liu shed more light on what’s behind the curtain of the seeming wizardry of AI LLMs with his work on artificial intelligence at Google Health and its analysis of medical images.

The team at Google has been using external images of the eye, some even taken with a smartphone, to predict systemic biomarkers for a wide variety of diseases, including diabetes. Dr. Liu explained how he was able to train the model to explain its rationale via text and by pointing out which parts of the image it based its decision on.

While this yielded some humorous outputs, such as the model predicting lower hemoglobin because the presence of eyeliner led it to assume the subject was female, the overall results were encouraging.

“AI can help detect biomarkers in external eye images with a level of accuracy much higher than the baseline models we compared it to,” he said at the conclusion of his talk. “And generative AI as such can help you extract insights from these data sets and models that could be helpful to teach people new knowledge.”

Pearse Keane and the promise of open-source AI

It wasn’t all warnings and theory, however. Despite AI’s lofty promises and the pitfalls of artificial intelligence in reality, the technology is being used right now to save sight. Prof. Pearse Keane (University College London and Moorfields Eye Hospital, United Kingdom), one of the leading lights in ophthalmic artificial intelligence, presented the success of his foundation model, RETFound.

Trained on a dataset of 1.6 million images from Moorfields Eye Hospital, this completely open-source model² is already being used around the globe thanks to its key design feature—the ability to be fine-tuned to adjust to virtually any downstream task. His team has demonstrated training the model for more than ten different downstream clinical tasks.

What is special about RETFound is its flexibility, according to Prof. Keane. It can be self-trained on new, unlabeled data to help reduce bias, and can be asked about a wide variety of diseases.

In the future, Prof. Keane hopes to enhance the model’s ability to be fine-tuned to local databases around the world, integrate support for a variety of imaging modalities and improve its horizons with a larger dataset.

But Prof. Keane’s biggest hope is perhaps the shared hope of all who are invested in the future of AI and its benefits for humanity. Whether it is a model being used right now or one whose potential lies in the future, equity in access to the fruit that AI bears is essential.

“It’s important for RETFound to be open source because we’d like this to be a cornerstone for global efforts to use AI to prevent blindness,” he said. “We hope that other people can take our work and improve on it, criticize it, find flaws in it, and hopefully do interesting things with it.”

Editor’s Note: The Annual Meeting of the Association for Research in Vision and Ophthalmology (ARVO 2024) is being held from 5-9 May in Seattle, Washington, USA. Reporting for this story took place during the event.

References

1. Van Gelder RN. The Pros and Cons of Artificial Intelligence Authorship in Ophthalmology. Ophthalmology. 2023;130(7):670-671.
2. Zhou Y, Chia MA, Wagner SK, et al. A foundation model for generalizable disease detection from retinal images. Nature. 2023;622(7981):156-163. [Epub 2023 Sep 13]