Introduction
Ophthalmology is embracing the technological wave, with AI chatbots emerging as potential game-changers. These virtual assistants hold considerable promise for enhancing patient care, improving accessibility, and augmenting medical expertise.1
Patient education and triage: Chatbots can offer 24/7 access to reliable eye health information, empowering patients to understand symptoms, manage appointments, and even perform preliminary self-triage.2
Glaucoma management: Chatbots can remind patients about medication adherence, answer questions about drops, and provide emotional support, potentially leading to better glaucoma control.3
Physician support: Chatbots can handle routine inquiries, freeing up ophthalmologists’ time for complex cases and consultations.4, 5, 6 They can also analyze vast amounts of data to aid in diagnosis and treatment planning.
Imagine a world where patients with blurry vision, eye pain, or sudden flashes of light could instantly access a trusted advisor capable of analyzing their symptoms, suggesting potential diagnoses, and even guiding them towards the appropriate specialist.7
This is no longer science fiction, but the emerging potential of AI chatbots in ophthalmic care.
Globally, nearly 2.2 billion people have vision impairment, with preventable blindness affecting millions. Precise triage and referral are crucial to saving sight and preventing complications.8, 9
Every year, millions of people visit ophthalmologists with eye problems, yet accurate diagnosis and timely referrals remain critical challenges. Could AI chatbots be the answer?
This cross-sectional study aimed to:
Evaluate the accuracy of AI chatbots in identifying and prioritizing diagnoses from ophthalmic clinical vignettes.
Compare the performance of AI chatbots with human ophthalmology trainees in triage and referral recommendations.
Assess the potential of AI chatbots to support clinical decision-making in ophthalmology.
Materials and Methods
One hundred clinical vignettes encompassing a spectrum of common ophthalmic conditions encountered in a tertiary care outpatient department were meticulously crafted. These vignettes included diverse presentations, varying degrees of severity, and pertinent historical details to mimic real-world patient encounters. Conditions covered included cataracts, glaucoma, diabetic retinopathy, age-related macular degeneration, corneal abnormalities, uveitis, and strabismus. Each vignette adhered to a standardized format, presenting patient demographics, chief complaint, presenting symptoms, past medical history, ophthalmic history, family history, and relevant social history.
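For illustration, a minimal sketch of how such a standardized vignette could be represented in code is shown below. The field names mirror the format described above but are illustrative assumptions, not the study’s actual data schema.

```python
from dataclasses import dataclass

@dataclass
class ClinicalVignette:
    """One standardized ophthalmic case, mirroring the vignette format above."""
    vignette_id: int
    demographics: str             # e.g., "62-year-old woman"
    chief_complaint: str          # e.g., "gradual painless blurring of vision"
    presenting_symptoms: str
    past_medical_history: str
    ophthalmic_history: str
    family_history: str
    social_history: str
    gold_standard_diagnosis: str  # assigned later by the senior panel
```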
Participant selection
Four groups of participants were recruited for this study:
Ophthalmology trainees
A group of six ophthalmology trainees at various stages of training was recruited to represent the perspective of junior physicians encountering various ophthalmic conditions in the outpatient department (OPD) setting.
OpenAI ChatGPT (GPT-3.5)
This state-of-the-art generative pre-trained transformer model was employed, recognized for its proficiency in understanding and responding to natural language queries.
Google Gemini
This large language model from Google was included as a second AI chatbot for comparison.
WebMD Symptom Checker
This established online symptom-checking service was included as a benchmark representing existing consumer triage tools.
Standardized prompt
To ensure consistency and comparability, all participants were presented with the same standardized prompt for each vignette. This prompt included the patient’s demographics, presenting complaint, current symptoms, past medical and ophthalmic history, medications, and any available diagnostic test results. Participants were instructed to:
Generate a list of possible diagnoses: They were asked to identify the most likely diagnoses based on the provided information and rank them in order of probability.
Recommend a management plan: This included suggestions for further investigations, referrals to specialists if necessary, and initial treatment recommendations. A sketch of this prompt-and-query workflow is shown after this list.
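As an illustration, the following minimal sketch shows how a vignette could be assembled into the standardized prompt and submitted to ChatGPT. The prompt wording, the build_prompt helper, and the use of the official OpenAI Python SDK are assumptions for illustration, not the study’s exact materials.

```python
from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_prompt(v: ClinicalVignette) -> str:
    """Assemble the standardized prompt described above (illustrative wording)."""
    return (
        f"Patient: {v.demographics}\n"
        f"Chief complaint: {v.chief_complaint}\n"
        f"Symptoms: {v.presenting_symptoms}\n"
        f"Past medical history: {v.past_medical_history}\n"
        f"Ophthalmic history: {v.ophthalmic_history}\n"
        "Tasks:\n"
        "1. List your top 3 most likely diagnoses, ranked by probability.\n"
        "2. Recommend a management plan, including investigations and referrals.\n"
        "3. Assign a triage category: urgent, semi-urgent, or non-urgent."
    )

def query_chatgpt(v: ClinicalVignette) -> str:
    """Send one vignette to GPT-3.5 and return the raw text response."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": build_prompt(v)}],
    )
    return response.choices[0].message.content
```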
Data collection and analysis
Responses from all participants were collected and anonymized. Diagnoses listed by each participant were compared against a pre-defined “gold standard” diagnosis established by a panel of senior ophthalmologists. Accuracy was measured by calculating the percentage of cases where the true diagnosis was listed among the top three suggestions. Management plans were assessed for alignment with established referral guidelines and best practices in ophthalmic care. Inter-rater reliability among physician respondents was evaluated using Cohen’s kappa coefficient.
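As a sketch of these two computations, assuming responses have already been reduced to per-case labels: top-3 accuracy is a membership check against the gold standard (the study’s actual matching was adjudicated by the senior panel, not by string equality), and because Cohen’s kappa is a pairwise statistic, agreement among the six trainees is shown here as the mean kappa over all rater pairs, one common convention.

```python
from itertools import combinations
from sklearn.metrics import cohen_kappa_score  # pip install scikit-learn

def top3_accuracy(top3_lists: list[list[str]], gold: list[str]) -> float:
    """Percentage of cases where the gold-standard diagnosis appears in the top 3."""
    hits = sum(g in top3 for top3, g in zip(top3_lists, gold))
    return 100 * hits / len(gold)

def mean_pairwise_kappa(ratings_by_rater: list[list[str]]) -> float:
    """Average Cohen's kappa over all pairs of physician raters."""
    kappas = [cohen_kappa_score(a, b) for a, b in combinations(ratings_by_rater, 2)]
    return sum(kappas) / len(kappas)
```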
Endpoint and scoring system
Diagnosis
Top 3 diagnoses
For each vignette, participants provided their top 3 most likely diagnoses.
Scoring: Each diagnosis was compared to a pre-defined list of established, correct diagnoses for the vignette (a sketch of this rubric in code follows the list below).
Exact Match: 3 points
Partially Correct: 2 points (e.g., if the participant lists a specific subtype of the correct diagnosis)
Incorrect: 0 points
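A minimal sketch of this rubric follows. The substring test for a partial match is an illustrative heuristic only; in the study, partial correctness (e.g., a subtype of the correct diagnosis) was judged against the pre-defined answer list, and the aggregation of the three ranked suggestions is assumed here to be the best single match.

```python
def score_diagnosis(candidate: str, correct: str) -> int:
    """Apply the 3/2/0 rubric to one suggested diagnosis."""
    cand, gold = candidate.strip().lower(), correct.strip().lower()
    if cand == gold:
        return 3  # exact match
    if gold in cand or cand in gold:
        return 2  # partially correct, e.g., a subtype of the correct diagnosis
    return 0      # incorrect

def score_top3(top3: list[str], correct: str) -> int:
    """Score a ranked top-3 list as its best single match (assumed aggregation)."""
    return max(score_diagnosis(d, correct) for d in top3)
```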
Triage
Triage category: Participants categorized the urgency of each case based on the presented symptoms and potential diagnosis, using the three categories below (sketched in code after this list).
Urgent: Requires immediate specialist consultation or intervention.
Semi-urgent: Requires referral within a specific timeframe (e.g., within 24 hours).
Non-urgent: Can be managed by a primary care physician or scheduled for a follow-up appointment.
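The three categories and the exact-agreement check used for triage accuracy could be represented as follows; this is a sketch of one plausible implementation, not the study’s actual code.

```python
from enum import Enum

class Triage(Enum):
    URGENT = "urgent"            # immediate specialist consultation or intervention
    SEMI_URGENT = "semi-urgent"  # referral within a set timeframe, e.g., 24 hours
    NON_URGENT = "non-urgent"    # primary care management or scheduled follow-up

def triage_correct(assigned: Triage, reference: Triage) -> bool:
    """Triage accuracy as exact agreement with the reference category."""
    return assigned is reference
```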
Referral
Recommended referrals were assessed for alignment with established referral guidelines and best practices in ophthalmic care.
Results
Diagnosis accuracy
Physician respondents listed the appropriate diagnosis among the top three suggestions in 95% of cases. Google Gemini correctly identified the diagnosis in 90% of cases, followed by ChatGPT at 85% and WebMD at 20%. High concordance was observed between physician and AI recommendations for investigations and referrals.
Triage accuracy
ChatGPT’s triage accuracy was slightly lower, with correct categorization in 87 (87%) of the cases.
WebMD’s performance in triage was significantly weaker, with accurate urgency classification in only 35 (35%) of the cases.
Table 1 summarizes the diagnostic and triage accuracy of each respondent group.
Discussion
The present study evaluated the diagnostic and triage ability of large language model chatbots, ChatGPT and Google Gemini, across a wide range of ophthalmic conditions. The results demonstrated high diagnostic and triage accuracy for Google Gemini, comparable to that of physicians. Both ChatGPT and Gemini outperformed the existing online medical triage service, the WebMD Symptom Checker. These findings suggest that AI chatbots can serve as valuable adjuncts to human expertise in ophthalmology.10
These results highlight the potential of AI chatbots to:
Augment clinician decision-making
By providing accurate preliminary diagnoses and suggesting appropriate referrals, AI can streamline the triage process and improve efficiency.
Improve patient outcomes
Early and accurate diagnosis allows for timely intervention and potentially better patient outcomes. AI chatbots can also empower patients with information and guide them towards seeking appropriate medical attention.11
Expand access to care
AI-powered tools can provide basic triage and information in underserved areas or after clinic hours, potentially reducing healthcare disparities.12, 13, 14, 15
Conclusion
This study demonstrates the promising potential of AI chatbots in supporting triage and referral decisions for ophthalmic conditions. While human expertise remains paramount, AI tools can serve as valuable adjuncts, enhancing diagnostic accuracy, efficiency, and patient care. Future research should focus on refining AI algorithms, integrating clinical data, and exploring real-world implementation strategies.
Limitations
Single-center study: The findings may not be generalizable to other healthcare settings.
Clinical vignettes: Real-world patient consultations can be more complex, potentially impacting AI accuracy.
Limited scope: The study focused on common conditions. Performance with rarer cases needs further evaluation.