| Can ChatGPT4-vision identify radiologic progression of multiple sclerosis on brain MRI? [with consumer summary] |
| Kelly BS, Duignan S, Mathur P, Dillon H, Lee EH, Yeom KW, Keane PA, Lawlor A, Killeen RP |
| European Radiology Experimental 2025 Jan 15;9(1):9 |
| clinical trial |
| This trial has not yet been rated. |
|
BACKGROUND: The large language model ChatGPT can now accept image input with the GPT4-vision (GPT4V) version. We aimed to compare the performance of GPT4V to pretrained U-Net and vision transformer (ViT) models for the identification of the progression of multiple sclerosis (MS) on magnetic resonance imaging (MRI). METHODS: Paired coregistered MR images with and without progression were provided as input to ChatGPT4V in a zero-shot experiment to identify radiologic progression. Its performance was compared to pretrained U-Net and ViT models. Accuracy was the primary evaluation metric, and 95% confidence intervals (CIs) were calculated by bootstrapping. We included 170 patients with MS (50 males, 120 females), aged 21 to 74 years (mean 42.3), imaged at a single institution from 2019 to 2021, each with 2 to 5 MRI studies (496 in total). RESULTS: One hundred seventy patients were included, 110 for training, 30 for tuning, and 30 for testing; 100 unseen paired images were randomly selected from the test set for evaluation. Both U-Net and ViT had 94% (95% CI 89 to 98%) accuracy while GPT4V had 85% (77 to 91%). GPT4V gave cautious nonanswers in six cases. GPT4V had precision (specificity), recall (sensitivity), and F1 score of 89% (75 to 93%), 92% (82 to 98%), and 91% (82 to 97%) compared to 100% (100 to 100%), 88% (78 to 96%), and 94% (88 to 98%) for U-Net and 94% (87 to 100%), 94% (88 to 100%), and 94% (89 to 98%) for ViT. CONCLUSION: The performance of GPT4V, combined with its accessibility, suggests that it has the potential to impact AI radiology research. However, misclassified cases and overly cautious nonanswers confirm that it is not yet ready for clinical use. RELEVANCE STATEMENT: GPT4V can identify the radiologic progression of MS in a simplified experimental setting. However, GPT4V is not a medical device, and its widespread availability highlights the need for caution and education for lay users, especially those with limited access to expert healthcare.
|
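The abstract states that 95% CIs for accuracy were calculated by bootstrapping but does not say which bootstrap variant or how many resamples were used. The following is a minimal sketch of one common choice, the percentile bootstrap, assuming binary labels (1 = progression, 0 = no progression). The function names, the 2,000 resamples, the seed, and the labels are all illustrative assumptions; the data are synthetic and do not reproduce the study's results.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy (assumed variant, not the authors')."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample cases with replacement, keeping each label/prediction pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(accuracy([y_true[i] for i in idx],
                              [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Synthetic example: 100 paired images, 15 of which are misclassified.
y_true = [1] * 50 + [0] * 50
y_pred = [1 - t if i < 15 else t for i, t in enumerate(y_true)]

point = accuracy(y_true, y_pred)
low, high = bootstrap_ci(y_true, y_pred)
print(f"accuracy = {point:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Resampling whole cases keeps each prediction tied to its reference label, which is what makes the resampled accuracy meaningful. How the six nonanswers were scored is not stated in the abstract, so the sketch does not model them.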