Comparative Study
Enhancing Diagnostic Support for Chiari Malformation and Syringomyelia: A Comparative Study of Contextualized ChatGPT Models.
- Ethan D L Brown, Max Ward, Apratim Maity, Mark A Mittler, Sheng-Fu Larry Lo, and Randy S D'Amico.
- Department of Neurologic Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New York, USA. Electronic address: ebrown35@northwell.edu.
- World Neurosurg. 2024 Sep 1; 189: e86-e107.
Objectives
The rapidly increasing adoption of large language models in medicine has drawn attention to potential applications within the field of neurosurgery. This study evaluates the effects of various contextualization methods on ChatGPT's ability to provide expert-consensus aligned recommendations on the diagnosis and management of Chiari Malformation and Syringomyelia.

Methods
Native GPT4 and GPT4 models contextualized using various strategies were asked questions revised from the 2022 Chiari and Syringomyelia Consortium International Consensus Document. ChatGPT-provided responses were then compared to consensus statements using reviewer assessments of 1) responding to the prompt, 2) agreement of ChatGPT response with consensus statements, 3) recommendation to consult with a medical professional, and 4) presence of supplementary information. Flesch-Kincaid, SMOG, word count, and Gunning-Fog readability scores were calculated for each model using the quanteda package in R.

Results
Relative to GPT4, all contextualized GPTs demonstrated increased agreement with consensus statements. PDF+Prompting and Prompting models provided the most elevated agreement scores of 19 of 24 and 23 of 24, respectively, versus 9 of 24 for GPT4 (p=.021, p=.001). A trend toward improved readability was observed when comparing contextualized models at large to ChatGPT4, with significant decreases in average word count (180.7 vs 382.3, p<.001) and Flesch-Kincaid Reading Ease score (11.7 vs 17.2, p=.033).

Conclusions
The enhanced performance observed in response to ChatGPT4 contextualization suggests broader applications of large language models in neurosurgery than what the current literature indicates. This study provides proof of concept for the use of contextualized GPT models in neurosurgical contexts and showcases the easy accessibility of improved model performance.

Copyright © 2024 Elsevier Inc. All rights reserved.