JAMA Network Open
-
Randomized Controlled Trial
Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial.
Why care about LLMs?
Large language models (LLMs) have revolutionised natural language processing and have inevitably found their way into healthcare. Their use in decision support and diagnosis has, however, shown mixed results, even as models and integrations quickly improve.
Despite their shortcomings, LLMs cannot be ignored by doctors: growing healthcare cost and demand pressures will continue to push LLM-based tools into clinical practice, even before robust clinical validation. We also know that diagnostic errors are common and costly, in both economic and patient-safety terms, which increases the allure of medical LLMs.
What did this study do?
This single-blinded randomised controlled trial included 50 physicians (26 attendings, 24 residents) from family medicine, internal medicine, and emergency medicine. Participants were randomised to use either ChatGPT-4 plus conventional resources or conventional resources only, and were asked to complete up to six clinical diagnostic cases within 60 minutes.
Diagnostic performance was measured using validated standardised scoring of three elements: accuracy of generated differential diagnoses, ability to identify supporting and contradicting clinical findings, and the appropriateness of proposed next diagnostic steps.
(Interesting aside: the six selected vignettes were from a 1994 pool of 105 never-published real patient cases originally used in a landmark study on diagnostic systems, guaranteed to be outside the LLM's training data, as these cases have been kept private to preserve their future testing validity.)
And they found?
The LLM alone performed significantly better than either physician group, scoring 16 percentage points higher than the control group (95% CI, 2 to 30 percentage points). Yet physicians with access to the LLM showed effectively no improvement compared with the conventional-resources-alone group (76% vs 74% median diagnostic score, p=.60). Time spent per case did not differ between groups.
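To make the arithmetic behind these figures explicit, here is a minimal Python sketch. It uses only the numbers reported above; the function name `ci_excludes_zero` and the variable names are illustrative assumptions, not anything taken from the study.

```python
def ci_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when a confidence interval for a difference lies entirely
    on one side of zero, i.e. it is consistent with a real difference."""
    return lower > 0 or upper < 0


# LLM alone vs conventional-resources group: +16 percentage points,
# 95% CI 2 to 30 percentage points (figures as reported above).
llm_vs_control_ci = (2.0, 30.0)
llm_alone_significant = ci_excludes_zero(*llm_vs_control_ci)

# Physicians with the LLM vs without: median scores of 76% and 74%.
median_gap = 76 - 74  # percentage points; reported p=.60, so not significant

print(f"LLM-alone CI excludes zero: {llm_alone_significant}")
print(f"Physician median gap: {median_gap} percentage points")
```

The interval for the LLM-alone comparison sits wholly above zero, which is why that 16-point gap is described as significant, whereas the 2-point gap between the physician groups is not.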
"Access alone to LLMs will not improve overall physician diagnostic reasoning in practice. These findings are particularly relevant now that many health systems offer [HIPAA]–compliant chatbots ... often with no to minimal training..."
Bottom line
This study highlights the "implementation gap" between AI capability and clinical utility: even if AI tools are reliably and consistently accurate (a big 'if'), their mere availability will not automatically translate into improved clinical reasoning. Successful integration will require deliberate consideration of how to optimise human-AI collaboration in medical practice.
summary -
Neuraxial labour analgesia for vaginal delivery is associated with a significant reduction in the risk of severe maternal morbidity.
pearl -
To improve health care price transparency and promote cost-conscious selection of health care organizations and practitioners, the Centers for Medicare & Medicaid Services (CMS) required that hospitals share payer-specific negotiated prices for selected shoppable health services by January 2021. While this regulation improves price transparency, it is unclear whether disclosed prices reflect total costs of care, since many hospital-based services are delivered and billed separately by independent practitioners or other health care entities. ⋯ This cross-sectional study found that independent practitioners were frequently involved in the delivery of shoppable hospital-based care, and their reimbursement may have represented a substantial portion of total costs of care. These findings suggest that disclosed hospital reimbursement was usually not correlated with total cost of care, limiting the potential benefits of the hospital price transparency rule for improving consumer decision-making.
-
The use of acellular dermal matrix (ADM) in implant-based breast reconstructions (IBBRs) is established practice. Existing evidence validating ADM's proposed advantages, including improved cosmetics and more single-stage IBBRs, is lacking. ⋯ Immediate IBBR with ADM did not yield fewer reoperations compared with conventional IBBR without ADM, nor was IBBR with ADM superior in terms of HRQoL or patient-reported cosmetic outcomes. Patients treated for breast cancer contemplating ADM-supported IBBR should be informed about the lack of evidence validating ADM's suggested benefits.
-
Increasing hospital costs for bronchiolitis have been associated with increasing patient complexity and mechanical ventilation. However, the associations of illness severity and diagnostic coding practices with bronchiolitis hospitalization costs have not been examined. ⋯ This cross-sectional study suggests that hospitalized children with bronchiolitis are receiving costlier and more intensive care without objective evidence of increasing severity of illness. Changes in coding practices may complicate efforts to study trends in the use of health care resources using administrative data.