JAMA Network Open · May 2019

Performance of a Deep Learning Model vs Human Reviewers in Grading Endoscopic Disease Severity of Patients With Ulcerative Colitis.

- Ryan W Stidham, Wenshuo Liu, Shrinivas Bishu, Michael D Rice, Peter D R Higgins, Ji Zhu, Brahmajee K Nallamothu, and Akbar K Waljee.
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, University of Michigan, Ann Arbor.
- Michigan Integrated Center for Health Analytics and Medical Prediction (MiCHAMP), University of Michigan, Ann Arbor.
- JAMA Netw Open. 2019 May 3; 2 (5): e193963.

Importance: Assessing endoscopic disease severity in ulcerative colitis (UC) is a key element in determining therapeutic response, but its use in clinical practice is limited by the requirement for experienced human reviewers.

Objective: To determine whether deep learning models can grade the endoscopic severity of UC as well as experienced human reviewers.

Design, Setting, and Participants: In this diagnostic study, retrospective grading of endoscopic images using the 4-level Mayo subscore was performed by 2 independent reviewers, with score discrepancies adjudicated by a third reviewer. Using 16 514 images from 3082 patients with UC who underwent colonoscopy at a single tertiary care referral center in the United States between January 1, 2007, and December 31, 2017, a 159-layer convolutional neural network (CNN) was constructed as a deep learning model to train and categorize images into 2 clinically relevant groups: remission (Mayo subscore 0 or 1) and moderate to severe disease (Mayo subscore 2 or 3). Ninety percent of the cohort was used to build the model and 10% was used to test it; the process was repeated 10 times. A set of 30 full-motion colonoscopy videos, unseen by the model, was then used for external validation to mimic real-world application.

Main Outcomes and Measures: Model performance was assessed using area under the receiver operating characteristic curve (AUROC), sensitivity and specificity, positive predictive value (PPV), and negative predictive value (NPV). Kappa statistics (κ) were used to measure agreement of the CNN relative to adjudicated human reference scores.

Results: The authors included 16 514 images from 3082 unique patients (median [IQR] age, 41.3 [26.1-61.8] years; 1678 [54.4%] female), with 3980 images (24.1%) classified as moderate to severe disease by the adjudicated reference score. The CNN was excellent for distinguishing endoscopic remission from moderate to severe disease, with an AUROC of 0.966 (95% CI, 0.967-0.972); a PPV of 0.87 (95% CI, 0.85-0.88) with a sensitivity of 83.0% (95% CI, 80.8%-85.4%) and specificity of 96.0% (95% CI, 95.1%-97.1%); and an NPV of 0.94 (95% CI, 0.93-0.95). Weighted κ agreement between the CNN and the adjudicated reference score was also good for identifying exact Mayo subscores (κ = 0.84; 95% CI, 0.83-0.86) and was similar to the agreement between experienced reviewers (κ = 0.86; 95% CI, 0.85-0.87). Applying the CNN to entire colonoscopy videos had similar accuracy for identifying moderate to severe disease (AUROC, 0.97; 95% CI, 0.963-0.969).

Conclusions and Relevance: This study found that deep learning model performance was similar to experienced human reviewers in grading endoscopic severity of UC. Given its scalability, this approach could improve the use of colonoscopy for UC in both research and routine practice.
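
For readers less familiar with the outcome measures named above, the following is a minimal Python sketch of how sensitivity, specificity, PPV, NPV, and a weighted κ can be computed. It is not the authors' code. The function names, the toy counts, and the choice of linear disagreement weights are all assumptions made for illustration; the abstract does not state which weighting scheme was used.

```python
from typing import Dict, Sequence


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Diagnostic metrics from a 2x2 table.

    'Positive' here stands for moderate to severe disease (Mayo 2 or 3),
    'negative' for remission (Mayo 0 or 1).
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all in remission
        "ppv": tp / (tp + fp),          # true positives among predicted positive
        "npv": tn / (tn + fn),          # true negatives among predicted negative
    }


def weighted_kappa(rater_a: Sequence[int], rater_b: Sequence[int],
                   n_levels: int = 4) -> float:
    """Weighted Cohen's kappa for ordinal scores 0..n_levels-1.

    Assumption: linear disagreement weights |i - j| / (n_levels - 1).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater_a)
    # Cross-tabulate the two raters' scores.
    observed = [[0] * n_levels for _ in range(n_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    weighted_obs = 0.0  # weighted observed disagreement
    weighted_exp = 0.0  # weighted disagreement expected by chance
    for i in range(n_levels):
        for j in range(n_levels):
            w = abs(i - j) / (n_levels - 1)
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j] / n
    if weighted_exp == 0:
        raise ValueError("kappa is undefined when all scores fall in one level")
    return 1.0 - weighted_obs / weighted_exp


# Toy counts for illustration only; these are not the study's data.
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

With these toy counts, sensitivity is 80 / (80 + 20) and specificity is 90 / (90 + 10). A κ of 1 indicates perfect agreement between two raters, and a κ of 0 indicates agreement no better than chance.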
