- Hongyuan Gao, Erin J Aiello Bowles, David Carrell, and Diana S M Buist.
- Group Health Research Institute, Seattle, WA, USA. Electronic address: Gao.h@ghc.org.
- J Biomed Inform. 2015 Apr 1; 54: 77-84.
Objective: Structured data on mammographic findings are difficult to obtain without manual review. We developed and evaluated a rule-based natural language processing (NLP) system to extract mammographic findings from free-text mammography reports.

Materials and Methods: The NLP system extracted four mammographic findings: mass, calcification, asymmetry, and architectural distortion, using a dictionary look-up method on 93,705 mammography reports from Group Health. Status annotations and anatomical location annotation were associated to each NLP detected finding through association rules. After excluding negated, uncertain, and historical findings, affirmative mentions of detected findings were summarized. Confidence flags were developed to denote reports with highly confident NLP results and reports with possible NLP errors. A random sample of 100 reports was manually abstracted to evaluate the accuracy of the system.

Results: The NLP system correctly coded 96-99 out of our sample of 100 reports depending on findings. Measures of sensitivity, specificity and negative predictive values exceeded 0.92 for all findings. Positive predictive values were relatively low for some findings due to their low prevalence.

Discussion: Our NLP system was implemented entirely in SAS Base, which makes it portable and easy to implement. It performed reasonably well with multiple applications, such as using confidence flags as a filter to improve the efficiency of manual review. Refinements of library and association rules, and testing on more diverse samples may further improve its performance.

Conclusion: Our NLP system successfully extracts clinically useful information from mammography reports. Moreover, SAS is a feasible platform for implementing NLP algorithms.

Copyright © 2015 Elsevier Inc. All rights reserved.
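The dictionary look-up and status filtering described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SAS implementation: the term lists, the status cue words, and the names `FINDING_TERMS`, `STATUS_CUES`, `status_of`, and `extract_findings` are all hypothetical, and assigning one status per sentence is a simplifying assumption standing in for the paper's association rules.

```python
import re

# Hypothetical mini-dictionary mapping each finding to surface terms.
# The authors' actual library is not given in the abstract.
FINDING_TERMS = {
    "mass": ["mass", "masses"],
    "calcification": ["calcification", "calcifications", "microcalcifications"],
    "asymmetry": ["asymmetry", "asymmetries"],
    "architectural distortion": ["architectural distortion"],
}

# Hypothetical cue words for the three excluded statuses.
STATUS_CUES = {
    "negated": r"\b(no|without|negative for)\b",
    "uncertain": r"\b(possible|questionable|may represent)\b",
    "historical": r"\b(previously|prior|history of)\b",
}


def status_of(sentence):
    """Return the first matching status for a sentence, else 'affirmed'."""
    for status, pattern in STATUS_CUES.items():
        if re.search(pattern, sentence):
            return status
    return "affirmed"


def extract_findings(report):
    """Return the set of findings mentioned affirmatively in a report."""
    affirmed = set()
    # Assumption: sentence-level scope; every finding in a sentence
    # inherits that sentence's status.
    for sentence in re.split(r"[.;]\s*", report.lower()):
        if status_of(sentence) != "affirmed":
            continue  # drop negated, uncertain, and historical mentions
        for finding, terms in FINDING_TERMS.items():
            if any(re.search(rf"\b{re.escape(t)}\b", sentence) for t in terms):
                affirmed.add(finding)
    return affirmed


report = (
    "There is a spiculated mass in the left breast. "
    "No suspicious calcifications. "
    "Previously noted asymmetry is stable."
)
print(extract_findings(report))
```

In this example only the mass is kept, because the calcification mention is negated and the asymmetry mention is historical. A real system would need finer scoping than whole sentences, which is presumably part of what the association rules in the paper handle.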