• J Biomed Inform · Oct 2016

    Use of "off-the-shelf" information extraction algorithms in clinical informatics: A feasibility study of MetaMap annotation of Italian medical notes.

    • Emma Chiaramello, Francesco Pinciroli, Alberico Bonalumi, Angelo Caroli, and Gabriella Tognola.
    • Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni (IEIIT), Consiglio Nazionale delle Ricerche (CNR), Piazza L. da Vinci, 32, 20133 Milano, Italy.
    • J Biomed Inform. 2016 Oct 1; 63: 22-32.

    Abstract

    Information extraction from narrative clinical notes is useful for patient care, as well as for secondary use of medical data, for research or clinical purposes. Many studies have focused on information extraction from English clinical texts, but fewer have dealt with clinical notes in languages other than English. This study tested the feasibility of using "off the shelf" information extraction algorithms to identify medical concepts in Italian clinical notes. Among the available and well-established information extraction algorithms, we used MetaMap to map medical concepts to the Unified Medical Language System (UMLS). The study addressed two questions: (Q1) whether medical terms found in clinical notes and belonging to the "Disorders" semantic group can be properly mapped to the Italian UMLS resources; (Q2) whether it is feasible to use MetaMap as it is to extract these medical concepts from Italian clinical notes. We performed three experiments: in EXP1, we investigated how many medical concepts of the "Disorders" semantic group found in a set of clinical notes written in Italian could be mapped to the UMLS Italian medical sources; in EXP2, we assessed how the different processing steps used by MetaMap, which are English dependent, could be applied to Italian texts to map the original clinical notes to the Italian UMLS sources; in EXP3, we automatically translated the clinical notes from Italian to English using Google Translator, and then used MetaMap to map the translated texts. Results in EXP1 showed that the Italian UMLS Metathesaurus sources covered 91% of the medical terms of the "Disorders" semantic group found in the studied dataset. We observed that even though MetaMap was built to analyze texts written in English, most of its processing steps also worked properly with texts written in Italian. MetaMap correctly identified about half of the concepts in the Italian clinical notes. Using MetaMap's annotation on Italian clinical notes instead of a simple text search improved our results by about 15 percentage points. MetaMap's annotation of Italian clinical notes showed recall, precision and F-measure equal to 0.53, 0.98 and 0.69, respectively. Most of the failures were due to MetaMap's inability to generate meaningful variants for the Italian language, suggesting that modifying MetaMap to generate Italian variants could improve its performance. MetaMap's performance in annotating automatically translated English clinical notes was in line with findings in the literature, with similar recall (0.75) and F-measure (0.83) and even higher precision (0.95). Most of the failures were due to poor Italian-to-English translation of medical terms, suggesting that an automatic translation tool specialized in translating medical concepts might yield better performance. In conclusion, the performance obtained using MetaMap on the fully automatic translation of the Italian text is good enough to allow MetaMap to be used "as it is" in clinical practice.

    Copyright © 2016 Elsevier Inc. All rights reserved.
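The reported F-measures can be cross-checked against the reported precision and recall. The sketch below assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the dictionary `reported` are illustrative, not from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 score: harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1 (beta = 1).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are 0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, rounded to 2 decimals
reported = {
    "MetaMap on Italian notes": (0.98, 0.53),
    "MetaMap on translated notes": (0.95, 0.75),
}

for setting, (precision, recall) in reported.items():
    print(f"{setting}: F1 = {f_measure(precision, recall):.2f}")
```

Under this assumption, the Italian-notes figures reproduce the reported 0.69. The translated-notes figures give about 0.84 rather than the reported 0.83; the gap is small enough to be explained by the precision and recall values having been rounded before publication.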
