• J Eval Clin Pract · Oct 2024

    Improving clinical abbreviation sense disambiguation using attention-based Bi-LSTM and hybrid balancing techniques in imbalanced datasets.

    • Manda Hosseini, Amir Hossein Rasekh, and Amin Keshavarzi.
    • Department of Computer Engineering, Zand Institute of Higher Education, Shiraz, Iran.
    • J Eval Clin Pract. 2024 Oct 1; 30 (7): 132713361327-1336.

    RationaleClinical abbreviations pose a challenge for clinical decision support systems due to their ambiguity. Additionally, clinical datasets often suffer from class imbalance, hindering the classification of such data. This imbalance leads to classifiers with low accuracy and high error rates. Traditional feature-engineered models struggle with this task, and class imbalance is a known factor that reduces the performance of neural network techniques.Aims And ObjectivesThis study proposes an attention-based bidirectional long short-term memory (Bi-LSTM) model to improve clinical abbreviation disambiguation in clinical documents. We aim to address the challenges of limited training data and class imbalance by employing data generation techniques like reverse substitution and data augmentation with synonym substitution.MethodWe utilise a Bi-LSTM classification model with an attention mechanism to disambiguate each abbreviation. The model's performance is evaluated based on accuracy for each abbreviation. To address the limitations of imbalanced data, we employ data generation techniques to create a more balanced dataset.ResultsThe evaluation results demonstrate that our data balancing technique significantly improves the model's accuracy by 2.08%. Furthermore, the proposed attention-based Bi-LSTM model achieves an accuracy of 96.09% on the UMN dataset, outperforming state-of-the-art results.ConclusionDeep neural network methods, particularly Bi-LSTM, offer promising alternatives to traditional feature-engineered models for clinical abbreviation disambiguation. By employing data generation techniques, we can address the challenges posed by limited-resource and imbalanced clinical datasets. This approach leads to a significant improvement in model accuracy for clinical abbreviation disambiguation tasks.© 2024 John Wiley & Sons Ltd.

      Pubmed     Copy Citation     Plaintext  

      Add institutional full text...

    Notes

     
    Knowledge, pearl, summary or comment to share?
    300 characters remaining
    help        
    You can also include formatting, links, images and footnotes in your notes
    • Simple formatting can be added to notes, such as *italics*, _underline_ or **bold**.
    • Superscript can be denoted by <sup>text</sup> and subscript <sub>text</sub>.
    • Numbered or bulleted lists can be created using either numbered lines 1. 2. 3., hyphens - or asterisks *.
    • Links can be included with: [my link to pubmed](http://pubmed.com)
    • Images can be included with: ![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
    • For footnotes use [^1](This is a footnote.) inline.
    • Or use an inline reference [^1] to refer to a longer footnote elseweher in the document [^1]: This is a long footnote..

    hide…