Journal of Biomedical Informatics
-
Instruments rating risk of harm to self and others are widely used in inpatient forensic psychiatry settings. A potential alternative or supplementary means of risk prediction is the automated analysis of case notes in Electronic Health Records (EHRs) using Natural Language Processing (NLP). This exploratory study rated the presence or absence and the frequency of words in a forensic EHR dataset, comparing four reference dictionaries. Seven machine learning algorithms and different time periods of EHR analysis were used to probe which dictionary and which time period were most predictive of risk assessment scores on validated instruments. ⋯ NLP, used in conjunction with NLP dictionaries and machine learning, predicted risk ratings on the HCR-20, START, and DASA, based on EHR content. Further research is required to ascertain the utility of NLP approaches in predicting endpoints of actual self-harm, harm to others, or victimisation.
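
The word-rating step described above (presence or absence, and frequency, of dictionary terms in case notes) can be illustrated with a minimal Python sketch. This is not the study's implementation: the three-word dictionary, the sample note, and the function name `dictionary_features` are all hypothetical, chosen only to show how a note might be turned into features that a machine learning algorithm could then consume.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary for illustration only; the study's four
# reference dictionaries are not reproduced here.
EXAMPLE_DICTIONARY = {"angry", "threat", "calm"}


def dictionary_features(note, dictionary):
    """Return (presence, frequency) maps for each dictionary term in a note."""
    # Lowercase the note and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", note.lower())
    counts = Counter(tokens)
    # Frequency: how many times each dictionary term occurs in the note.
    frequency = {term: counts[term] for term in sorted(dictionary)}
    # Presence: 1 if the term occurs at least once, otherwise 0.
    presence = {term: int(n > 0) for term, n in frequency.items()}
    return presence, frequency


# Invented sample note, not drawn from any real record.
note = "Patient was angry at lunch, later calm. Angry again in the evening."
presence, frequency = dictionary_features(note, EXAMPLE_DICTIONARY)
print(presence)
print(frequency)
```

In a pipeline of the kind the abstract outlines, vectors like these would presumably be computed per patient over a chosen time window of notes and paired with the corresponding HCR-20, START, or DASA rating as the prediction target; that modelling step is not shown here.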