A hybrid data mining model for diagnosis of patients with clinical suspicion of dementia.
Moreira LB (1), Namen AA (2)
- (1) Postgraduate Program in Cognition and Language, North Fluminense State University – UENF, Av. Alberto Lamego, 2000 – Parque Califórnia – CEP 28013-602, Campos dos Goitacazes, Rio de Janeiro, Brazil; Computer Modelling Department, State of Rio de Janeiro University, Rua Bonfim, 25 – Vila Amélia – CEP 28625-570 – Nova Friburgo, Rio de Janeiro, Brazil. Electronic address: email@example.com.
- (2) Computer Modelling Department, State of Rio de Janeiro University, Rua Bonfim, 25 – Vila Amélia – CEP 28625-570 – Nova Friburgo, Rio de Janeiro, Brazil; Veiga de Almeida University, Rua Ibituruna, 108 – Maracanã – CEP 20271-020, Rio de Janeiro, Brazil. Electronic address: firstname.lastname@example.org.
BACKGROUND AND OBJECTIVE: As the population ages, dementias have become a complex health problem worldwide. Several machine learning methods have been applied to the task of predicting dementias. Because diagnosis is complex, the great challenge lies in distinguishing patients with some type of dementia from healthy people. Diagnosis, particularly in the early stages, improves the quality of life of both the patient and the family. This work presents a hybrid data mining model that integrates text mining with the mining of structured data. The model aims to assist specialists in diagnosing patients with clinical suspicion of dementia.
METHODS: The experiments were conducted on a set of 605 medical records, described by 19 attributes, of patients with reported cognitive decline. First, a new structured attribute was created through a text mining process: it resulted from clustering the patient’s pathological history, which was stored in an unstructured textual attribute. Classification algorithms (naïve Bayes, Bayesian belief networks and decision trees) were applied to obtain predictive models for Alzheimer’s disease and mild cognitive impairment. Ensemble methods (Bagging, Boosting and Random Forests) were used to improve the accuracy of the generated models. These methods were applied to two datasets: one containing only the original structured data, and the other containing the original structured data plus the new attribute produced by the text mining (hybrid model).
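The construction of the hybrid dataset can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the text representation, or the number of clusters, so the choices here (bag-of-words counts, plain k-means, k = 2) are assumptions made only for illustration. The records, the field names `age` and `history`, and the new attribute name `history_cluster` are hypothetical and do not come from the study.

```python
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count tokens."""
    return Counter(text.lower().split())

def squared_distance(a, b):
    """Squared Euclidean distance between two sparse count vectors."""
    return sum((a.get(t, 0) - b.get(t, 0)) ** 2 for t in set(a) | set(b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: c / len(vectors) for t, c in total.items()}

def kmeans(vectors, k, iterations=10):
    """Plain k-means. The first k vectors seed the centroids, which keeps
    the sketch deterministic (an assumption, not the study's procedure)."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Recompute each centroid from its current members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels

def add_cluster_attribute(records, text_field, k):
    """Replace the free-text field with a structured cluster-id attribute."""
    vectors = [bag_of_words(r[text_field]) for r in records]
    labels = kmeans(vectors, k)
    hybrid = []
    for record, label in zip(records, labels):
        row = {key: val for key, val in record.items() if key != text_field}
        row["history_cluster"] = label  # hypothetical attribute name
        hybrid.append(row)
    return hybrid

# Hypothetical toy records; not data from the study.
records = [
    {"age": 78, "history": "hypertension diabetes"},
    {"age": 81, "history": "depression anxiety"},
    {"age": 74, "history": "diabetes hypertension stroke"},
    {"age": 69, "history": "anxiety depression insomnia"},
]

hybrid = add_cluster_attribute(records, "history", k=2)
for row in hybrid:
    print(row)
```

The resulting rows contain only structured attributes, so the same classifiers can be trained once on the original structured columns and once on the rows that include `history_cluster`, which is the comparison the two datasets set up.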
RESULTS: The accuracy metrics of the models obtained from the two datasets were compared. The results showed that the hybrid model was more effective in the diagnostic prediction of the pathologies of interest.
CONCLUSIONS: Across the classification and clustering methods used, the best precision and sensitivity for the pathologies under study were obtained with hybrid models supported by ensemble methods.
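For reference, precision and sensitivity for a single diagnostic class can be computed as in the sketch below. It uses the standard definitions (precision = TP / (TP + FP), sensitivity = TP / (TP + FN)). The function name, the class labels `AD` and `MCI`, and the label lists are hypothetical and are not results from the study.

```python
def precision_and_sensitivity(actual, predicted, positive):
    """Return (precision, sensitivity) for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity

# Hypothetical labels: AD = Alzheimer's disease, MCI = mild cognitive impairment.
actual = ["AD", "AD", "MCI", "AD", "MCI", "MCI"]
predicted = ["AD", "MCI", "MCI", "AD", "MCI", "MCI"]

precision, sensitivity = precision_and_sensitivity(actual, predicted, positive="AD")
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

In this toy case every predicted AD label is correct, but one true AD case is missed, so precision is higher than sensitivity.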
Copyright © 2018. Published by Elsevier B.V.