Technologies for complex intelligent clinical data analysis

A. A. Baranov, L. S. Namazova-Baranova, I. V. Smirnov, D. A. Devyatkin, A. O. Shelmanov, E. A. Vishneva, E. V. Antonova, V. I. Smirnov

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)


The paper presents a system for intelligent analysis of clinical information. The authors describe the methods implemented in the system for clinical information retrieval, intelligent diagnostics of chronic diseases, estimating the importance of patient features, and detecting hidden dependencies between features. Results of an experimental evaluation of these methods are also presented.

Background: Healthcare facilities generate a large flow of both structured and unstructured data containing important information about patients. Test results are usually stored as structured data, but some data is stored as natural language text (medical history, the results of physical examination, and the results of other examinations such as ultrasound, ECG or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for intelligent analysis of the accumulated structured and unstructured data, which leads to improved healthcare quality.

Aims: To create a complex system for intelligent data analysis in a multidisciplinary pediatric center.

Materials and methods: The authors propose methods for information extraction from clinical texts in Russian. The methods are based on deep linguistic analysis. They retrieve terms denoting diseases, symptoms, areas of the body and drugs. The methods can also recognize additional attributes such as «negation» (indicates that the disease is absent), «no patient» (indicates that the disease refers to a member of the patient's family rather than to the patient), «severity of illness», «disease course», and «body region to which the disease refers». To retrieve information, the authors use a set of hand-crafted templates and various machine learning techniques together with a medical thesaurus. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases. A machine learning method for classifying patients with similar nosologies and a method for determining the most informative patient features are also proposed.

Results: The authors processed anonymized health records from the pediatric center to evaluate the proposed methods. The results show that the information extracted from the texts is applicable to practical problems. The records of patients with allergic, glomerular and rheumatic diseases were used for experimental assessment of the automatic diagnosis method. The authors also determined the most appropriate machine learning methods for classifying patients in each group of diseases, as well as the most informative disease signs. Using additional information extracted from clinical texts together with structured data was found to improve the quality of diagnosis of chronic diseases. The authors also obtained characteristic combinations of disease signs.

Conclusions: The proposed methods have been implemented in the intelligent data processing system for a multidisciplinary pediatric center. The experimental results show that the system can be used to improve the quality of pediatric healthcare.
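As a rough illustration of the kind of thesaurus lookup with «negation» and «no patient» attributes described under Materials and methods, the sketch below tags terms in a short note. It is not the authors' method: the paper works on Russian texts with deep linguistic analysis, hand-crafted templates and machine learning, whereas this toy uses English, a three-entry hypothetical thesaurus, and a naive fixed-size token window. All names (`THESAURUS`, `NEGATION_CUES`, `FAMILY_CUES`, `extract`) are assumptions made for the example.

```python
import re

# Hypothetical mini-thesaurus: surface form -> (category, canonical term).
THESAURUS = {
    "asthma": ("disease", "asthma"),
    "eczema": ("disease", "eczema"),
    "cough": ("symptom", "cough"),
}
# Hypothetical cue words; a real system would use linguistic analysis instead.
NEGATION_CUES = ("no", "denies", "without")
FAMILY_CUES = ("mother", "father", "brother", "sister")


def extract(text, window=3):
    """Return thesaurus terms found in text with negation / family flags.

    A term is flagged when a cue word occurs among the `window` tokens
    that precede it. The window ignores sentence boundaries.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for i, tok in enumerate(tokens):
        if tok not in THESAURUS:
            continue
        category, term = THESAURUS[tok]
        context = tokens[max(0, i - window):i]
        found.append({
            "term": term,
            "category": category,
            "negated": any(c in context for c in NEGATION_CUES),
            "not_patient": any(c in context for c in FAMILY_CUES),
        })
    return found


mentions = extract("Mother has asthma. Patient denies cough.")
for m in mentions:
    print(m)
```

Because the window ignores sentence boundaries and syntax, this toy will mislabel many real sentences, which is one reason a pipeline like the one described relies on linguistic analysis and learned models rather than cue-word proximity.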

Original language: English
Pages (from-to): 160-171
Number of pages: 12
Journal: Vestnik Rossiiskoi Akademii Meditsinskikh Nauk
Issue number: 2
Publication status: Published - 2016
Externally published: Yes


  • Data mining in healthcare
  • Hospital information system
  • Information extraction
  • Natural language processing of clinical texts

