Unsupervised Anomaly Detection for Discrete Sequence Healthcare Data

Victoria Snorovikhina, Alexey Zaytsev

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)


Fraud in healthcare is widespread, as doctors can prescribe unnecessary treatments to inflate bills. Insurance companies want to detect these anomalous, fraudulent bills and reduce their losses. Traditional fraud detection methods rely on expert rules and manual data processing. More recently, machine learning techniques have automated this process, but hand-labeled data is extremely costly and usually out of date. We propose a machine learning model that automates fraud detection in an unsupervised way. We consider two deep learning approaches: an LSTM neural network that predicts the next patient visit, and a seq2seq model. To normalize the produced anomaly scores, we propose an Empirical Distribution Function (EDF) approach. As a result, the algorithm can handle problems with high class imbalance. For validation, we use real data on sequences of patients' visits from Allianz. The models provide state-of-the-art results for unsupervised anomaly detection applied to fraud detection in healthcare. Our EDF approach further improves the quality of the LSTM model.
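The abstract does not spell out how the EDF normalization is applied, so the following is only a minimal sketch of one plausible reading: raw anomaly scores are mapped to the fraction of reference scores that do not exceed them, which places every score on a common [0, 1] scale. The function names `fit_edf` and `edf_normalize` and the sample scores are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from typing import List, Sequence


def fit_edf(reference_scores: Sequence[float]) -> List[float]:
    """Sort the reference scores that define the empirical distribution function."""
    if not reference_scores:
        raise ValueError("reference_scores must be non-empty")
    return sorted(reference_scores)


def edf_normalize(sorted_reference: Sequence[float], score: float) -> float:
    """Return F_n(score): the fraction of reference scores <= score, in [0, 1]."""
    return bisect_right(sorted_reference, score) / len(sorted_reference)


# Hypothetical raw anomaly scores (assumption: something like per-visit
# negative log-likelihoods from a sequence model); not data from the paper.
reference = fit_edf([0.2, 0.5, 0.5, 1.1, 3.0])

# Normalize new raw scores against the reference distribution.
normalized = [edf_normalize(reference, s) for s in (0.1, 0.5, 2.0, 10.0)]
print(normalized)
```

Under this reading, a normalized value close to 1 means the raw score is larger than almost all reference scores, so scores coming from different visit types or models could be compared on the same scale.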

Original language: English
Title of host publication: Analysis of Images, Social Networks and Texts - 9th International Conference, AIST 2020, Revised Selected Papers
Editors: Wil M. van der Aalst, Vladimir Batagelj, Dmitry I. Ignatov, Michael Khachay, Olessia Koltsova, Andrey Kutuzov, Sergei O. Kuznetsov, Irina A. Lomazova, Natalia Loukachevitch, Amedeo Napoli, Alexander Panchenko, Panos M. Pardalos, Marcello Pelillo, Andrey V. Savchenko, Elena Tutubalina
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 13
ISBN (Print): 9783030726096
Publication status: Published - 2021
Event: 9th International Conference on Analysis of Images, Social Networks and Texts, AIST 2020 - Moscow, Russian Federation
Duration: 15 Oct 2020 to 16 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12602 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 9th International Conference on Analysis of Images, Social Networks and Texts, AIST 2020
Country/Territory: Russian Federation


  • Deep learning
  • Discrete sequence data
  • Unsupervised anomaly detection


