Development of infrastructure for anomalies detection in big data

Iuliia L. Khlevna
Bohdan S. Koval

Abstract

This work analyses models, methods, and technologies for detecting anomalies in data. Based on that analysis, it concludes that anomaly detection should be treated as a complex technology that combines the construction and application of mathematical models with the study of data processing approaches. The article reviews the current state of big data stream processing technologies and describes the distinctive features of the most widely used and advanced of them, such as Apache Hadoop, Apache Spark, Apache Cassandra, Apache Kafka, Apache Storm, and Apache Beam. It also considers the infrastructure in which the resulting software models can be deployed and used, taking into account the high-load, real-time nature of the data. The article proposes building an infrastructure for anomaly detection as an applied example of a cloud infrastructure for big data processing. It presents the developed infrastructure model for anomaly detection in real-time streaming data, which is based on an expert method of forming requirements for a software component, choosing an anomaly detection algorithm, selecting tools, and improving the algorithm. The selected tools make it possible to build a secure real-time anomaly detection solution using Dataflow, BigQuery ML, and Cloud DLP. The paper presents an applied implementation of real-time anomaly detection on Google Cloud Platform (GCP) with Apache Beam: stream analysis of software logs in an information system and detection of fraudulent entries among them, which can help improve the system's cybersecurity. Finally, the work outlines possible improvements to the basic model that could speed it up.
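As a rough illustration of what flagging anomalies in a stream of log records can look like, the sketch below keeps a running mean and standard deviation of one numeric feature and flags values that deviate strongly from the baseline. It is not the authors' method: the abstract names Dataflow, BigQuery ML, and Cloud DLP but does not specify the algorithm, so the running z-score used here is a stand-in, and the names `RunningStats`, `flag_anomalies`, `threshold`, and `warmup` are hypothetical. It uses only the Python standard library rather than Apache Beam.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class RunningStats:
    """Welford's online mean/variance, updated one value at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def std(self) -> float:
        # Sample standard deviation; 0.0 until two values have been seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def flag_anomalies(
    values: Iterable[float],
    threshold: float = 3.0,  # assumed z-score cutoff, not taken from the paper
    warmup: int = 5,         # assumed number of values needed before flagging
) -> Iterator[Tuple[float, bool]]:
    """Yield (value, is_anomaly) for each value in arrival order."""
    stats = RunningStats()
    for value in values:
        std = stats.std()
        is_anomaly = (
            stats.count >= warmup
            and std > 0
            and abs(value - stats.mean) / std > threshold
        )
        yield value, is_anomaly
        # Only normal-looking values update the baseline, so a single
        # outlier does not inflate the mean and variance.
        if not is_anomaly:
            stats.update(value)


if __name__ == "__main__":
    # Hypothetical feature: bytes transferred per log record.
    stream = [100, 102, 98, 101, 99, 100, 5000, 101]
    for value, is_anomaly in flag_anomalies(stream):
        print(value, "ANOMALY" if is_anomaly else "ok")
```

In a real streaming pipeline the same per-record logic would sit inside a windowed or stateful transform, and the baseline would typically be kept per key (for example per user or per service) instead of globally.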

Article Details

Section

Software engineering and systems analysis

Author Biographies

Iuliia L. Khlevna, Taras Shevchenko National University of Kyiv, 60, Volodymyrska Str. Kyiv, 01033, Ukraine

Doctor of Engineering Sciences, Associate Professor, Professor of the Department of Technologies Management

Scopus Author ID: 57191869873

Bohdan S. Koval, Taras Shevchenko National University of Kyiv, 60, Volodymyrska Str. Kyiv, 01033, Ukraine

Postgraduate student, Department of Technology Management

Scopus Author ID: 57200141737
