Development of a unified feature space for ensemble classification of polystructural heterogeneous small data

Olena O. Arsirii
Oleksandr K. Andronati

Abstract

Relevance. The paper addresses the problem of classifying polystructural heterogeneous small data, which include structured tabular information and unstructured audio signals. Objective. The study proposes a unified methodological framework for constructing an ensemble classification system based on a common feature space that ensures compatibility between different data types and machine learning models. Main Research Material. A unified feature space model is developed that incorporates a linear representation for structured data and a three-level representation for audio data: tabular spectral features, a spectrogram-based image representation, and temporal sequences of features. To improve the quality of the input data, a Backward Feature Elimination procedure is applied to remove non-informative features and to adapt feature subsets to the specifics of individual classifiers. The classification framework is based on a stacking ensemble architecture that combines multiple base models, including classical machine learning algorithms and deep learning models. Three aggregation strategies are considered: hard voting, soft voting, and soft voting with Gompertz fuzzy ranking, which enables nonlinear adjustment of classifier probabilities and improves robustness under uncertainty. Results. Experimental evaluation was conducted on five datasets from different domains, including healthcare, finance, audio signal analysis, and deepfake detection. The results demonstrate that the proposed approach consistently improves classification performance compared to individual models. The application of feature selection and the integration of ensemble methods provide significant gains for polystructural data. Conclusions. The proposed model offers a flexible and scalable solution for handling heterogeneous small data and can be effectively applied across multiple domains, providing improved generalization, robustness to noise, and adaptability to different data representations.
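
The three aggregation strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so everything here is an assumption: the function names are hypothetical, the Gompertz rank uses the form 1 - exp(-exp(-2p)) that appears in fuzzy-rank fusion work, and the final score (rank sum multiplied by the complement of the mean probability, lowest score wins) is one possible fusion rule rather than the authors' exact method.

```python
import math

def hard_vote(probas):
    """Majority vote over each model's top class; ties go to the lowest class index."""
    n_classes = len(probas[0])
    counts = [0] * n_classes
    for p in probas:
        counts[max(range(n_classes), key=lambda c: p[c])] += 1
    return max(range(n_classes), key=lambda c: counts[c])

def soft_vote(probas):
    """Pick the class with the highest mean predicted probability."""
    n_classes = len(probas[0])
    mean = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])

def gompertz_rank(p):
    """Assumed fuzzy rank: decreasing in p, so a smaller rank means higher confidence."""
    return 1.0 - math.exp(-math.exp(-2.0 * p))

def gompertz_vote(probas):
    """Assumed fusion rule: rank sum times (1 - mean probability); lowest score wins."""
    n_classes = len(probas[0])
    scores = []
    for c in range(n_classes):
        rank_sum = sum(gompertz_rank(p[c]) for p in probas)
        complement = 1.0 - sum(p[c] for p in probas) / len(probas)
        scores.append(rank_sum * complement)
    return min(range(n_classes), key=lambda c: scores[c])

# Hypothetical class probabilities from three base models for one sample:
# two models lean weakly toward class 0, one is strongly confident in class 1.
probas = [[0.55, 0.45], [0.55, 0.45], [0.05, 0.95]]
print(hard_vote(probas), soft_vote(probas), gompertz_vote(probas))
```

In this toy case hard voting follows the two weak votes, while both probability-based rules follow the single confident model, which is the kind of difference the comparison of aggregation strategies is meant to expose.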

Article Details

Section

Computer science and software engineering

Author Biographies

Olena O. Arsirii, Odesa Polytechnic National University, 1 Shevchenko Ave., Odesa, 65044, Ukraine

Doctor of Engineering Sciences, Professor, Head of Department of Information Systems

Scopus Author ID: 54419480900

Oleksandr K. Andronati, Odesa Polytechnic National University, 1 Shevchenko Ave., Odesa, 65044, Ukraine

Graduate student, Department of Information Systems

Scopus Author ID: 58677655800
