Rational data sampling for surrogate modeling of the stress-strain state of plate elements


Roman L. Onatskyi
Seghii Yu. Misiura

Abstract

Background. Surrogate models based on machine learning are a promising alternative to computationally expensive finite element analysis for predicting the stress-strain state of structural elements. However, their efficiency depends critically on the quality of the training data, while most existing approaches rely on abstract stochastic sampling methods that ignore the physical nature of engineering parameters. Aim. The aim of the study is to conduct a comparative analysis of the efficiency of stochastic versus physically oriented data sampling strategies for surrogate modelling of the stress-strain state of plate elements, shifting the focus from increasing model complexity to developing an approach for rational dataset formation that integrates a priori engineering knowledge. Methods. Using surrogate modelling of the deflection of a thin, rigidly clamped plate as an example, two fundamentally different sampling strategies are compared: the classical strategy, based on Sobol quasi-random sequences, and the proposed rational strategy. The latter is based on the real empirical properties of structural steels and accounts for the standardized plate thicknesses of sheet-metal assortments. To quantify the impact of the data structure on prediction accuracy, an intelligent modelling procedure was developed. The following machine learning algorithms were used as test models for the regression task: k-nearest neighbours, random forest, XGBoost, Gaussian process regression, and multilayer perceptron. Results. Switching to the rational strategy reduced the mean squared error consistently across all evaluated models, by a factor of 3.5–4 compared to the baseline strategy. The rational dataset increased the coefficient of determination by 27% for k-nearest neighbours, 23% for random forest, and 21% for XGBoost. The highest accuracy was achieved by the Gaussian process regression (R² = 0.99) and multilayer perceptron (R² = 0.98) models. The low efficiency of traditional samples is shown to stem from the fact that more than 50% of the samples contain combinations of geometric dimensions and material properties that correspond to no real engineering standard; these unrealistic combinations create "information noise" during training. Conclusions. The proposed rational sampling approach serves as a basis for creating robust AI tools for the rapid diagnostics of engineering structures, providing high model generalisation capability from a smaller volume of input data.
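The contrast between the two strategies can be sketched in a few lines of Python. The parameter ranges, the list of standardized sheet thicknesses, the discrete steel grades, and the uniform load below are illustrative assumptions, not values taken from the article; the deflection formula is the classical thin-plate result for a rigidly clamped square plate under uniform load (Timoshenko's coefficient ≈ 0.00126), used here only as a stand-in for the finite element "ground truth".

```python
import numpy as np
from scipy.stats import qmc

# Assumed design-space bounds (illustrative): plate side a [m],
# thickness h [m], Young's modulus E [Pa].
BOUNDS_LOW = [0.5, 0.004, 190e9]
BOUNDS_HIGH = [2.0, 0.020, 220e9]

# Rational strategy draws from real discrete assortments (illustrative values):
STD_THICKNESS_MM = np.array([4, 5, 6, 8, 10, 12, 14, 16, 18, 20])  # sheet-metal assortment
STEEL_E = np.array([200e9, 206e9, 210e9])                          # typical steel grades

def sobol_sample(n, seed=0):
    """Classical strategy: Sobol quasi-random points over the full continuous box.

    Most points combine thicknesses and moduli that no standard actually offers.
    """
    sampler = qmc.Sobol(d=3, scramble=True, seed=seed)
    return qmc.scale(sampler.random(n), BOUNDS_LOW, BOUNDS_HIGH)

def rational_sample(n, seed=0):
    """Rational strategy: thickness and modulus drawn only from real assortments."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.5, 2.0, n)                     # plate side, continuous
    h = rng.choice(STD_THICKNESS_MM, n) * 1e-3       # standardized thicknesses only
    E = rng.choice(STEEL_E, n)                       # real steel grades only
    return np.column_stack([a, h, E])

def clamped_plate_deflection(a, h, E, q=10e3, nu=0.3):
    """Max deflection of a rigidly clamped square plate under uniform load q."""
    D = E * h**3 / (12 * (1 - nu**2))  # flexural rigidity
    return 0.00126 * q * a**4 / D

X_sobol = sobol_sample(256)
X_rat = rational_sample(256)
y_rat = clamped_plate_deflection(X_rat[:, 0], X_rat[:, 1], X_rat[:, 2])
```

Either sample, labelled with the deflection response, could then be fed to the regressors named in the abstract (e.g. scikit-learn's `GaussianProcessRegressor`); the article's comparison concerns only how the input points are chosen, not the models themselves.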

Article Details

Section

Computer science and software engineering

Authors

Author Biographies

Roman L. Onatskyi, National Technical University “Kharkiv Polytechnic Institute”, 2, Kyrpychova Str., Kharkiv, 61002, Ukraine

PhD student, Department of Mathematical Modelling and Intelligent Computing in Engineering

Seghii Yu. Misiura, National Technical University “Kharkiv Polytechnic Institute”, 2, Kyrpychova Str., Kharkiv, 61002, Ukraine

Candidate of Engineering Sciences, Senior researcher, Associate Professor, Department of Mathematical Modelling and Intelligent Computing in Engineering

Scopus Author ID: 56416492100
