The use of control theory methods in neural networks’ training based on a handwritten text

Authors

DOI:

https://doi.org/10.15276/aait.03.2021.3

Keywords:

Saddle Points, Neural Networks, Discrete Systems, Chaos Control

Abstract

The paper shows why reducing neural network training time matters at the present stage and what role new optimization methods play in training. It studies a modification of stochastic gradient descent based on representing gradient descent as a discrete dynamical system. A consequence of this representation is a connection between the extreme points to which the gradient descent iterations tend and the stationary points of the corresponding discrete dynamical system. A stabilizing scheme with predictive control was then applied; its theoretical apparatus was developed by means of geometric complex analysis together with solving optimization problems on a set of polynomials with real coefficients. With this scheme, a multilayer perceptron for recognizing handwritten digits was trained many times faster. The software implementation of the new algorithm used the PyTorch library, created for research in the field of neural networks. All experiments were run on an NVIDIA graphics processing unit to check the processing unit’s resource consumption. The numerical experiments did not reveal any deviation in training time. There was a slight increase in video memory use, which was expected because the new algorithm retains one additional copy of the perceptron’s internal parameters. The importance of this result is associated with the growth in the use of deep neural network technology, which grew three-hundred-thousand-fold from 2012 to 2018, and with the associated resource consumption. This situation forces the industry to consider training optimization alongside accuracy. Therefore, any acceleration of the training process that reduces cluster time or resources is a desirable and important result, and it was achieved in this article. The results open a new area of theoretical and practical research, since the stabilization used is only one of the methods of stabilization and cycle search in control theory. These good practical results confirm the need to add lagging control and to run additional experiments with both predictive and lagging control elements.
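
The dynamical-system view mentioned in the abstract can be illustrated with a minimal sketch. It treats one gradient descent step as a map F(x) = x - lr * grad_f(x) and checks that a fixed point of F is a stationary point of the objective. The toy quadratic objective, the step size, and all function names are assumptions made for illustration; this is plain gradient descent, not the paper's predictive-control stabilization scheme, whose details are not given here.

```python
# Gradient descent viewed as a discrete dynamical system x_{k+1} = F(x_k).
# Illustrative only: the objective and step size below are assumptions,
# not taken from the paper.

def grad_f(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


def gd_map(x, lr=0.1):
    """One gradient descent step written as the map F(x) = x - lr * grad_f(x)."""
    return x - lr * grad_f(x)


def iterate(x0, steps=200, lr=0.1):
    """Iterate the map F starting from x0 and return the final state."""
    x = x0
    for _ in range(steps):
        x = gd_map(x, lr)
    return x


x_star = iterate(x0=0.0)

# At a fixed point F(x*) = x*, which is equivalent to grad_f(x*) = 0,
# so the residual below should be numerically zero.
fixed_point_residual = abs(gd_map(x_star) - x_star)
```

In this toy case the iterations settle at the minimizer x = 3, where the map no longer moves the state. A stabilizing control scheme would modify the map F itself, which is outside the scope of this sketch.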

Author Biography

Andrii V. Smorodin, Odessa National Polytechnic University. 1, Shevchenko Ave., Odessa, 65044, Ukraine

PhD student at the Institute of Computer Systems

Published

2021-03-15

How to Cite

[1]
Smorodin A.V. “The use of control theory methods in neural networks’ training based on a handwritten text”. Applied Aspects of Information Technology. 2021; Vol. 4, No. 3: 243–249. DOI: https://doi.org/10.15276/aait.03.2021.3.