Models of semantic analysis of product descriptions for automatic determination of customs codes
Abstract
In today's global trade environment, accurate and fast classification of goods by Harmonized System (HS) codes is crucial for the efficient functioning of customs processes. Growing data volumes and increasingly complex product descriptions create a need for intelligent methods of analyzing textual information. Natural Language Processing (NLP) technologies combined with machine learning open up new opportunities for automating the determination of customs codes and reducing the influence of the human factor in customs classification. The aim of the study is to increase the accuracy of automated determination of Harmonized System codes by applying modern models of semantic analysis to text descriptions of goods. To achieve this aim, the following tasks were set: to analyze existing natural language processing models, to investigate their effectiveness for customs classification tasks, to form the optimal model architecture, and to evaluate its advantages over traditional algorithms. The work uses semantic modeling and machine learning methods. An experimental approach was used to combine semantic models with classification algorithms, in particular logistic regression, decision trees, and neural networks. The evaluation was carried out using precision, recall, and F-measure. The results showed that contextual embeddings, in particular the BERT model, provide a significant improvement in the accuracy of automated classification of goods compared with traditional statistical methods. The proposed generalized model, which combines semantic analysis with machine learning, makes it possible to increase the rate of correct assignment of customs codes from text descriptions, even when the data are ambiguous or incomplete. The study confirmed the feasibility of integrating natural language processing technologies into customs classification systems.
The scientific novelty lies in the development of a hybrid model that combines semantic text representations and classification algorithms, which increases the accuracy and efficiency of automated determination of customs codes. The practical significance of the work lies in the possibility of implementing the proposed approach in "smart customs" systems to optimize control processes and accelerate the clearance of goods.
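As an illustration of the evaluation step, the sketch below computes macro-averaged metrics for predicted codes. It reads the evaluation indicators named in the abstract as precision, recall, and F-measure, which is an assumption about the intended terms. The function name `macro_precision_recall_f1`, the choice of macro averaging, and the sample four-digit headings are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_precision_recall_f1(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all labels seen.

    Hypothetical helper for illustration; the study's own evaluation
    procedure is not described in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for true_code, pred_code in zip(y_true, y_pred):
        if true_code == pred_code:
            tp[true_code] += 1
        else:
            fp[pred_code] += 1  # predicted code was wrong for this item
            fn[true_code] += 1  # true code was missed for this item

    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0, 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Illustrative data only: four product descriptions with made-up predictions.
true_codes = ["0901", "0901", "0902", "8471"]
pred_codes = ["0901", "0902", "0902", "8471"]
precision, recall, f1 = macro_precision_recall_f1(true_codes, pred_codes)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Macro averaging weights every code equally, which matters when some codes are rare in the data; a micro or weighted average would be an equally reasonable choice depending on the goal.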

