Human action analysis models in artificial intelligence based proctoring systems and dataset for them
Abstract
This paper describes an approach to building a specialized model for human action analysis in AI-based proctoring systems and proposes a prototype dataset containing data specific to this application area. Rapid progress in machine learning, the wide availability of devices, and access to the Internet are driving fast growth in distance learning. In parallel with distance learning systems, AI-based proctoring systems, which analyse student work by imitating a teacher's assessment, are developing as well. However, despite advances in image processing and machine learning, the functionality of modern proctoring systems remains primitive. Their image processing focuses entirely on tracking students' faces and does not track postures or actions. Yet the assessment of physical activity is necessary not only as part of the learning process but also to keep students healthy in line with regulatory requirements, since during distance learning they spend the entire lesson in front of computers or other devices. In existing implementations, this task falls entirely on teachers, or even on the students themselves when they work through lesson materials or tests on their own. To evaluate physical activity, teachers have to establish contact through video communication systems and social media (TikTok, Instagram) and/or analyse videos of students performing particular physical activities. The lack of such functionality in AI-based proctoring systems slows down the learning process and may harm students' health in the long run.
This paper presents additional functional requirements for AI-based proctoring systems, including human action analysis to assess physical activity and to monitor compliance with hygiene rules for computer work during the educational process. For this purpose, the InternVideo foundation model was used to process and analyse students' actions. Based on it, an approach to building a specialized model for student action analysis is proposed. The approach includes two modes of evaluating student activity during distance learning: static and dynamic. The static mode (the working phase) analyses and evaluates the student's behaviour during learning and examination, where physical activity is not the main component of learning. The dynamic mode (the physical education mode) analyses and assesses a student who purposefully performs physical activity (a physical education lesson, exercise breaks for children during a lesson, etc.). A prototype dataset designed specifically for this application area is also proposed.
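
The two evaluation modes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Mode`, `Assessment`, `assess`, and `STATIC_VIOLATIONS`, as well as the action labels, are hypothetical, and the `action` argument merely stands in for the output of an action-recognition model such as one built on a video foundation model. The sketch only shows how one recognized action might be judged differently depending on the active mode.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    STATIC = "static"    # working phase: lessons and examinations
    DYNAMIC = "dynamic"  # physical education mode


@dataclass
class Assessment:
    mode: Mode
    action: str
    flagged: bool
    note: str


# Hypothetical labels for illustration; a real dataset defines its own classes.
STATIC_VIOLATIONS = {"slouching", "leaning_too_close_to_screen"}


def assess(mode: Mode, action: str,
           expected_exercise: Optional[str] = None) -> Assessment:
    """Evaluate one recognized action label under the given mode.

    `action` is assumed to come from an upstream action-recognition model.
    """
    if mode is Mode.STATIC:
        # Static mode: flag postures that break the hygiene rules.
        flagged = action in STATIC_VIOLATIONS
        note = "posture or hygiene rule violated" if flagged else "ok"
    else:
        # Dynamic mode: check that the assigned exercise is being performed.
        flagged = action != expected_exercise
        note = "exercise not matched" if flagged else "exercise performed"
    return Assessment(mode, action, flagged, note)
```

For example, `assess(Mode.STATIC, "slouching")` would be flagged as a posture violation, while `assess(Mode.DYNAMIC, "squat", "squat")` would be accepted as the assigned exercise. Keeping the mode as an explicit argument reflects the idea that the same action can be acceptable in one phase and a violation in the other.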