A survey on deep learning based face detection

The Vinh Tran; Nguyen Thi Khanh Tien; Tran Kim Thanh

doi:10.15276/aait.06.2023.15

Authors

The Vinh Tran, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam https://orcid.org/0000-0002-4241-1065
Nguyen Thi Khanh Tien, Odessa Polytechnic National University, Odessa, Ukraine https://orcid.org/0000-0001-5379-7226
Tran Kim Thanh, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam https://orcid.org/0000-0002-4241-1065

DOI:

https://doi.org/10.15276/aait.06.2023.15

Keywords:

Face detection, one-stage detector, two-stage detector, deep learning, single shot detector, multi-task cascaded convolutional neural networks, RetinaNet, YuNet

Abstract

This article surveys deep learning based face detection models, focusing on one-stage models, in order to determine how to choose an appropriate face detection model and to propose a direction for enhancing our own face detection model so that it meets the practical requirements of face-related computer vision systems. The models surveyed are the single shot detector, multi-task cascaded convolutional neural networks, RetinaNet, and YuNet, all evaluated on the Wider Face dataset. The survey consists of a structural investigation of the chosen models and experiments that evaluate their accuracy and performance. Two indicators provide the criteria for choosing a face detection model that fits given requirements: average precision for accuracy and frames per second for performance. The experimental results were analyzed and used to draw conclusions and make suggestions for future work. Our real-time applications on face-related camera systems, such as driver monitoring, supermarket security (shoplifting and disorderly-behavior warnings), and attendance systems, often require fast processing while still ensuring accuracy. The models currently applied in our system, such as Yolos, Single Shot Detector, and MobileNetv1, guarantee real-time processing, but most of them have difficulty detecting small faces in the frame and handling scenes whose context is easily mistaken for a human face. Meanwhile, the RetinaNet_ResNet50 model achieves the highest accuracy, in particular for detecting small faces in the frame, but its processing time is longer. Therefore, based on this survey, we propose enhancing the face detection model on the basis of the RetinaNet structure, with the goal of preserving accuracy while reducing processing time.
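
The two indicators named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code and not the official Wider Face protocol. It assumes that each detection has already been matched against the ground truth (for example by an intersection-over-union threshold) and labeled as a true or false positive, and it uses all-point interpolation of the precision-recall curve. The function names `average_precision` and `frames_per_second` and the `detect` callable are hypothetical, introduced only for illustration.

```python
import time


def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    Assumption: detections are already matched to ground-truth faces,
    so is_true_positive[i] says whether detection i is a correct match.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections by descending confidence score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the rectangle areas over the steps where recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def frames_per_second(detect, frames):
    """Average throughput of the hypothetical `detect` callable over `frames`."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

In this sketch a model with higher `average_precision` but lower `frames_per_second` corresponds to the trade-off the abstract describes for RetinaNet_ResNet50 against the faster models.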


Author Biographies

The Vinh Tran, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam

Doctor of Philosophy, Senior Lecturer of the Department of Information Technology, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam

Scopus ID: 288641

Nguyen Thi Khanh Tien, Odessa Polytechnic National University, Odessa, Ukraine

Doctor of Philosophy, Senior Lecturer of the Department of Information Technology (Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam), Senior Lecturer of the Department of Information System (Odessa Polytechnic National University, Odessa, Ukraine)

Tran Kim Thanh, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam

Doctor of Philosophy, Senior Lecturer of the Department of Information Technology, Ho Chi Minh City University of Transport, Ho Chi Minh City, Vietnam
