A new deep learning approach based on grayscale conversion and DWT for object detection on adversarial attacked images

Taşyürek M., Gül E.

JOURNAL OF SUPERCOMPUTING, vol.79, no.18, pp.20383-20416, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 79 Issue: 18
  • Publication Date: 2023
  • Doi Number: 10.1007/s11227-023-05456-0
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Page Numbers: pp.20383-20416
  • Keywords: Data augmentation, Deep learning, DWT, FGSM, Object detection, PGD
  • Kayseri University Affiliated: Yes


In recent years, digital images have become easy to access, and their use and distribution have increased rapidly. As a result, images can be subjected to adversarial attacks such as FGSM and PGD while they are shared or distributed. Object detection, one of the most widely used real-world applications of deep learning, scans digital images to locate instances of objects. However, CNN- and transformer-based deep learning models trained with classical data augmentation may not reach the required object detection accuracy on attacked images. To improve the robustness of deep learning-based object detection against such attacks, a deep learning approach based on grayscale conversion and the discrete wavelet transform (DWT) is proposed. The approach uses the well-known Faster R-CNN, YOLOv5, and DETR models. Its grayscale- and DWT-based data augmentation was evaluated for object detection on attacked images using a natural scene dataset containing street signs. Against an FGSM attack with ε = 0.10, the F1 scores of the Faster R-CNN, YOLOv5, and DETR models increased by 0.18, 0.10, and 0.59, respectively. Against a PGD attack with ε = 0.10, the F1 scores of the same models increased by 0.19, 0.23, and 0.63, respectively. The approach was also evaluated on an open dataset from Kaggle. In addition, both the memory size of the images processed by the models and the run times of the models decreased.
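The abstract does not spell out the augmentation pipeline, so the following is only a minimal sketch of what grayscale conversion followed by a one-level 2-D DWT could look like. Several choices here are assumptions rather than details from the paper: the Haar wavelet, the BT.601 luma weights, keeping only the approximation subband, and the function names `to_grayscale`, `haar_dwt2`, and `grayscale_dwt_augment`, which are hypothetical.

```python
def to_grayscale(rgb):
    """Convert a grid of (r, g, b) tuples to a 2-D luminance grid.

    Assumption: ITU-R BT.601 luma weights; the abstract does not say
    which grayscale formula the authors use.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def haar_dwt2(gray):
    """One-level 2-D orthonormal Haar DWT of a grid with even height and width.

    Returns (approx, (row_detail, col_detail, diag_detail)); every subband
    has half the height and half the width of the input.
    Assumption: Haar is used only because it is the simplest wavelet.
    """
    height, width = len(gray), len(gray[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for i in range(0, height, 2):
        ll, dr, dc, dd = [], [], [], []
        for j in range(0, width, 2):
            # The four pixels of the current 2 x 2 block.
            a, b = gray[i][j], gray[i][j + 1]
            c, d = gray[i + 1][j], gray[i + 1][j + 1]
            ll.append((a + b + c + d) / 2)  # low-pass in both directions
            dr.append((a + b - c - d) / 2)  # top row minus bottom row
            dc.append((a - b + c - d) / 2)  # left column minus right column
            dd.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(ll)
        row_detail.append(dr)
        col_detail.append(dc)
        diag_detail.append(dd)
    return approx, (row_detail, col_detail, diag_detail)


def grayscale_dwt_augment(rgb):
    """Return the approximation subband of the grayscale image.

    Assumption: only the low-frequency subband is kept; the abstract does
    not state which subbands the proposed approach feeds to the detectors.
    """
    approx, _details = haar_dwt2(to_grayscale(rgb))
    return approx
```

Under these assumptions, a 4 x 4 RGB image (48 values) is reduced to a 2 x 2 single-channel grid (4 values). That kind of reduction would be in line with the reported decrease in image memory size and run time, although the abstract does not confirm that this is the mechanism.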