ODRP: a new approach for spatial street sign detection from EXIF using deep learning-based object detection, distance estimation, rotation and projection system

Taşyürek M.

VISUAL COMPUTER, vol.40, no.2, pp.983-1003, 2024 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 40 Issue: 2
  • Publication Date: 2024
  • Doi Number: 10.1007/s00371-023-02827-9
  • Journal Name: VISUAL COMPUTER
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED)
  • Page Numbers: pp.983-1003
  • Keywords: Geographical information systems, Spatial detection, Street plate, Exchangeable image file, Spatial estimation with deep learning
  • Kayseri University Affiliated: Yes


Geographical information systems (GIS) are systems in which spatial data are stored and analyzed. The most important raw material in GIS is spatial data, so collecting and updating these data is essential. The exchangeable image file (EXIF) format, on the other hand, is a special file format that contains the camera direction, date-time information and GPS location provided by the digital camera that captures an image. Mapping the objects captured in EXIF data sets to absolute coordinates on the Earth contributes significantly to GIS. In this study, a new hybrid approach, ODRP, which combines object detection (O), distance estimation (D), rotation (R) and projection (P) methods, is proposed to detect street sign objects in EXIF images together with their locations. The performance of the proposed approach has been examined on natural EXIF data sets obtained from the Kayseri Metropolitan Municipality. In the proposed approach, a deep learning method first detects a street sign object in the EXIF image. Then, the distance of the object from the point where the photograph was taken is calculated. Finally, the spatial location of the detected object on the Earth is computed from the distance, direction and GPS data using rotation and projection methods. Within the proposed ODRP approach, convolutional neural network (CNN)-based Faster R-CNN, YOLO V5 and YOLO V6 models and the transformer-based DETR model are examined as the deep learning models for object detection. The F1 score is a widely used metric for evaluating deep learning models, and the proposed approaches are reviewed according to their F1 scores: the ODRP Faster R-CNN, YOLO V5, YOLO V6 and DETR approaches achieved F1 scores of 0.909, 0.956, 0.948 and 0.922, respectively. In addition, to overcome the problems of variable lighting and background clutter, an improved supervised learning (ISL) method is proposed.
Thanks to ISL, the ODRP Faster R-CNN, YOLO V5, YOLO V6 and DETR approaches reached F1 scores of 0.965, 0.985, 0.969 and 0.942, respectively. The proposed ODRP Faster R-CNN, YOLO V5, YOLO V6 and DETR approaches located the street sign object 11434.76, 12818.39, 12454.63 and 9843.57 mm closer to its true position on the Earth than the classical method, which uses the EXIF capture location directly as the object's location. Regarding time cost, ODRP Faster R-CNN, YOLO V5, YOLO V6 and DETR analyze an EXIF image in an average of 0.99, 0.42, 0.41 and 0.53 s, respectively. The run times of the ODRP YOLO V5 and V6 approaches are almost equal, and both are approximately 2.5 times faster than the ODRP Faster R-CNN method. Consequently, ODRP YOLO V5 outperforms ODRP Faster R-CNN, YOLO V6 and DETR both in detecting the spatial location of street sign objects in EXIF images and in F1 score.
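The distance-estimation step described in the abstract can be illustrated with a simple pinhole-camera model: if the physical height of a street sign is known, its distance follows from the camera's focal length and the height of the detected bounding box. This is only a minimal sketch; the function name and the example values are hypothetical, and the paper's exact estimation method may differ.

```python
def estimate_distance_m(real_height_m: float,
                        focal_length_px: float,
                        bbox_height_px: float) -> float:
    """Pinhole-camera distance estimate: distance = H * f / h.

    real_height_m   -- known physical height of the street sign (assumed)
    focal_length_px -- focal length expressed in pixels (derivable from
                       the EXIF focal length and the sensor dimensions)
    bbox_height_px  -- height of the detected bounding box in pixels
    """
    return real_height_m * focal_length_px / bbox_height_px

# Illustrative numbers only: a 0.30 m sign imaged at 30 px
# with a 1000 px focal length is estimated at 10 m.
```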
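The rotation-and-projection step, which turns the camera's GPS position, EXIF compass bearing and estimated distance into the sign's coordinates, corresponds to the forward geodesic problem. A minimal sketch on a spherical Earth model follows; the function name is an assumption, and the paper may use a different projection or a more accurate ellipsoidal model.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

def project_point(lat_deg: float, lon_deg: float,
                  bearing_deg: float, distance_m: float):
    """Project a point distance_m metres from (lat_deg, lon_deg)
    along bearing_deg (clockwise from true north) on a sphere."""
    lat1, lon1 = math.radians(lat_deg), math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance

    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lon2 = lon1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)
```

For example, projecting 100 m due north from a point in Kayseri shifts the latitude by roughly 0.0009 degrees while leaving the longitude unchanged.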
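For reference, the F1 scores reported above are the harmonic mean of precision and recall. The precision/recall values in the usage comment below are illustrative, not figures from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Illustrative only: precision 0.98 and recall 0.99 give F1 of about 0.985,
# in the range of the ISL-improved ODRP YOLO V5 result.
```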