A comprehensive survey on optimizing deep learning models by metaheuristics


AKAY B., KARABOĞA D., AKAY R.

Artificial Intelligence Review, vol.55, no.2, pp.829-894, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 55 Issue: 2
  • Publication Date: 2022
  • DOI: 10.1007/s10462-021-09992-0
  • Journal Name: Artificial Intelligence Review
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, Educational research abstracts (ERA), Index Islamicus, INSPEC, Library and Information Science Abstracts, Library, Information Science & Technology Abstracts (LISTA), Metadex, Psycinfo, zbMATH, Civil Engineering Abstracts
  • Pages: pp.829-894
  • Keywords: Deep neural networks, Metaheuristics, Training, Hyper-parameter optimization, Architecture optimization, Feature extraction, CONVOLUTIONAL NEURAL-NETWORKS, BELIEF NETWORKS, HYPERPARAMETER OPTIMIZATION, HYPER-PARAMETERS, RECOGNITION, ARCHITECTURES, ALGORITHM, CLASSIFICATION, ENSEMBLE, SEARCH
  • Affiliated with Kayseri University: No

Abstract

© 2021, The Author(s), under exclusive licence to Springer Nature B.V. Deep neural networks (DNNs), which are extensions of artificial neural networks, can learn higher levels of a feature hierarchy built from lower-level features by transforming the raw feature space into another, more complex feature space. Although deep networks are successful in a wide range of problems in different fields, several issues affect their overall performance, such as selecting appropriate values for model parameters, deciding on the optimal architecture and feature representation, and determining optimal weight and bias values. Recently, metaheuristic algorithms have been proposed to automate these tasks. This survey gives brief information about common basic DNN architectures, including convolutional neural networks, unsupervised pre-trained models, recurrent neural networks and recursive neural networks. We formulate the optimization problems in DNN design, such as architecture optimization, hyper-parameter optimization, training, and feature representation level optimization. The encoding schemes used in metaheuristics to represent network architectures are categorized. The evolutionary and selection operators, as well as speed-up methods, are summarized, and the main approaches to validating the results of networks designed by metaheuristics are provided. Moreover, we group the studies on metaheuristics for deep neural networks by the problem type considered and present the datasets most commonly used in these studies. We discuss the pros and cons of utilizing metaheuristics in the deep learning field and give some future directions for connecting metaheuristics and deep learning. To the best of our knowledge, this is the most comprehensive survey about metaheuristics used in the deep learning field.
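As an illustration of the kind of hyper-parameter optimization the abstract describes, the following is a minimal sketch and not a method taken from the survey. It assumes a hypothetical three-gene integer encoding (`SEARCH_SPACE`), a made-up `surrogate_loss` that stands in for the validation loss of a trained network, and a simple elitist (mu + lambda) evolutionary loop; all names and values are illustrative assumptions.

```python
import math
import random

# Hypothetical search space (an assumption for illustration):
# each gene is an index into one list of candidate values.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1],
    "num_layers": [2, 4, 6, 8],
    "filters": [16, 32, 64, 128],
}
KEYS = list(SEARCH_SPACE)


def decode(genome):
    """Map an integer genome to a concrete hyper-parameter setting."""
    return {key: SEARCH_SPACE[key][gene] for key, gene in zip(KEYS, genome)}


def surrogate_loss(params):
    """Stand-in for validation loss; a real study would train the network.

    The optimum (learning_rate=1e-2, num_layers=6, filters=64) is arbitrary.
    """
    return (
        abs(math.log10(params["learning_rate"]) + 2)
        + abs(params["num_layers"] - 6) / 2
        + abs(math.log2(params["filters"]) - 6)
    )


def fitness(genome):
    """Lower is better: loss of the decoded hyper-parameters."""
    return surrogate_loss(decode(genome))


def mutate(genome, rng, rate=0.3):
    """Resample each gene uniformly with probability `rate`."""
    return [
        rng.randrange(len(SEARCH_SPACE[key])) if rng.random() < rate else gene
        for key, gene in zip(KEYS, genome)
    ]


def evolve(generations=30, pop_size=8, seed=0):
    """Elitist (mu + lambda) search; returns (best_params, best_loss)."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(len(SEARCH_SPACE[key])) for key in KEYS]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        offspring = [mutate(genome, rng) for genome in population]
        # Selection: keep the best pop_size individuals of parents + offspring.
        population = sorted(population + offspring, key=fitness)[:pop_size]
    best = min(population, key=fitness)
    return decode(best), fitness(best)


if __name__ == "__main__":
    best_params, best_loss = evolve()
    print(best_params, best_loss)
```

In practice the fitness evaluation dominates the cost, because each candidate requires training a network, which is why the abstract mentions speed-up methods alongside the evolutionary and selection operators.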