Singer identification model using data augmentation and enhanced feature conversion with hybrid feature vector and machine learning


Creative Commons License

Hızlısoy S., Arslan R. S., Çolakoğlu E.

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, vol.2024, no.1, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 2024, Issue: 1
  • Publication Date: 2024
  • DOI: 10.1186/s13636-024-00336-8
  • Journal Name: EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, Music Index, Directory of Open Access Journals, Civil Engineering Abstracts
  • Kayseri University Affiliated: Yes

Abstract

Song analysis is being investigated to support various operations on music access platforms. Foremost among the problems it raises is identifying the person who sings a song. This study proposes a singer identification application that is built on a dataset of Turkish singers and works for the Turkish language. Mel-spectrogram and octave-based spectral contrast values are extracted from the songs and combined into a hybrid feature vector. This addresses problem-specific issues such as capturing the differences between the singers' voices and reducing the effect of year and album differences on the result. Tests and systematic evaluations show that the model achieves a certain level of success in identifying the singer of a song and that it remains stable against changes in singing style and song structure. The results were analyzed on a database of 9 singers and 180 songs. An accuracy of 89.4% was obtained using PCA-based reduction of the feature vector, normalization of the data, and the Extra Trees classifier. Precision, recall and F-score values were 89.9%, 89.4% and 89.5%, respectively.
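
The processing chain named in the abstract (hybrid feature vector, normalization, PCA, Extra Trees) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the use of librosa and scikit-learn, the time-averaging of each feature, the order of scaling before PCA, and every parameter value (`n_mels`, `n_components`, `n_trees`) are assumptions, and the random data only stands in for real songs so that the sketch runs without audio files.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def hybrid_features(path, sr=22050, n_mels=128):
    """Hypothetical helper: one hybrid vector per audio file.

    Concatenates the time-averaged log mel-spectrogram with the
    time-averaged octave-based spectral contrast.
    """
    import librosa  # imported lazily; only needed when real audio is processed

    y, sr = librosa.load(path, sr=sr)
    mel = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    )
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr)
    return np.concatenate([mel.mean(axis=1), contrast.mean(axis=1)])


def build_classifier(n_components=20, n_trees=200, seed=0):
    """Normalization, PCA reduction, then Extra Trees (assumed order and sizes)."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=seed),
        ExtraTreesClassifier(n_estimators=n_trees, random_state=seed),
    )


# Synthetic stand-in for the 9-singer, 180-song database (20 songs per singer
# is an assumption). 135 = 128 mel bands + 7 spectral contrast values.
rng = np.random.default_rng(0)
n_singers, songs_per_singer, n_features = 9, 20, 135
labels = np.repeat(np.arange(n_singers), songs_per_singer)
centers = rng.normal(size=(n_singers, n_features))
X = centers[labels] + 0.5 * rng.normal(size=(labels.size, n_features))

model = build_classifier().fit(X, labels)
predictions = model.predict(X)
```

With real recordings, `X` would instead be built by stacking `hybrid_features(path)` for each song, and the model would be scored on held-out songs rather than on the training data used here.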