Singer identification model using data augmentation and enhanced feature conversion with hybrid feature vector and machine learning

Hızlısoy, Serhat; Arslan, Recep; Çolakoğlu, Emel

doi:10.1186/s13636-024-00336-8

Hızlısoy S., Arslan R. S., Çolakoğlu E.

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, vol.2024, no.1, 2024 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 2024 Issue: 1
Publication Date: 2024
DOI: 10.1186/s13636-024-00336-8
Journal Name: EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, Music Index, Directory of Open Access Journals, Civil Engineering Abstracts
Affiliated with Kayseri University: Yes

Abstract

Song analysis is an active research problem that supports various operations on music access platforms. One of the first of these problems is identifying the person who sings a song. This study proposes a singer identification application that is built on a set of Turkish singers and works for Turkish-language songs. Mel-spectrogram and octave-based spectral contrast values are extracted from the songs and combined into a hybrid feature vector. This addresses problem-specific issues such as distinguishing between the singers' voices and reducing the effect of year and album differences on the result. Tests and systematic evaluations show that the approach identifies the singer of a song with a reasonable level of success and remains stable under changes in singing style and song structure. The results were evaluated on a database of 9 singers and 180 songs. An accuracy of 89.4% was obtained by reducing the feature vector with PCA, normalizing the data, and using the Extra Trees classifier. Precision, recall, and F-score were 89.9%, 89.4%, and 89.5%, respectively.
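The classification stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses scikit-learn, replaces real audio features with synthetic vectors so that it runs without any audio files, and assumes the feature sizes (128 mel bands, 7 spectral contrast values), the order of normalization and PCA, the number of PCA components, the train/test split, and weighted averaging of the metrics. None of these details are stated in the abstract. With real songs, the two feature groups could be computed with, for example, `librosa.feature.melspectrogram` and `librosa.feature.spectral_contrast` and summarized per song before concatenation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

N_SINGERS, SONGS_PER_SINGER = 9, 20  # 9 singers, 180 songs, as in the abstract
N_MEL, N_CONTRAST = 128, 7           # assumed feature sizes, not from the paper


def synthetic_hybrid_features(rng):
    """Hypothetical stand-in for audio features: one hybrid vector per song."""
    X, y = [], []
    for singer in range(N_SINGERS):
        # Each synthetic singer gets its own mean for both feature groups.
        mel_mean = rng.normal(size=N_MEL)
        contrast_mean = rng.normal(size=N_CONTRAST)
        for _ in range(SONGS_PER_SINGER):
            mel = mel_mean + rng.normal(size=N_MEL)
            contrast = contrast_mean + rng.normal(size=N_CONTRAST)
            # Hybrid feature vector: concatenate the two feature groups.
            X.append(np.concatenate([mel, contrast]))
            y.append(singer)
    return np.array(X), np.array(y)


rng = np.random.default_rng(0)
X, y = synthetic_hybrid_features(rng)

# Assumed split: 75% training, 25% testing, stratified by singer.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed order: normalize, reduce with PCA, then classify with Extra Trees.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),  # assumed number of components
    ExtraTreesClassifier(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision, recall, f_score, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f-score={f_score:.3f}")
```

The numbers printed here describe only the synthetic data and say nothing about the reported results. One point the sketch does make visible: with weighted averaging, recall always equals accuracy, which is consistent with the matching 89.4% values in the abstract, although the abstract does not say which averaging was used.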