Text independent speaker recognition based on MFCC and machine learning

Creative Commons License

Hızlısoy S., Arslan R. S.

SELCUK UNIVERSITY JOURNAL OF ENGINEERING SCIENCES, vol.20, no.3, pp.73-78, 2021 (Peer-Reviewed Journal)


Speaker recognition (SR) is the process of identifying a person's voice from a set of speech samples using artificial intelligence. SR models are used in various voice-based security platforms and authentication tasks. In this paper, a text-independent speaker recognition model was developed for a set of 60 different speakers. Extracting the distinctive features of speakers' utterances is a key step in the model design phase. In this study, the MFCC (Mel-frequency cepstral coefficients) algorithm, the most common method for obtaining short-time features, is used to extract features from speech signals. The classification performance of the proposed model and of 11 commonly used machine learning methods was evaluated on the Audio-MNIST dataset, and the results are presented comparatively. A classification rate of 97.1% was achieved with the SVM classifier. In addition, the precision, recall, and F-score values are 98.0%, 97.1%, and 97.4%, respectively. The results show that the proposed model performs well across all classes and is broadly applicable to different types of speaker datasets.
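The MFCC feature extraction step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length (25 ms), hop (10 ms), pre-emphasis factor (0.97), filter count (26), and number of coefficients (13) are conventional values assumed here for illustration, and the function names are hypothetical. The sketch frames the signal, computes a power spectrum, applies a triangular mel filterbank, takes the log, and applies a DCT-II.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate, n_mfcc=13, n_filters=26, frame_ms=25.0, hop_ms=10.0):
    """Return an array of shape (n_frames, n_mfcc). All defaults are assumed values."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two >= frame_len
    if len(signal) < frame_len:                # pad very short clips to one frame
        signal = np.pad(signal, (0, frame_len - len(signal)))
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Log mel-band energies (small constant avoids log(0)).
    log_energies = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # Unnormalized DCT-II decorrelates the log energies; keep the first n_mfcc terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n[None, :] + 1) / (2.0 * n_filters))
    return log_energies @ basis.T

def utterance_vector(signal, sample_rate):
    # One fixed-length vector per utterance: the mean MFCC over all frames (an assumed pooling choice).
    return mfcc(signal, sample_rate).mean(axis=0)
```

Under these assumptions, each recording yields one 13-dimensional vector that could be passed, together with its speaker label, to a classifier such as an SVM. Mean pooling over frames is only one simple way to obtain a fixed-length input and may differ from what the paper uses.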