Performance of Speaker Verification Using CSM and TM
DOI: https://doi.org/10.51983/ajcst-2018.7.2.1866

Keywords: Autoassociative Neural Network (AANN), Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP), Close Speaking Microphone (CSM), Throat Microphone (TM)

Abstract
In this paper, we present the performance of a speaker verification system based on features computed from speech recorded with a Close Speaking Microphone (CSM) and a Throat Microphone (TM) in clean and noisy environments. Noise is one of the most challenging problems in speaker verification, and background noise degrades the performance of verification with the CSM. To overcome this, a TM is used: its transducer is held against the throat, producing a clean signal that is largely unaffected by background noise. Acoustic features are computed by means of Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP). An autoassociative neural network (AANN) is then used to model these features and verify speakers in both clean and noisy environments. A new method is presented for verifying speakers from clean speech using the combined CSM and TM signals. The verification performance of the proposed combined system is significantly better than that of the CSM alone owing to the complementary nature of the two devices. Evaluating the false acceptance rate (FAR) and false rejection rate (FRR) yields an equal error rate (EER) of about 1.0% for the combined devices (CSM+TM), corresponding to an overall verification performance of 99% on clean speech.
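To make the pipeline concrete, the sketch below illustrates the verification back end described above. It assumes a placeholder rasta_plp() front end (any RASTA-PLP implementation can be substituted) and approximates the AANN with a bottleneck autoencoder from scikit-learn; the layer sizes, score mapping, and threshold sweep are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch: per-speaker AANN verification on RASTA-PLP features.
# The RASTA-PLP front end is a placeholder; the AANN is approximated by a
# bottleneck autoencoder (MLPRegressor trained to reconstruct its input).
import numpy as np
from sklearn.neural_network import MLPRegressor


def rasta_plp(signal, sample_rate):
    """Placeholder: return a (frames x coefficients) RASTA-PLP feature matrix."""
    raise NotImplementedError("plug in a RASTA-PLP implementation here")


def train_speaker_model(train_features):
    # Autoassociative network: the input is mapped back to itself through a
    # narrow bottleneck, so the network captures the speaker-specific
    # distribution of the feature vectors.
    aann = MLPRegressor(hidden_layer_sizes=(38, 4, 38), activation="tanh",
                        max_iter=2000, random_state=0)
    aann.fit(train_features, train_features)
    return aann


def verification_score(aann, test_features):
    # Frame-wise reconstruction error; a genuine speaker reconstructs well.
    err = np.mean((aann.predict(test_features) - test_features) ** 2, axis=1)
    return float(np.mean(np.exp(-err)))  # map error to a confidence in (0, 1]


def equal_error_rate(genuine_scores, impostor_scores):
    # Sweep the decision threshold and report the point where FAR ~= FRR.
    genuine_scores = np.asarray(genuine_scores)
    impostor_scores = np.asarray(impostor_scores)
    best_gap, eer = np.inf, 1.0
    for t in np.sort(np.concatenate([genuine_scores, impostor_scores])):
        far = np.mean(impostor_scores >= t)  # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)    # genuine speakers wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer
```

Under this sketch, separate models would be trained on CSM and TM features, and the combined CSM+TM system could be realized by fusing (for example, averaging) the two device scores before thresholding.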
License
Copyright (c) 2018 The Research Publication
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.