Title :
Automatic detection of voice impairments due to vocal misuse by means of Gaussian mixture models
Author :
Godino-Llorente, Juan I. ; Aguilera-Navarro, Santiago ; Gomez-Vilda, Pedro
Author_Institution :
Lab. de Tecnologia de Rehabilitacion, Univ. Politecnica de Madrid, Spain
Abstract :
The modern way of life increases the risk of vocal and voice diseases. Most of these diseases are known to change the acoustic voice signal, and they have to be diagnosed and treated at an early stage. Acoustic analysis, a technique based on digital processing of the speech signal, could be a useful tool for diagnosing them. It offers several advantages: it is non-invasive, it provides an objective diagnosis, and it can be used to evaluate surgical and pharmacological treatments and rehabilitation processes. ENT clinicians use acoustic voice analysis to characterise pathological voices. In this paper, we study a classification approach that is well known in speaker recognition and identification, applied to the automatic detection of voice disorders. Previous and current work demonstrates that impaired voice detection can be carried out by means of supervised neural nets, namely the multilayer perceptron. We focus on detecting impaired voices by means of Gaussian mixture models and parameters such as mel frequency cepstral coefficients extracted from the windowed voice signal.
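The detection scheme named in the abstract and keywords (Gaussian mixture models scored with a likelihood ratio test) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each recording has already been converted to a list of per-frame feature vectors (for example mel frequency cepstral coefficients), that one diagonal-covariance GMM has been trained per class, and that the decision threshold is 0. The function names, the two-dimensional toy models, and the threshold are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def log_likelihood_ratio(frames, gmm_impaired, gmm_normal):
    """Average per-frame log-likelihood ratio over one recording."""
    total = sum(
        gmm_log_likelihood(f, gmm_impaired) - gmm_log_likelihood(f, gmm_normal)
        for f in frames
    )
    return total / len(frames)

def is_impaired(frames, gmm_impaired, gmm_normal, threshold=0.0):
    """Likelihood ratio test: flag the recording if the ratio exceeds threshold."""
    return log_likelihood_ratio(frames, gmm_impaired, gmm_normal) > threshold

# Hypothetical toy models in a 2-dimensional feature space (assumed values,
# not taken from the paper); real models would be trained on labelled data.
GMM_NORMAL = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])]
GMM_IMPAIRED = [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])]

if __name__ == "__main__":
    recording = [[4.4, 4.6], [4.7, 4.3], [4.5, 4.5]]  # made-up feature frames
    print(is_impaired(recording, GMM_IMPAIRED, GMM_NORMAL))
```

In practice the threshold would be tuned on held-out recordings to trade off missed detections against false alarms, and the number of mixture components and feature dimensions would be chosen experimentally.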
Keywords :
Gaussian distribution; cepstral analysis; fast Fourier transforms; feature extraction; learning (artificial intelligence); maximum likelihood estimation; medical signal processing; multilayer perceptrons; speech; speech processing; Gaussian mixture models; acoustic voice analysis; automatic detection; feature extraction; learning algorithm; likelihood ratio test; mel frequency coefficients; multilayer perceptron; parameter representation; pathological voices; probability of correct detection; short-time FFT; speaker identification; speaker recognition; supervised neural nets; temporal order; vocal diseases; vocal misuse; voice impairments; windowed voice signal; Acoustic signal detection; Diseases; Neural networks; Pathology; Signal analysis; Signal processing; Speaker recognition; Speech analysis; Speech processing; Surgery;
Conference_Title :
Engineering in Medicine and Biology Society, 2001. Proceedings of the 23rd Annual International Conference of the IEEE
Print_ISBN :
0-7803-7211-5
DOI :
10.1109/IEMBS.2001.1020549