Author/Authors :
Wu, Yunfeng School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Chen, Pinnan School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Yao, Yuchen School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Ye, Xiaoquan School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Xiao, Yugui School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Liao, Lifang School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Wu, Meihong School of Information Science and Technology - Xiamen University - Xiamen - Fujian, China , Chen, Jian Department of Rehabilitation - Zhongshan Hospital - Xiamen University - Xiamen - Fujian, China
Abstract :
Analysis of quantified voice patterns is useful in the detection and assessment of dysphonia and related phonation disorders. In
this paper, we first study the linear correlations between 22 voice parameters of fundamental frequency variability, amplitude
variations, and nonlinear measures. The highly correlated vocal parameters are combined by using the linear discriminant analysis
method. Based on the probability density functions estimated by the Parzen-window technique, we propose an interclass probability
risk (ICPR) method that selects the vocal parameters with small ICPR values as dominant features, and we compare it with the modified
Kullback-Leibler divergence (MKLD) feature selection approach. The experimental results show that the generalized logistic
regression analysis (GLRA), support vector machine (SVM), and Bagging ensemble algorithm, when given the ICPR features as input,
provide better classification results than the same classifiers with the MKLD-selected features. The SVM is much better at
distinguishing normal vocal patterns, with a specificity of 0.8542. Among the three classification methods, the Bagging ensemble
algorithm with ICPR features correctly identifies 90.77% of the vocal patterns, with the highest sensitivity of 0.9796 and the largest
area of 0.9558 under the receiver operating characteristic curve. The classification results demonstrate the effectiveness of our feature
selection and pattern analysis methods for dysphonic voice detection and measurement.
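
The feature selection idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the ICPR formula, so the code below assumes a stand-in criterion: the area under the pointwise minimum of two class-conditional densities, each estimated with a Gaussian Parzen window. The function names, the window width h, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def parzen_pdf(samples, h):
    """Return a Gaussian Parzen-window density estimate for 1-D samples."""
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Average of Gaussian kernels of width h centred on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return pdf


def overlap_risk(class_a, class_b, h=0.5, grid_points=2001):
    """Assumed stand-in for an interclass risk: area under min(p_a, p_b).

    A small value means the two class densities barely overlap, so the
    parameter separates the classes well. This is NOT the paper's ICPR
    definition, which the abstract does not state.
    """
    pdf_a = parzen_pdf(class_a, h)
    pdf_b = parzen_pdf(class_b, h)
    lo = min(min(class_a), min(class_b)) - 5.0 * h
    hi = max(max(class_a), max(class_b)) + 5.0 * h
    step = (hi - lo) / (grid_points - 1)
    values = [
        min(pdf_a(lo + i * step), pdf_b(lo + i * step))
        for i in range(grid_points)
    ]
    # Trapezoidal rule over the evaluation grid.
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


# Made-up values of two hypothetical voice parameters for two classes.
well_separated = overlap_risk([0.0, 0.2, 0.4, 0.1], [5.0, 5.3, 4.8, 5.1])
overlapping = overlap_risk([0.0, 0.2, 0.4, 0.1], [0.1, 0.3, 0.2, 0.5])

# Under this assumed criterion, the parameter with the smaller risk
# would be preferred as a dominant feature.
print(well_separated < overlapping)
```

Under this assumption, ranking the parameters by ascending risk and keeping the lowest-ranked ones mirrors the "small ICPR values as dominant features" rule described above.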
Keywords :
Analysis, Minimum, Selection, Risk