Title :
Effectiveness of multiscale fractal dimension for improvement of frame classification rate
Author :
Mohammadi Zaki;Nirmesh J. Shah;Hemant A. Patil
Author_Institution :
Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar - 382007, India
Abstract :
We propose multiscale fractal dimension (FD)-based features for frame-level phoneme classification. During speech production, turbulence is created, and the resulting vortices (generated by separated airflow) may travel along the vocal tract and excite the vocal tract resonators. This turbulence, and in effect the embedded features of different phoneme classes, can be captured by the invariant property of multiscale FD. To capture complementary information, feature-level fusion of the proposed feature with state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) is attempted and found to be effective. In particular, single-hidden-layer neural networks were trained to compute the frame classification rate. The proposed feature reduced the error rate by over 1.6 % compared with MFCC features on the TIMIT database. This is supported by a significant reduction in % EER (i.e., 0.327 % to 4.795 %).
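The abstract does not spell out how the multiscale FD is computed or how it is fused with MFCC, so the following is only a minimal sketch under stated assumptions. It uses the Minkowski-Bouligand cover method (repeated flat dilation and erosion of the frame), which is one common way to estimate a multiscale FD and may differ from the authors' procedure. The function names, the 400-sample frame, max_scale=10, window=3 and the 39-dimensional MFCC placeholder are all hypothetical choices made for illustration.

```python
import numpy as np

def cover_areas(frame, max_scale):
    """Area between the dilated and eroded frame at scales 1..max_scale.

    Assumption: a flat 3-sample structuring element applied s times,
    which equals one flat element of width 2*s + 1.
    """
    x = np.asarray(frame, dtype=float)
    upper, lower = x.copy(), x.copy()
    areas = np.empty(max_scale)
    for s in range(max_scale):
        up = np.pad(upper, 1, mode="edge")
        lo = np.pad(lower, 1, mode="edge")
        upper = np.maximum(np.maximum(up[:-2], up[1:-1]), up[2:])  # dilation
        lower = np.minimum(np.minimum(lo[:-2], lo[1:-1]), lo[2:])  # erosion
        areas[s] = np.sum(upper - lower)
    return areas

def multiscale_fd(frame, max_scale=10, window=3):
    """Local FD estimates: slope of log(A(s)/s^2) against log(1/s),
    fitted over a sliding window of consecutive scales."""
    scales = np.arange(1, max_scale + 1, dtype=float)
    areas = cover_areas(frame, max_scale)
    log_inv_s = np.log(1.0 / scales)
    # Tiny constant guards against log(0) for a perfectly flat frame.
    log_a = np.log(areas / scales**2 + 1e-12)
    fds = []
    for start in range(max_scale - window + 1):
        sl = slice(start, start + window)
        fds.append(np.polyfit(log_inv_s[sl], log_a[sl], 1)[0])
    return np.array(fds)

def fuse_features(mfcc_vec, fd_vec):
    """Feature-level fusion, taken here to mean plain concatenation."""
    return np.concatenate([mfcc_vec, fd_vec])

# Synthetic frames (not speech): a smooth ramp and white noise.
rng = np.random.default_rng(0)
ramp = np.linspace(0.0, 1.0, 400)
noise = rng.standard_normal(400)
fd_ramp = multiscale_fd(ramp)
fd_noise = multiscale_fd(noise)

# Placeholder MFCC vector; a real one would come from a speech front end.
mfcc_placeholder = np.zeros(39)
fused = fuse_features(mfcc_placeholder, fd_noise)
```

On these synthetic inputs the smooth ramp yields FD estimates close to 1 and the noise frame yields clearly larger values, which is the kind of contrast the feature relies on to separate turbulent from non-turbulent sounds. The fused vector would then be the per-frame input to a single-hidden-layer classifier.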
Keywords :
"Fractals","Speech","Mel frequency cepstral coefficient","Production","Databases","Neural networks","Europe"
Conference_Title :
Signal Processing Conference (EUSIPCO), 2015 23rd European
Electronic_ISBN :
2076-1465
DOI :
10.1109/EUSIPCO.2015.7362537