Title :
Acoustic-phonetic analysis of fricatives for classification using SVM based algorithm
Author :
Frid, Alex ; Lavner, Yizhar
Author_Institution :
Dept. of Comput. Sci., Tel-Hai Coll., Tel-Hai, Israel
Abstract :
Classification of phonemes is the process of assigning a phonetic category to a short section of a speech signal. It is a key stage in applications such as spoken term detection, continuous speech recognition, and music-to-lyrics synchronization, but it can also be useful on its own, for example in the professional music industry and in applications for the hearing impaired. In this study we present an effective algorithm for classifying one group of phonemes, the unvoiced fricatives, which are characterized by a relatively large amount of spectral energy in the high-frequency range. Distinguishing individual phonemes within this group is fairly difficult because their acoustic-phonetic characteristics are quite similar. A three-stage algorithm is used to classify the unvoiced fricatives. In the first, preprocessing stage, each phoneme segment is divided into consecutive non-overlapping short windowed frames, each of which is represented by a 15-dimensional feature vector. In the second stage a support vector machine (SVM) is trained, using a radial basis kernel function and an automatic grid search to optimize the SVM parameters. In the third, classification stage, a tree-based algorithm is used: the phonemes are first classified into two subgroups according to their articulation, sibilants (/s/ and /sh/) and nonsibilants (/f/ and /th/), and each subgroup is then further classified using another SVM. To evaluate the performance of the algorithm we used more than 11000 phonemes extracted from the TIMIT speech database. Using a majority vote over the feature vectors of the same phoneme, an overall accuracy of 85% is obtained (91% for the subset /s/, /sh/ and /f/). These results are comparable to, and somewhat better than, those achieved in other studies. The efficiency and robustness of the algorithm make it suitable for implementation in real-time applications for the hearing impaired or in recording studios.
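The flow described in the abstract (non-overlapping framing, a two-level decision tree, and a majority vote per phoneme) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy two-element feature tuples, and the threshold rules standing in for the trained SVMs are all hypothetical, and the 15-dimensional feature extraction is not reproduced here.

```python
from collections import Counter
from typing import Callable, List, Sequence

Features = Sequence[float]


def split_frames(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut a segment into consecutive non-overlapping frames.

    Assumption: a trailing partial frame is dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def classify_frame(
    features: Features,
    is_sibilant: Callable[[Features], bool],
    sibilant_clf: Callable[[Features], str],
    nonsibilant_clf: Callable[[Features], str],
) -> str:
    """Two-level tree: pick the subgroup first, then the phoneme within it."""
    if is_sibilant(features):
        return sibilant_clf(features)      # /s/ vs /sh/
    return nonsibilant_clf(features)       # /f/ vs /th/


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent frame-level label."""
    return Counter(labels).most_common(1)[0][0]


def classify_phoneme(frame_features, is_sibilant, sibilant_clf, nonsibilant_clf) -> str:
    """Label every frame of one phoneme, then take the majority vote."""
    labels = [
        classify_frame(f, is_sibilant, sibilant_clf, nonsibilant_clf)
        for f in frame_features
    ]
    return majority_vote(labels)


# Hypothetical stand-ins for the trained classifiers (simple threshold rules).
toy_is_sibilant = lambda f: f[0] > 0.5
toy_sibilant_clf = lambda f: "s" if f[1] > 0 else "sh"
toy_nonsibilant_clf = lambda f: "f" if f[1] > 0 else "th"

toy_frames = [(0.9, 1.0), (0.8, 0.5), (0.2, 1.0)]  # made-up feature vectors
result = classify_phoneme(
    toy_frames, toy_is_sibilant, toy_sibilant_clf, toy_nonsibilant_clf
)
```

In a real system each callable would wrap a trained classifier (for example an RBF-kernel SVM tuned by grid search), and each frame would be mapped to its feature vector before classification. Whether the vote is taken once over final labels, as here, or separately at each tree level is a design choice the abstract does not specify.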
Keywords :
acoustic signal processing; radial basis function networks; signal classification; spectral analysis; speech processing; speech recognition; support vector machines; trees (mathematics); SVM based algorithm; SVM parameter; TIMIT speech database; acoustic-phonetic analysis; acoustic-phonetic characteristics; articulation sibilants; automatic grid search; classification stage; continuous speech recognition; feature vectors; hearing impaired; high frequency range; music to lyrics synchronization; nonsibilants; performance evaluation; phoneme segment; phonemes classification; phonetic category; professional music industry; radial basis kernel function; recording studios; spectral energy; speech signal; spoken term detection; support vector machine; three-stage classification algorithm; tree-based algorithm; unvoiced fricatives; Classification algorithms; Feature extraction; Speech; Support vector machine classification; Testing; Training;
Conference_Title :
Electrical and Electronics Engineers in Israel (IEEEI), 2010 IEEE 26th Convention of
Conference_Location :
Eilat
Print_ISBN :
978-1-4244-8681-6
DOI :
10.1109/EEEI.2010.5662110