Title :
Using probabilistic characteristic vector based on both phonetic and prosodic features for language identification
Author :
Hosseini Amereei, S.A.; Homayounpour, M.M.
Author_Institution :
Lab. for Intell. Sound & Speech Process., Amirkabir Univ. of Technol., Tehran, Iran
Abstract :
Language identification (LID) is an important task in the indexing of audio signals. This paper introduces an LID system with a generative frontend based on both phonetic and prosodic features. The generative frontend is built upon an ensemble of Gaussian densities. Half of these Gaussian densities are trained to represent elementary speech sound units, and the others are trained to represent prosodic properties; together they characterize a wide variety of languages. Shifted Delta Cepstral (SDC) and Pitch Contour Polynomial Approximation (PCPA) are used as features. The backend classifier is a Support Vector Machine (SVM). Several language identification experiments were conducted, and the proposed improvements were evaluated using the OGI-MLTS corpus. Using SVM with Generalized Linear Discriminant Analysis (GLDS) and Probabilistic Sequence Kernel (PSK) outperforms GMM when all systems are based on PCPA, improving LID performance by about 2.1% and 5.9%, respectively. Furthermore, an improvement of about 4% was achieved by combining phonetic and prosodic features in our four-language identification experiments.
Keywords :
Gaussian processes; natural language processing; polynomial approximation; probability; speech recognition; support vector machines; Gaussian densities; OGI-MLTS corpus; audio signal indexing; elementary speech sound units representation; generalized linear discriminant analysis; language identification; phonetic features; pitch contour polynomial approximation; probabilistic characteristic vector; probabilistic sequence kernel; prosodic features; shifted delta cepstral; support vector machine; Approximation methods; Kernel; Phase shift keying; Polynomials; Probabilistic logic; Speech; Support vector machines; APRLM; GPRLM; Language Identification; Pitch Contour Polynomial Approximation; Probabilistic Sequence Kernel; Support Vector Machine;
Conference_Title :
Telecommunications (IST), 2010 5th International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-8183-5
DOI :
10.1109/ISTEL.2010.5734122