DocumentCode
190157
Title
Experiments on automatic language identification for Philippine languages using acoustic Gaussian Mixture Models
Author
Laguna, Ann Franchesca ; Guevara, Rowena Cristina
Author_Institution
Digital Signal Process. Lab., Univ. of the Philippines, Diliman, Quezon City, Philippines
fYear
2014
fDate
14-16 April 2014
Firstpage
657
Lastpage
662
Abstract
No language identification (LID) system for Philippine languages has previously been built, owing to the limited amount of recorded speech data. This work begins LID research on Philippine languages using the Philippine Language Database (PLD) collected by the Digital Signal Processing Laboratory of the University of the Philippines Diliman (DSP-UPD). Mel Frequency Cepstral Coefficient (MFCC), Perceptual Linear Prediction (PLP), Shifted Delta Cepstra (SDC), and Linear Predictive Cepstral Coefficient (LPCC) features are extracted from the speech segments. Gaussian Mixture Models (GMMs), trained with Expectation Maximization (EM) and with a Universal Background Model (UBM) approach, are used to model the acoustic characteristics of each language. The Maximum a Posteriori (MAP) probability is then used to determine the language of a speech utterance based on the language GMMs. PLP features with a 16-mixture GMM-EM were found to give the best performance among the four feature types in discriminating the languages.
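The decision step described in the abstract (score an utterance against one GMM per language, then pick the language with the highest posterior) can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal-covariance assumption, the uniform default priors, the toy 2-D feature frames, the hand-set model parameters, and the names `lang_a`, `lang_b`, `log_gauss_diag`, `gmm_log_likelihood`, and `identify_language` are all hypothetical, and EM training and UBM adaptation are omitted.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log sum_k w_k * N(x | mean_k, var_k).

    `gmm` is a list of (weight, mean, variance) tuples, one per mixture.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

def identify_language(frames, language_gmms, priors=None):
    """Return (best_language, scores) using log-likelihood plus log-prior."""
    scores = {}
    for lang, gmm in language_gmms.items():
        # Assumption: uniform priors when none are supplied.
        prior = priors[lang] if priors else 1.0 / len(language_gmms)
        scores[lang] = gmm_log_likelihood(frames, gmm) + math.log(prior)
    return max(scores, key=scores.get), scores

# Hypothetical two-mixture models over toy 2-D features (not real PLP/MFCC data).
language_gmms = {
    "lang_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "lang_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best, scores = identify_language(frames, language_gmms)
```

With uniform priors the MAP rule reduces to choosing the model with the highest utterance log-likelihood; in a real system the per-language models would be trained on extracted features rather than set by hand.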
Keywords
Gaussian processes; acoustic signal processing; audio databases; cepstral analysis; expectation-maximisation algorithm; mixture models; natural language processing; probability; speech processing; DSP-UPD; EM; GMM; LPCC features; MAP probability; MFCC; Mel frequency cepstral coefficients; PLD; PLP; Philippine LID system; Philippine Language Database; Philippine languages; SDC; UBM approach; University of the Philippines Diliman; acoustic Gaussian mixture models; automatic language identification; digital signal processing laboratory; expectation maximization; feature vectors; language acoustic characteristics; linear predictive cepstral coefficients; maximum a posteriori probability; perceptual linear prediction; shifted Delta cepstra; speech segments; speech utterance language; universal background model; Accuracy; Feature extraction; Gaussian mixture model; Mel frequency cepstral coefficient; Speech; Acoustic; GMM; Language Identification; Philippine Languages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Region 10 Symposium, 2014 IEEE
Conference_Location
Kuala Lumpur
Print_ISBN
978-1-4799-2028-0
Type
conf
DOI
10.1109/TENCONSpring.2014.6863115
Filename
6863115