Experiments on automatic language identification for Philippine languages using acoustic Gaussian Mixture Models

Author

Laguna, Ann Franchesca ; Guevara, Rowena Cristina

Author_Institution

Digital Signal Process. Lab., Univ. of the Philippines, Diliman, Quezon City, Philippines

fYear

2014

fDate

14-16 April 2014

Firstpage

657

Lastpage

662

Abstract

No language identification (LID) system for Philippine languages has previously been created, because of the limited amount of recorded speech data. This research initiates LID research for Philippine languages using the Philippine Language Database (PLD) collected by the Digital Signal Processing Laboratory of the University of the Philippines Diliman (DSP-UPD). Mel Frequency Cepstral Coefficient (MFCC), Perceptual Linear Prediction (PLP), Shifted Delta Cepstra (SDC), and Linear Predictive Cepstral Coefficient (LPCC) features are extracted from the speech segments. Gaussian Mixture Models (GMMs), using Expectation Maximization (EM) and a Universal Background Model (UBM) approach, are used to model the acoustic characteristics of each language. The Maximum a Posteriori (MAP) probability is then used to determine the language of a speech utterance based on the language GMMs. Among the four feature vectors, PLP with a 16-mixture GMM-EM was found to give the best performance in discriminating the languages.
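
The decision step described in the abstract can be illustrated with a minimal sketch, assuming diagonal-covariance GMMs whose parameters are already trained. It scores every feature frame of an utterance against each language GMM, adds a log prior, and picks the highest-scoring language. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the labels `lang_a` and `lang_b`, the 2-dimensional toy features, and the hand-picked GMM parameters. Feature extraction (MFCC, PLP, SDC, LPCC) and EM or UBM training are not shown.

```python
import math

def gmm_frame_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        comp_logs.append(ll)
    # log-sum-exp over mixture components for numerical stability
    m = max(comp_logs)
    return m + math.log(sum(math.exp(c - m) for c in comp_logs))

def utterance_loglik(frames, gmm):
    """Sum of frame log-likelihoods (frames treated as independent)."""
    weights, means, variances = gmm
    return sum(gmm_frame_loglik(x, weights, means, variances) for x in frames)

def identify_language(frames, language_gmms, log_priors=None):
    """MAP decision: argmax over languages of log-likelihood + log prior.

    With log_priors=None the priors are taken as equal, so the rule
    reduces to picking the maximum-likelihood language.
    """
    scores = {}
    for lang, gmm in language_gmms.items():
        prior = 0.0 if log_priors is None else log_priors[lang]
        scores[lang] = utterance_loglik(frames, gmm) + prior
    return max(scores, key=scores.get), scores

# Hypothetical toy models: (weights, means, variances), 2 mixtures, 2-D features.
toy_gmms = {
    "lang_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "lang_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Hypothetical feature frames lying near the lang_a components.
toy_frames = [[0.2, -0.1], [0.9, 1.1], [0.5, 0.4]]

best_lang, lang_scores = identify_language(toy_frames, toy_gmms)
print(best_lang)
```

In a setup like the one the abstract reports, each language model would have 16 mixture components and the frames would be PLP vectors; the toy values above only keep the example small.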

Keywords

Gaussian processes; acoustic signal processing; audio databases; cepstral analysis; expectation-maximisation algorithm; mixture models; natural language processing; probability; speech processing; DSP-UPD; EM; GMM; LPCC features; MAP probability; MFCC; Mel frequency cepstral coefficients; PLD; PLP; Philippine LID system; Philippine Language Database; Philippine languages; SDC; UBM approach; University of the Philippines Diliman; acoustic Gaussian mixture models; automatic language identification; digital signal processing laboratory; expectation maximization; feature vectors; language acoustic characteristics; linear predictive cepstral coefficients; maximum a posteriori probability; perceptual linear prediction; shifted Delta cepstra; speech segments; speech utterance language; universal background model; Accuracy; Feature extraction; Gaussian mixture model; Mel frequency cepstral coefficient; Speech; Acoustic; GMM; Language Identification; Philippine Languages;

fLanguage

English

Publisher

IEEE

Conference_Title

Region 10 Symposium, 2014 IEEE

Conference_Location

Kuala Lumpur

Print_ISBN

978-1-4799-2028-0

Type

conf

DOI

10.1109/TENCONSpring.2014.6863115

Filename

6863115