Title :
Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion
Author :
Chung-Hsien Wu ; Han-Ping Shen ; Chun-Shan Hsu
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Abstract :
This paper proposes a new paradigm for code-switching event detection based on latent language space models (LLSMs) and the delta-Bayesian information criterion (ΔBIC). A phone-based Mandarin-English speech recognizer was first employed to obtain the senone sequence of a speech utterance. For each senone, acoustic features and the posterior probability of the articulatory features (AFs) were extracted and applied to an eigenspace transformation based on principal component analysis (PCA). Latent semantic analysis (LSA) was then adopted for constructing a matrix to model the importance of each principal component in the eigenspace for the senones and AFs in each language. The spatial relationships among the senones (or AFs) represented by the PCA-transformed eigenvalues in the LSA-based matrix were employed to construct an LLSM for characterizing a language. In code-switching event detection, the language likelihood between the input speech LLSM and each of the language-dependent LLSMs was estimated. Euclidean-distance-based similarities and cosine-angle-distance-based similarities were adopted for estimating the language likelihood for senones and AFs. The ΔBIC was then used for estimating the language transition score for each hypothesized code-switching event. Finally, a dynamic programming algorithm was employed to obtain the most likely code-switching language sequence. The proposed approach was evaluated using a Mandarin-English code-switching speech database and outperformed other conventional methods. A duration accuracy of 72.45% was obtained by the proposed system with optimized parameters.
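As a rough illustration of the transition-scoring step, the sketch below computes a ΔBIC value for one hypothesized change point in a feature sequence. It assumes the widely used full-covariance Gaussian formulation of ΔBIC from acoustic change detection; the abstract does not state how the paper applies ΔBIC to LLSM-based likelihoods, so this is not the authors' method. The names `delta_bic`, `_logdet_cov`, and `penalty_weight` are hypothetical.

```python
import numpy as np


def _logdet_cov(x, eps=1e-6):
    """Log-determinant of the ML sample covariance of x (frames x dims)."""
    d = x.shape[1]
    # bias=True gives the maximum-likelihood estimate; eps keeps it invertible.
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + eps * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, t, penalty_weight=1.0):
    """Delta-BIC for splitting x at frame t (assumed Gaussian segment models).

    Positive values favor modeling x[:t] and x[t:] separately, i.e. a
    change point at t; negative values favor a single model.
    """
    n, d = x.shape
    left, right = x[:t], x[t:]
    # Likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (
        n * _logdet_cov(x)
        - len(left) * _logdet_cov(left)
        - len(right) * _logdet_cov(right)
    )
    # Penalty for the extra mean vector and covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty
```

In a detector of this kind, such a score would be evaluated at each candidate boundary and passed to the sequence search as a transition score. The `penalty_weight` factor trades off missed switches against false alarms and would need tuning on held-out data.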
Keywords :
dynamic programming; eigenvalues and eigenfunctions; feature extraction; natural language processing; principal component analysis; speech recognition; Euclidean distance based similarities; Latent semantic analysis; Mandarin-English code-switching speech database; acoustic features; articulatory features extraction; code switching event detection; code switching language sequence; cosine-angle-distance based similarities; delta-Bayesian information criterion; dynamic programming algorithm; eigenspace transformation; language likelihood estimation; language transition score; latent language space models; phone based Mandarin-English speech recognizer; posterior probability; principal component analysis; senone sequence; speech utterance; Acoustics; Event detection; Feature extraction; Hidden Markov models; Speech; Speech processing; Speech recognition; Articulatory features; code-switching event detection; delta-Bayesian information criterion (ΔBIC); latent language space model; senones
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2015.2456417