Author_Institution :
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Abstract :
Compared with the continuous density hidden Markov model (CDHMM), the discrete HMM (DHMM) has inherently attractive properties: computing a state output probability takes only O(1) time (a table lookup), and the discrete features can be encoded in fewer bits than cepstral coefficients, lowering the bandwidth requirement in a distributed speech recognition architecture. Unfortunately, the recognition performance of the conventional DHMM is significantly worse than that of the CDHMM, due to large quantization error and the use of multiple independent streams. One way to reduce the quantization error and improve recognition accuracy is to use a very large codebook. However, the training data is usually not sufficient to robustly train such a high-density DHMM (HDDHMM). In this paper, we investigate a subspace approach together with sub-vector quantization to solve the training problem of the HDDHMM. The resulting model is called the subspace HDDHMM (SHDDHMM). On both the Resource Management and the Wall Street Journal 5K-vocabulary tasks, the SHDDHMM achieves recognition accuracy comparable to that of its CDHMM counterpart, with faster decoding and a lower bandwidth requirement.
Keywords :
hidden Markov models; probability; speech coding; speech recognition; CDHMM; SHDDHMM; Wall Street Journal 5K-vocabulary task; automatic speech recognition; cepstral coefficients; continuous density hidden Markov model; decoding speed; discrete features; distributed speech recognition architecture; quantization error reduction; Resource Management; state output probability; subspace high-density discrete hidden Markov model; Acoustics; Computational modeling; Decoding; Hidden Markov models; Quantization; Speech recognition; Vectors; high-density discrete HMM; semi-continuous HMM; subspace modeling; subvector quantization