Title :
Improved lattice rescoring by using speech attributes in Large Vocabulary Continuous Speech Recognition systems
Author :
Xinglong Gao ; Qingqing Zhang ; Jielin Pan
Author_Institution :
Univ. of Chinese Academy of Sci., Beijing, China
Abstract :
Acoustic modeling in Large Vocabulary Continuous Speech Recognition (LVCSR) systems, which is normally based on context-dependent phones, is heavily limited in its capability to represent the relationship between transcriptions and the corresponding variation in raw speech utterances. To describe this relationship more accurately, this paper presents an alternative strategy in which speech attributes are used to capture acoustic characteristics and improve LVCSR performance. A series of relevant experiments shows that speech attributes can serve as a complementary knowledge source that provides richer information than a basic phone-based system. Hence, speech attribute information is integrated into a phone-based LVCSR system during lattice rescoring. For both read-speech and Conversational Telephone Speech (CTS) LVCSR tasks, experimental results show that the combined system reduces Word Error Rate (WER) by about 3-5% relative.
Keywords :
matrix algebra; speech recognition; CTS; LVCSR system; WER; acoustic characteristics; complementary knowledge resources; context-dependent phone; conversational telephone speech; improved lattice rescoring; large vocabulary continuous speech recognition systems; raw speech utterance variation; representative capability; speech attribute information; word error rate; Acoustics; Detectors; Hidden Markov models; Lattices; Speech; Speech processing; Speech recognition;
Conference_Title :
Image and Signal Processing (CISP), 2013 6th International Congress on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-2763-0
DOI :
10.1109/CISP.2013.6743974