DocumentCode :
2770494
Title :
Analytical comparison between position specific posterior lattices and confusion networks based on words and subword units for spoken document indexing
Author :
Pan, Yi-Cheng ; Chang, Hung-lin ; Lee, Lin-shan
Author_Institution :
Nat. Taiwan Univ., Taipei
fYear :
2007
fDate :
9-13 Dec. 2007
Firstpage :
677
Lastpage :
682
Abstract :
In this paper we analytically compare two widely accepted approaches to spoken document indexing, position specific posterior lattices (PSPL) and confusion networks (CN), in terms of retrieval accuracy and index size. The fundamental distinctions between these two approaches in terms of construction units, posterior probabilities, number of clusters, indexing coverage and space requirements are discussed in detail. A new approach to approximating subword posterior probabilities in a word lattice is also incorporated into PSPL/CN to handle OOV/rare word problems, which were unaddressed in the original PSPL and CN approaches. Extensive experimental results on Chinese broadcast news segments indicate that PSPL offers higher accuracy than CN but requires much larger disk space, while subword-based PSPL turns out to be very attractive because it lowers the storage cost while offering even higher accuracy.
Keywords :
indexing; information retrieval; speech recognition; Chinese broadcast news segments; confusion networks; position specific posterior lattices; retrieval accuracy; spoken document indexing; sub word units; subword posterior probability; Automatic speech recognition; Broadcasting; Computer science; Content based retrieval; Costs; Indexing; Information analysis; Information retrieval; Internet; Lattices; PSPL; S-PSPL; Spoken Document Retrieval;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
Type :
conf
DOI :
10.1109/ASRU.2007.4430193
Filename :
4430193