Title :
An initial prototype system for Chinese spoken document understanding and organization for indexing/browsing and retrieval applications
Author :
Lee, Lin-shan ; Chen, Shun-Chuan ; Ho, Yuan ; Chen, Jia-Fu ; Li, Ming-Han ; Li, Te-Hsuan
Author_Institution :
Nat. Taiwan Univ., Taipei, Taiwan
Abstract :
The most attractive form of future network content will be multimedia. When voice information is included, it usually carries the core concepts of the content. Thus, a spoken document associated with multimedia content can very likely serve as the key for indexing/browsing and retrieval. However, unlike written documents, multimedia or voice information is very often just audio/video signals. These are very difficult to index, browse or retrieve, since users cannot go through each of them from beginning to end during browsing. A possible approach is to segment the audio/video signals automatically into short paragraphs, each with a central concept or topic, and then automatically generate a title and/or a summary for each of these, in either speech or text form. The topics and central concepts described in the segmented short paragraphs may then be further analyzed and organized into graphic structures describing the relationships among these topics and central concepts. Hence, the multimedia content can be automatically indexed much more efficiently, and browsed and retrieved by the user based on the title, summary and graphic structure. We refer to this as the understanding and organization of spoken documents. An initial prototype system for such functions, with broadcast news taken as the example multimedia content, is presented. The graphic structures used to describe the relationships among the topics and central concepts are 2-dimensional tree structures developed based on probabilistic latent semantic analysis.
Keywords :
content-based retrieval; information networks; multimedia databases; speech processing; speech recognition; speech synthesis; statistical analysis; text analysis; tree data structures; 2-dimensional tree structures; 2D tree structures; Chinese spoken document understanding; browsing applications; central concepts; indexing applications; information network; multimedia content; probabilistic latent semantic analysis; retrieval applications; text-to-speech synthesis; topics; voice information; Content based retrieval; Digital multimedia broadcasting; Graphics; Indexing; Multimedia communication; Multimedia systems; Prototypes; Signal generators; Speech; Tree data structures;
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
DOI :
10.1109/CHINSL.2004.1409653
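
Illustrative sketch :
The abstract names probabilistic latent semantic analysis (PLSA) as the basis for organizing topics. The following is a minimal, generic sketch of PLSA fitted by expectation-maximization on a toy document-term count matrix. It is not the authors' implementation and does not reproduce their 2-dimensional tree structure. The function names `plsa` and `normalize`, the parameters, and the toy counts are all hypothetical and chosen only for illustration.

```python
import random


def normalize(row):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(row)
    if total == 0:
        # Fall back to a uniform distribution when there is no mass.
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit a generic PLSA model by EM (illustrative, hypothetical helper).

    counts[d][w] is the number of times term w occurs in document d.
    Returns (p_z_d, p_w_z): P(topic | document) and P(term | topic).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    # Random initialisation of both conditional distributions.
    p_z_d = [normalize([rng.random() + 0.01 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.01 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        acc_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    acc_z_d[d][z] += n * post[z]
                    acc_w_z[z][w] += n * post[z]
        # M-step: renormalise the expected counts.
        p_z_d = [normalize(row) for row in acc_z_d]
        p_w_z = [normalize(row) for row in acc_w_z]
    return p_z_d, p_w_z


# Toy data (assumed): four short "documents" over a four-term vocabulary.
toy_counts = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
]
doc_topics, topic_terms = plsa(toy_counts, n_topics=2)
for d, row in enumerate(doc_topics):
    print(d, [round(p, 3) for p in row])
```

In a system of the kind the abstract describes, the rows of `doc_topics` could serve as topic mixtures for the segmented paragraphs; how the prototype maps such distributions onto its tree structures is not stated in this record.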