Title :
Automatic identification of broadcast news story boundaries using the unification method for popular nouns
Author :
Khalaf, Zainab Ali ; Tan Tien Ping
Author_Institution :
Sch. of Comput. Sci., Univ. Sains Malaysia (USM), Minden, Malaysia
Abstract :
Herein we describe the latent semantic algorithm method for identifying broadcast news story boundaries. The proposed system uses the pronounced forms of words to identify story boundaries based on popular noun unification. Widely used clustering methods rely on latent semantic analysis (LSA) because of its excellent performance and because it is based on deep semantic principles rather than shallow ones. In this study, the LSA algorithm was applied with and without unification to identify the boundaries of Malay spoken broadcast news stories. The LSA algorithm with the noun unification approach produced fewer errors and better performance than the LSA algorithm without noun unification.
Keywords :
document handling; information resources; LSA algorithm; Malay spoken broadcast news stories; automatic identification; broadcast news story boundaries; latent semantic algorithm method; latent semantic analysis; popular noun unification; unification method; Algorithm design and analysis; Gold; Matrix decomposition; Natural language processing; Semantics; Speech; Writing; broadcast news; latent semantic analysis; spoken document; story boundary identification;
Conference_Title :
Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on
Conference_Location :
Kraków