• DocumentCode
    1868448
  • Title
    Automatic identification of broadcast news story boundaries using the unification method for popular nouns
  • Author
    Khalaf, Zainab Ali; Tan Tien Ping
  • Author_Institution
    Sch. of Comput. Sci., Univ. Sains Malaysia USM, Minden, Malaysia
  • fYear
    2013
  • fDate
    8-11 Sept. 2013
  • Firstpage
    577
  • Lastpage
    584
  • Abstract
    Herein we describe the latent semantic algorithm method for identifying broadcast news story boundaries. The proposed system uses the pronounced forms of words to identify story boundaries based on popular noun unification. Commonly used clustering methods use latent semantic analysis (LSA) because of its excellent performance and because it is based on deep semantic rather than shallow principles. In this study, the LSA algorithm with and without unification was used to identify boundaries of Malay spoken broadcast news stories. The LSA algorithm with the noun unification approach resulted in less error and better performance than the LSA algorithm without noun unification.
  • Keywords
    document handling; information resources; LSA algorithm; Malay spoken broadcast news stories; automatic identification; broadcast news story boundaries; latent semantic algorithm method; latent semantic analysis; popular noun unification; unification method; Algorithm design and analysis; Gold; Matrix decomposition; Natural language processing; Semantics; Speech; Writing; broadcast news; latent semantic analysis; spoken document; story boundary identification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on
  • Conference_Location
    Kraków
  • Type
    conf
  • Filename
    6644059