• DocumentCode
    2659889
  • Title
    Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm

  • Author
    Kong, Sheng-Yi ; Wang, Chien-chi ; Kuo, Ko-chien ; Lee, Lin-shan
  • Author_Institution
    Speech Lab., Nat. Taiwan Univ., Taipei
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    165
  • Lastpage
    168
  • Abstract
    Automatic title generation for spoken documents is believed to be key to browsing and navigating huge quantities of multimedia content. This paper proposes a new framework for automatic title generation for Chinese spoken documents, using a delicate scored Viterbi algorithm performed over automatically generated text summaries of the testing spoken documents. The Viterbi beam search is guided by a delicate score evaluated from three sets of models: a term selection model identifies the most suitable terms to include in the title, a term ordering model gives the best ordering of the terms to make the title readable, and a title length model indicates a reasonable length for the title. The models are trained on a training corpus that need not match the testing spoken documents. Both objective evaluation based on the F1 measure and subjective human evaluation of relevance and readability indicated that the approach is very attractive.
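    The abstract names three models that jointly guide the search but does not say how their scores are combined. The following is a minimal sketch, not the authors' implementation, assuming that each model supplies log-probabilities, that the term ordering model is a bigram model, and that the three scores are summed with weights. The function name generate_title, the weights, the floor value and the "<s>" start symbol are all hypothetical.

```python
import math


def generate_title(terms, select_logp, order_logp, length_logp,
                   beam_width=5, weights=(1.0, 1.0, 1.0), floor=-10.0):
    """Beam search over term sequences drawn from a summary (illustrative only).

    select_logp: term -> log P(term appears in a title)       (assumed form)
    order_logp:  (previous term, term) -> bigram log-prob     (assumed form)
    length_logp: title length -> log P(length)                (assumed form)
    Entries missing from a table fall back to `floor`.
    """
    w_sel, w_ord, w_len = weights
    beam = [(0.0, ("<s>",))]               # (partial score, sequence so far)
    best_score, best_title = -math.inf, ()

    for n in range(1, max(length_logp) + 1):
        candidates = []
        for score, seq in beam:
            for term in terms:
                if term in seq:            # use each term at most once
                    continue
                step = (w_sel * select_logp.get(term, floor)
                        + w_ord * order_logp.get((seq[-1], term), floor))
                candidates.append((score + step, seq + (term,)))
        if not candidates:
            break
        # Prune: keep only the highest-scoring partial titles.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]

        # Score the best hypothesis of length n as a complete title.
        top_score, top_seq = beam[0]
        total = top_score + w_len * length_logp.get(n, floor)
        if total > best_score:
            best_score, best_title = total, top_seq[1:]   # drop "<s>"

    return list(best_title), best_score
```

    In this sketch the length score is added only when a hypothesis is treated as a finished title, so it decides where to stop without affecting pruning. The function is called with a list of candidate terms and the three score tables, and it returns the chosen term sequence together with its combined score.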
  • Keywords
    maximum likelihood estimation; natural language processing; speech processing; text analysis; Chinese spoken document; Viterbi beam search; automatic title generation; delicate scored Viterbi algorithm; term ordering; term selection; text summary; title length; Anthropometry; Automatic testing; Bandwidth; Educational institutions; Humans; IP networks; Navigation; Performance evaluation; Speech; Viterbi algorithm; Spoken documents; title generation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop, 2008. SLT 2008. IEEE
  • Conference_Location
    Goa
  • Print_ISBN
    978-1-4244-3471-8
  • Electronic_ISBN
    978-1-4244-3472-5
  • Type
    conf
  • DOI
    10.1109/SLT.2008.4777866
  • Filename
    4777866