Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm

Author

Kong, Sheng-Yi ; Wang, Chien-chi ; Kuo, Ko-chien ; Lee, Lin-shan

Author_Institution

Speech Lab., Nat. Taiwan Univ., Taipei

fYear

2008

fDate

15-19 Dec. 2008

Firstpage

165

Lastpage

168

Abstract

Automatic title generation for spoken documents is believed to be an important key to browsing and navigating huge quantities of multimedia content. This paper proposes a new framework for automatic title generation for Chinese spoken documents, using a delicate scored Viterbi algorithm performed over automatically generated text summaries of the test spoken documents. The Viterbi beam search is guided by a delicate score evaluated from three sets of models: a term selection model identifies the terms most suitable for inclusion in the title, a term ordering model gives the ordering of those terms that makes the title most readable, and a title length model estimates a reasonable length for the title. The models are trained on a training corpus that is not required to match the test spoken documents. Both objective evaluation based on the F1 measure and subjective human evaluation of relevance and readability indicate that the approach is very attractive.
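
The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the three model scores are combined, so the sketch assumes they are simply added as log-scores. The function name `generate_title`, the `UNSEEN` back-off value, the `<s>` start symbol, and all toy scores are hypothetical. Each search state is the last term chosen, the best path per state is kept (Viterbi recombination), and only the top `beam` states survive each step. Skipping terms already on a path makes the recombination approximate.

```python
import math

UNSEEN = -5.0  # hypothetical back-off log-score for unseen term pairs or lengths

def generate_title(terms, select_lp, order_lp, length_lp, max_len=4, beam=2):
    """Return (score, title_terms) for the best-scoring title found."""
    # Live hypotheses are keyed by their last term (the Viterbi state).
    live = {"<s>": (0.0, ())}
    best = (-math.inf, ())
    for _ in range(max_len):
        nxt = {}
        for prev, (score, path) in live.items():
            for term in terms:
                if term in path:  # assumption: a term appears at most once
                    continue
                # Term selection score plus term ordering (bigram) score.
                s = score + select_lp[term] + order_lp.get((prev, term), UNSEEN)
                # Recombination: keep only the best path ending in this term.
                if term not in nxt or s > nxt[term][0]:
                    nxt[term] = (s, path + (term,))
        if not nxt:
            break
        # Beam pruning: keep the highest-scoring states only.
        kept = sorted(nxt.items(), key=lambda kv: kv[1][0], reverse=True)[:beam]
        live = dict(kept)
        # The title length score is applied when a path is treated as complete.
        for score, path in live.values():
            total = score + length_lp.get(len(path), UNSEEN)
            if total > best[0]:
                best = (total, path)
    return best

# Toy log-scores, invented purely for illustration.
terms = ["speech", "title", "generation", "automatic"]
select_lp = {"automatic": -1.0, "title": -0.5, "generation": -0.7, "speech": -3.0}
order_lp = {("<s>", "automatic"): -0.2,
            ("automatic", "title"): -0.1,
            ("title", "generation"): -0.1}
length_lp = {1: -4.0, 2: -2.0, 3: -0.5, 4: -3.0}

score, title = generate_title(terms, select_lp, order_lp, length_lp)
print(title)  # -> ('automatic', 'title', 'generation')
```

In this toy run the length scores favour three-term titles, so the search stops preferring longer paths even though a fourth term is available.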

Keywords

maximum likelihood estimation; natural language processing; speech processing; text analysis; Chinese spoken document; Viterbi beam search; automatic title generation; delicate scored Viterbi algorithm; term ordering; term selection; text summary; title length; Anthropometry; Automatic testing; Bandwidth; Educational institutions; Humans; IP networks; Navigation; Performance evaluation; Speech; Viterbi algorithm; Spoken documents; title generation

fLanguage

English

Publisher

IEEE

Conference_Title

Spoken Language Technology Workshop, 2008. SLT 2008. IEEE

Conference_Location

Goa

Print_ISBN

978-1-4244-3471-8

Electronic_ISBN

978-1-4244-3472-5

Type

conf

DOI

10.1109/SLT.2008.4777866

Filename

4777866