Title :
Call transcript segmentation using word cooccurrence model
Author :
Ikbal, Shajith ; Visweswariah, Karthik
Author_Institution :
IBM Res., Bangalore, India
Abstract :
In this paper, we propose a word cooccurrence model to perform topic segmentation of call center conversational speech. The model is estimated from training data to discriminatively represent how likely various pairs of words are to cooccur within homogeneous topic segments. We show that such a model provides an effective measure of lexical cohesion and hence provides useful evidence of topical coherence, or lack thereof, between various parts of the call transcripts. We propose two approaches to utilizing such evidence for segmentation: 1) an efficient dynamic programming algorithm that performs segmentation using only the word cooccurrence model, and 2) features extracted from the word cooccurrence model that are used as additional features in conditional random field (CRF) based segmentation. Experimental evaluation of these approaches against state-of-the-art approaches shows the effectiveness of the word cooccurrence model for the topic segmentation task.
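The first approach can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic dynamic program written under stated assumptions. The names `cohesion`, `segment`, `toy_score`, and `calls`, the mean-pairwise cohesion measure, and the per-segment `penalty` are all hypothetical choices made for illustration. The sketch assumes a table of pairwise word scores is already available (positive when two words tend to share a topic segment, negative when they do not) and picks the segmentation that maximizes total within-segment cohesion minus a fixed cost per segment.

```python
from itertools import product


def cohesion(u, v, score):
    """Mean pairwise word score between two utterances (0.0 for unseen pairs)."""
    pairs = list(product(u, v))
    if not pairs:
        return 0.0
    return sum(score.get(frozenset((a, b)), 0.0) for a, b in pairs) / len(pairs)


def segment(utterances, score, penalty=0.5):
    """Return (start, end) index pairs of the highest-scoring segmentation.

    Assumed objective (illustrative only): the sum of cohesion over all
    utterance pairs inside each segment, minus `penalty` per segment.
    """
    n = len(utterances)
    coh = [[cohesion(utterances[i], utterances[j], score) for j in range(n)]
           for i in range(n)]

    # seg[i][j]: total cohesion of all utterance pairs inside utterances[i:j],
    # built incrementally by adding utterance j-1 to the span utterances[i:j-1].
    seg = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 2, n + 1):
            seg[i][j] = seg[i][j - 1] + sum(coh[k][j - 1] for k in range(i, j - 1))

    # best[j]: best objective for the first j utterances; back[j]: start of the
    # last segment in that best solution.
    best = [0.0] + [float("-inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + seg[i][j] - penalty
            if cand > best[j]:
                best[j], back[j] = cand, i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    while j > 0:
        bounds.append((back[j], j))
        j = back[j]
    return bounds[::-1]


# Toy data (invented for illustration, not taken from the paper).
toy_score = {
    frozenset(("bill", "payment")): 1.0,
    frozenset(("payment", "due")): 1.0,
    frozenset(("payment",)): 1.0,          # a word paired with itself
    frozenset(("router", "reset")): 1.0,
    frozenset(("reset", "modem")): 1.0,
    frozenset(("reset",)): 1.0,
    frozenset(("payment", "router")): -1.0,
    frozenset(("due", "reset")): -1.0,
}
calls = [["bill", "payment"], ["payment", "due"],
         ["router", "reset"], ["reset", "modem"]]
print(segment(calls, toy_score))
```

On this toy input the billing utterances and the router utterances end up in separate segments. The per-segment penalty is one simple way to keep the dynamic program from splitting every utterance into its own segment; a trained model would supply the pairwise scores in place of the hand-written table.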
Keywords :
call centres; speech processing; word processing; call center conversational speech; call transcript segmentation; conditional random field; lexical cohesion measurement; topic segmentation; topical coherence; word cooccurrence model; complementary features; dynamic programming algorithm; word cooccurrence
Conference_Title :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
DOI :
10.1109/SLT.2010.5700881