Regional Information Center for Science and Technology - Kernel topic segmentation for informal multi-party meetings and performance degradation caused by insufficient lexicon

DocumentCode :

2330801

Title :

Kernel topic segmentation for informal multi-party meetings and performance degradation caused by insufficient lexicon

Author :

Sadohara, Ken

Author_Institution :

Nat. Inst. of Adv. Ind. Sci. & Technol. (AIST), Tsukuba, Japan

fYear :

2010

fDate :

12-15 Dec. 2010

Firstpage :

430

Lastpage :

435

Abstract :

We herein propose a domain-independent topic segmentation algorithm for free-form multi-party meeting recordings. The advantage of the proposed algorithm is that topical and lexical knowledge, which are difficult to adapt to the target meeting before speech recognition and topic segmentation, are not required. For an errorful sequence of phonemes obtained using a continuous phoneme recognizer, the proposed algorithm exhaustively analyzes the occurrence pattern of subsequences of phonemes and partitions the sequence into segments with coherent patterns. An empirical study on the ICSI Meeting Corpus has indicated that it performs comparably to lexical-cohesion-based text segmenters applied to human transcripts. Furthermore, the performance of the text segmenters applied to LVCSR output decreases significantly when keywords are not included in the lexicon. This suggests that, for the purpose of obtaining topical structure, the phoneme sequence segmenter could be more robust than text segmenters with LVCSR.
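
The idea in the abstract, comparing occurrence patterns of phoneme subsequences and cutting where the pattern changes, can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a plain n-gram (spectrum) similarity and a simple adjacent-window comparison in place of the kernel method the record names, and every name in it (`ngram_counts`, `spectrum_similarity`, `best_boundary`, `window`, `max_n`) is hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(phonemes, max_n=3):
    """Count all contiguous phoneme subsequences of length 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(phonemes) - n + 1):
            counts[tuple(phonemes[i:i + n])] += 1
    return counts


def spectrum_similarity(a, b, max_n=3):
    """Cosine similarity of the n-gram count vectors of two sequences."""
    ca, cb = ngram_counts(a, max_n), ngram_counts(b, max_n)
    dot = sum(v * cb[k] for k, v in ca.items())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def best_boundary(phonemes, window=6, max_n=3):
    """Return the index whose left and right windows are least similar.

    Assumes len(phonemes) >= 2 * window; shorter input raises ValueError.
    """
    candidates = range(window, len(phonemes) - window + 1)
    return min(
        candidates,
        key=lambda i: spectrum_similarity(
            phonemes[i - window:i], phonemes[i:i + window], max_n
        ),
    )


# Toy input: single characters stand in for recognized phoneme symbols.
toy = list("abababab") + list("cdcdcdcd")
print(best_boundary(toy))
```

On the toy input the two halves share no subsequences, so the lowest-similarity cut falls where the pattern changes. Real recognizer output is errorful, which is why the abstract emphasizes analyzing subsequence patterns exhaustively rather than relying on exact word matches.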

Keywords :

speech recognition; kernel topic segmentation; informal multiparty meetings; insufficient lexicon; lexical knowledge; multiparty meeting recordings; performance degradation; text segmenters; topical knowledge; topic segmentation; kernel method; meeting summarization; string kernel; sub-word recognition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language Technology Workshop (SLT), 2010 IEEE

Conference_Location :

Berkeley, CA

Print_ISBN :

978-1-4244-7904-7

Electronic_ISBN :

978-1-4244-7902-3

Type :

conf

DOI :

10.1109/SLT.2010.5700891

Filename :

5700891

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2330801