• DocumentCode
    2406912
  • Title
    Unsupervised spoken term detection with acoustic segment model
  • Author
    Wang, Haipeng ; Lee, Tan ; Leung, Cheung-Chi
  • Author_Institution
    Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
  • fYear
    2011
  • fDate
    26-28 Oct. 2011
  • Firstpage
    106
  • Lastpage
    111
  • Abstract
    This paper describes a study on query-by-example spoken term detection (STD) using the acoustic segment modeling technique. Acoustic segment models (ASMs) are a set of hidden Markov models (HMMs) that are obtained in an unsupervised manner, without using any transcription information. ASM training follows an iterative procedure consisting of initial segmentation, segment labeling, and HMM parameter estimation. The ASMs are incorporated into a template-matching framework for query-by-example STD. Both the spoken query examples and the test utterances are represented by frame-level ASM posteriorgrams. Segmental dynamic time warping (DTW) is applied to match the query against the test utterance and locate possible occurrences. The performance of the proposed approach is evaluated with different DTW local distance measures on the TIMIT and Fisher corpora. Experimental results show that ASM posteriorgrams give consistently better detection performance than conventional GMM posteriorgrams.
  • Keywords
    database management systems; hidden Markov models; parameter estimation; query processing; DTW local distance measures; Fisher Corpora; HMM parameter estimation; TIMIT; acoustic segment modeling technique; frame-level ASM posteriorgrams; initial segmentation; iterative procedure; query-by-example spoken term detection; segmental dynamic time warping; segments labeling; template-matching framework; test utterance; Acoustic Segment Model; Posteriorgram; Query-by-Example; Unsupervised Spoken Term Detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Speech Database and Assessments (Oriental COCOSDA), 2011 International Conference on
  • Conference_Location
    Hsinchu
  • Print_ISBN
    978-1-4577-0930-2
  • Type
    conf
  • DOI
    10.1109/ICSDA.2011.6085989
  • Filename
    6085989
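
The template-matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain subsequence DTW, a simpler relative of the segmental DTW named in the abstract, and it assumes a negative-log inner-product local distance, which is one common choice for posteriorgrams (the abstract does not say which distance measures were compared). The function names `neg_log_inner_product` and `subsequence_dtw` and the toy three-unit posteriorgrams are hypothetical.

```python
import math


def neg_log_inner_product(p, q, eps=1e-10):
    """Assumed local distance between two posterior vectors: -log(p . q)."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))


def subsequence_dtw(query, test, dist=neg_log_inner_product):
    """Find the region of `test` that best matches `query`.

    query, test: posteriorgrams, i.e. lists of per-frame posterior vectors.
    Returns (accumulated cost / query length, start_frame, end_frame),
    where the frame indices refer to `test`.
    """
    n, m = len(query), len(test)
    inf = float("inf")
    # cost[i][j]: lowest accumulated cost of a path that has consumed
    # query[0..i] and ends at test frame j.
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: test frame at which that path began.
    start = [[0] * m for _ in range(n)]

    # A match may begin at any frame of the test utterance.
    for j in range(m):
        cost[0][j] = dist(query[0], test[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Allowed predecessors: (i-1, j), (i-1, j-1), (i, j-1).
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
                candidates.append((cost[i][j - 1], start[i][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = best_cost + dist(query[i], test[j])
            start[i][j] = best_start

    # The match may also end at any test frame; take the cheapest one.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end] / n, start[n - 1][end], end


if __name__ == "__main__":
    # Toy posteriors over three hypothetical acoustic units.
    A = [0.90, 0.05, 0.05]
    B = [0.05, 0.90, 0.05]
    C = [0.05, 0.05, 0.90]
    query = [A, B]
    test = [C, C, A, A, B, C]
    print(subsequence_dtw(query, test))
```

In this toy case the query "A B" is located at test frames 3 to 4. A detection system would apply a threshold to the returned cost and would typically report several candidate regions per utterance rather than only the best one.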