  • DocumentCode
    180191
  • Title
    Unsupervised query-by-example spoken term detection using segment-based Bag of Acoustic Words
  • Author
    George, Boby ; Yegnanarayana, B.
  • Author_Institution
    Speech & Vision Lab., Int. Inst. of Inf. Technol., Hyderabad, India
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    7133
  • Lastpage
    7137
  • Abstract
    In this work, we present an unsupervised framework to address the problem of spotting spoken terms in large speech databases. The proposed segment-based Bag of Acoustic Words (BoAW) framework is inspired by the Bag of Words (BoW) approach widely used in text retrieval systems. Since this model ignores the sequence information in speech samples for efficient indexing of the database, a Dynamic Time Warping (DTW) based temporal matching technique is used to re-rank the results and restore the time sequence information. The speech data is stored efficiently in an inverted index, which makes retrieval very fast and makes this framework particularly useful for searching large databases. We address the issue of choosing the appropriate size of the speech segment for reliable indexing. Comparison with other query-by-example spoken term detection systems shows that the proposed system outperforms them.
  • Keywords
    query processing; speech recognition; unsupervised learning; dynamic time warping; segment based bag of acoustic words; sequence information; temporal matching technique; text retrieval systems; unsupervised query by example spoken term detection; Acoustics; Histograms; Indexing; Speech; Vocabulary; Bag of Acoustic Words; query-by-example; segment ranking; spoken term detection; template matching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854984
  • Filename
    6854984
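
The two-stage retrieval described in the abstract (bag-of-words lookup through an inverted index, then DTW re-ranking of the shortlisted segments) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the features, codebook, segment size, or scoring function, so the nearest-codeword quantizer, the fixed non-overlapping segments of `seg_len` frames, the histogram-intersection score, and all function names below are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math


def quantize(frame, codebook):
    """Return the index of the nearest codeword (the frame's 'acoustic word')."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def build_index(utterances, codebook, seg_len):
    """Inverted index: acoustic word -> {(utt_id, seg_start): count in that segment}.

    Assumption: segments are fixed-length and non-overlapping.
    """
    index = defaultdict(dict)
    for utt_id, frames in utterances.items():
        for start in range(0, len(frames), seg_len):
            segment = frames[start:start + seg_len]
            hist = Counter(quantize(f, codebook) for f in segment)
            for word, count in hist.items():
                index[word][(utt_id, start)] = count
    return index


def search(query, index, codebook, top_n):
    """Shortlist segments by histogram intersection (an assumed score).

    Only segments sharing at least one acoustic word with the query are touched.
    """
    q_hist = Counter(quantize(f, codebook) for f in query)
    scores = Counter()
    for word, q_count in q_hist.items():
        for seg, count in index.get(word, {}).items():
            scores[seg] += min(q_count, count)
    return [seg for seg, _ in scores.most_common(top_n)]


def dtw(a, b):
    """Length-normalized DTW cost between two frame sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def rerank(query, candidates, utterances, seg_len):
    """Re-order shortlisted segments by DTW cost, restoring temporal order information."""
    def seg_cost(seg):
        utt_id, start = seg
        return dtw(query, utterances[utt_id][start:start + seg_len])
    return sorted(candidates, key=seg_cost)


# Toy data: 1-D "features" and a 3-word codebook (both hypothetical).
codebook = [(0.0,), (1.0,), (2.0,)]
utterances = {
    "u1": [(0.0,), (0.1,), (1.0,), (1.1,)],
    "u2": [(2.0,), (2.1,), (0.0,), (2.0,)],
}
index = build_index(utterances, codebook, seg_len=2)
query = [(0.0,), (2.0,)]
shortlist = search(query, index, codebook, top_n=3)
ranked = rerank(query, shortlist, utterances, seg_len=2)
```

In this toy setup the bag-of-words stage only narrows the search to segments sharing acoustic words with the query; the DTW stage then compares frame sequences directly, which is where ordering within a segment is taken into account.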