• DocumentCode
    454584
  • Title
    Unsupervised Word Acquisition from Speech using Pattern Discovery
  • Author
    Park, Alex ; Glass, James R.
  • Author_Institution
    Comput. Sci. & Artificial Intelligence Lab., MIT, Cambridge, MA
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    In this paper, we present an unsupervised method for automatically discovering words from speech using a combination of acoustic pattern discovery, graph clustering, and baseform searching. The algorithm we propose represents an alternative to traditional methods of speech recognition and makes use of the acoustic similarity of multiple realizations of the same words or phrases. On a set of three academic lectures on different subjects, we show that the clustering component of the algorithm is able to successfully generate word clusters that have good coverage of subject-relevant words. Moreover, we illustrate how to use the cluster nodes to retrieve the word identity of each cluster from a large baseform dictionary. Results indicate that this algorithm may prove useful for applications such as vocabulary initialization, speech summarization, or augmentation of existing recognition systems.
  • Keywords
    pattern clustering; speech recognition; acoustic pattern discovery; automatic word discovery; baseform dictionary; baseform searching; clustering component; graph clustering; speech summarization; subject-relevant words; unsupervised word acquisition; vocabulary initialization; word cluster generation; Artificial intelligence; Audio recording; Automatic speech recognition; Clustering algorithms; Computer science; Glass; Laboratories; Streaming media; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type
    conf
  • DOI
    10.1109/ICASSP.2006.1660044
  • Filename
    1660044