• DocumentCode
    294531
  • Title
    Implementation of the POW (phonetically optimized words) algorithm for speech database
  • Author
    Lim, Yeonja; Lee, Youngjik

  • Author_Institution
    Autom. Interpretation Section, Electron. & Telecommun. Res. Inst., Seoul, South Korea
  • Volume
    1
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    89
  • Abstract
    The paper proposes the concept of the POW (phonetically optimized words) set. A speech database should include all possible phonological phenomena and, preferably, have the same phonological distribution as general speech. For this purpose, the authors suggest a new algorithm for selecting a word set such that (1) it includes all phonological events, (2) it has the minimal number of words, and (3) the phonological similarity between the POW set and the high-frequency word set is maximized. The authors extract the Korean POW set from 50,000 high-frequency words taken out of a 3 million text corpus. The POW set is much more similar to the high-frequency word set than the PBW (phonetically balanced words) set is, and it contains fewer words.
  • Keywords
    natural languages; speech recognition; Korean; POW; algorithm; phonetically optimized words; phonological distribution; speech database; word set; Databases; Entropy; Error analysis; Frequency; Information theory; Large-scale systems; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type
    conf
  • DOI
    10.1109/ICASSP.1995.479280
  • Filename
    479280