DocumentCode
294531
Title
Implementation of the POW (phonetically optimized words) algorithm for speech database
Author
Lim, Yeonja ; Lee, Youngjik
Author_Institution
Autom. Interpretation Section, Electron. & Telecommun. Res. Inst., Seoul, South Korea
Volume
1
fYear
1995
fDate
9-12 May 1995
Firstpage
89
Abstract
The paper proposes the concept of the POW (phonetically optimized words) set. A speech database should include all possible phonological phenomena, and it should preferably have the same phonological distribution as general speech. To this end, the authors suggest a new algorithm for selecting a word set such that (1) it includes all phonological events, (2) it has the minimal number of words, and (3) the phonological similarity between the POW set and the high-frequency word set is maximized. The authors extract the Korean POW set from 50,000 high-frequency words drawn from a 3 million text corpus. The POW set is much more similar to the high-frequency word set than the PBW (phonetically balanced words) set, and it contains fewer words.
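Criteria (1) and (2) in the abstract resemble a set-cover problem: choose few words whose phonological events together cover every event. The abstract does not describe the authors' actual procedure, so the following is only a minimal sketch of a generic greedy cover, not the POW algorithm itself. The function names `events` and `greedy_cover` are hypothetical, adjacent symbol pairs are an assumed stand-in for real phonological events, and breaking ties by corpus frequency is only a rough nod to criterion (3). A greedy cover is not guaranteed to be minimal.

```python
def events(word):
    """Toy stand-in for phonological events: adjacent symbol pairs (assumption)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def greedy_cover(freq):
    """Pick words until every event occurring in `freq` is covered.

    `freq` maps word -> corpus frequency. Each step takes the word that
    covers the most still-uncovered events; ties go to the more frequent
    word, then to alphabetical order.
    """
    word_events = {w: events(w) for w in freq}
    uncovered = set().union(*word_events.values())
    chosen = []
    while uncovered:
        # sorted() makes tie-breaking deterministic: max() keeps the first maximum.
        best = max(
            sorted(word_events),
            key=lambda w: (len(word_events[w] & uncovered), freq[w]),
        )
        chosen.append(best)
        uncovered -= word_events[best]
    return chosen
```

For example, `greedy_cover({"abc": 5, "bcd": 3, "ab": 10, "cd": 1})` selects two words that together cover the pairs `ab`, `bc`, and `cd`, where four words were available.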
Keywords
natural languages; speech recognition; Korean; POW; algorithm; phonetically optimized words; phonological distribution; speech database; word set; Databases; Entropy; Error analysis; Frequency; Information theory; Large-scale systems; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location
Detroit, MI
ISSN
1520-6149
Print_ISBN
0-7803-2431-5
Type
conf
DOI
10.1109/ICASSP.1995.479280
Filename
479280