DocumentCode :
1086096
Title :
A description of a parametrically controlled modular structure for speech processing
Author :
Dixon, N. ; Silverman, Harvey F.
Author_Institution :
IBM Thomas J. Watson Research Center, Yorktown Heights, N.Y.
Volume :
23
Issue :
1
fYear :
1975
fDate :
2/1/1975 12:00:00 AM
Firstpage :
87
Lastpage :
91
Abstract :
A system, the modular acoustic processor (MAP), consisting of two major components, has been designed for work in speech recognition. A versatile spectral analysis system, the parametrically controlled analyzer (PCA), serves as input to an hierarchically operated string transcriber (HOST). In the design of this system, controllability and modularity for developmental extensibility were primary concerns. The system, with the exception of initial high-fidelity, direct A/D conversion, is entirely implemented in software, PL/I, with appropriate JCL structures for running under OS/MVT on an IBM 360-91. As an adjunct for obtaining training data, a grayscale interactive system using an IBM 1800 process-control computer has also been implemented. PCA signal processing features parametric selection of several analysis methods, including discrete Fourier transform (DFT), linear predictive coding (LPC), and chirp z-transform (CZT). Also, selection may be made among various smoothing, normalization, interpolation, and F0 estimation methods. PCA develops high-quality spectrographic representations of speech for standard line printers, CRT display, and subsequent processing. PCA also performs spectral-similarity matching and training. HOST consists of a number of processes for performing segmentation, classification, and prosody analysis. Provision is made for complete commutability at the module level as well as at the algorithm level. The segmentation/classification output of HOST is augmented by estimates of confidence. PCA is a packaged, debugged, running system. A first version of HOST is operational.
Keywords :
Control system analysis; Control systems; Controllability; Discrete Fourier transforms; Linear predictive coding; Operating systems; Principal component analysis; Spectral analysis; Speech processing; Speech recognition;
fLanguage :
English
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
0096-3518
Type :
jour
DOI :
10.1109/TASSP.1975.1162630
Filename :
1162630