An algorithm for automatic formant extraction using linear prediction spectra

Author

McCandless, Stephanie S.

Author_Institution

Massachusetts Institute of Technology, Lexington, Mass

Volume

22

Issue

2

fYear

1974

fDate

4/1/1974

Firstpage

135

Lastpage

141

Abstract

An algorithm is presented that finds the frequency and amplitude of the first three formants during all vowel-like segments of continuous speech. Its inputs are the peaks of the linear prediction spectra and a segmentation parameter that indicates energy and voicing. Ideally, the first three peaks are the first three formants. Frequently, however, two peaks merge or spurious peaks appear, and the difficult part is to recognize such situations and deal with them. The general method is to fill formant slots with the available peaks at each frame, based on frequency position relative to an educated guess. Then, if a peak is left over and/or a slot is unfilled, special routines are called to decide how to deal with them. Included is a formant enhancement technique, analogous to one that has been implemented via the chirp-z transform [8], which usually succeeds in separating two merged formants. Processing begins at the middle of each high-volume voiced segment, where formants are most likely to be correct, and branches outward from there in both directions in time, using the most recently found formant frequencies as the educated guess for the current frame. The algorithm has been implemented at Lincoln Laboratory on the Univac 1219 and on the Fast Digital Processor, a programmable processor [9], and has been tested on a large number of unrestricted sentences.
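
The slot-filling and outward-tracking steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the nearest-guess assignment rule and the names `fill_slots` and `track_segment` are assumptions made for illustration. The special routines for leftover peaks and unfilled slots, and the formant enhancement step, are not modeled.

```python
from typing import List, Optional, Tuple

Slots = List[Optional[float]]


def fill_slots(peaks_hz: List[float], guess_hz: List[float]) -> Tuple[Slots, List[float]]:
    """Fill formant slots with spectral peaks, one frame at a time.

    Assumed rule (not taken from the paper): each peak goes to the slot
    whose guessed frequency is nearest; if the slot is already taken, the
    peak closer to the guess keeps it and the other becomes a leftover.
    """
    slots: Slots = [None] * len(guess_hz)
    leftovers: List[float] = []
    for peak in sorted(peaks_hz):
        # Index of the slot whose guessed frequency is closest to this peak.
        i = min(range(len(guess_hz)), key=lambda k: abs(guess_hz[k] - peak))
        current = slots[i]
        if current is None:
            slots[i] = peak
        elif abs(peak - guess_hz[i]) < abs(current - guess_hz[i]):
            slots[i] = peak            # new peak displaces the old one
            leftovers.append(current)
        else:
            leftovers.append(peak)
    return slots, leftovers


def track_segment(frames: List[List[float]], anchor: int,
                  initial_guess: List[float]) -> List[Slots]:
    """Track formants outward from the anchor frame in both time directions.

    The guess for each frame is the most recently found set of formant
    frequencies; an unfilled slot keeps its previous guess.
    """
    result: List[Slots] = [[] for _ in frames]
    anchor_slots, _ = fill_slots(frames[anchor], initial_guess)
    result[anchor] = anchor_slots
    for step in (1, -1):               # forward in time, then backward
        guess = [s if s is not None else g
                 for s, g in zip(anchor_slots, initial_guess)]
        t = anchor + step
        while 0 <= t < len(frames):
            slots, _ = fill_slots(frames[t], guess)
            result[t] = slots
            guess = [s if s is not None else g for s, g in zip(slots, guess)]
            t += step
    return result
```

With a hypothetical guess of 500, 1500, and 2500 Hz, a frame containing a spurious peak leaves one peak over, and a frame with two merged formants leaves a slot unfilled (`None`). Those are the two situations the abstract says are handed to the special routines.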

Keywords

Chirp; Frequency synthesizers; Helium; Laboratories; Spectral shape; Speech analysis; Speech processing; Speech recognition; Speech synthesis; Testing

fLanguage

English

Journal_Title

Acoustics, Speech and Signal Processing, IEEE Transactions on

Publisher

IEEE

ISSN

0096-3518

Type

jour

DOI

10.1109/TASSP.1974.1162559

Filename

1162559