Speaker-independent features extracted by a neural network

Author

Kato, Y. ; Sugiyama, M.

Author_Institution

ATR Interpreting Telephony Res. Lab., Soraku-gun, Kyoto, Japan

Volume

1

fYear

1993

fDate

27-30 April 1993

Firstpage

553

Abstract

The authors propose an algorithm using a neural network to normalize features that differ between speakers in speaker-independent speech recognition. The algorithm has three procedures: (1) initially training a neural network, (2) calculating the alignment function between the target signal and the network's output by dynamic time warping, and (3) incrementally training the network for extracting speaker-independent features. The neural network is a fuzzy partition model (FPM) with multiple input-output units to give a probabilistic formulation. The algorithm was evaluated in phrase recognition experiments using FPM-LR recognizers, in which the FPM was directly combined with an LR parser. The algorithm is compared with a conventional training algorithm in terms of recognition performance. The experimental results show that a neural network can be used as a new speaker-independent feature extractor.
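
Step (2) of the abstract relies on dynamic time warping (DTW) to align the target signal with the network's output. The following is a minimal sketch of textbook DTW, not the authors' implementation: the function name `dtw_align`, the Euclidean frame distance, and the unconstrained step pattern (match, insertion, deletion) are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import inf


def dtw_align(target, output):
    """Align two non-empty sequences of feature vectors with textbook DTW.

    Hypothetical helper: returns (total_cost, path), where path is a list of
    (target_index, output_index) pairs from the first frames to the last.
    """
    n, m = len(target), len(output)

    def dist(a, b):
        # Euclidean distance between two frames (an assumed local cost).
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # cost[i][j] = cheapest cumulative cost aligning target[:i] with output[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(target[i - 1], output[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the final cell to recover the alignment function.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


# Toy one-dimensional frames; the second sequence repeats its first frame.
target_frames = [[0.0], [1.0], [2.0]]
output_frames = [[0.0], [0.0], [1.0], [2.0]]
total_cost, alignment = dtw_align(target_frames, output_frames)
```

In a setting like the one the abstract describes, the resulting alignment path could be used to pair each network output frame with a target frame before further training; how the authors actually use the alignment is not stated in this record.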

Keywords

feature extraction; fuzzy logic; grammars; learning (artificial intelligence); neural nets; speech recognition; LR parser; algorithm; alignment function; dynamic time warping; fuzzy partition model; neural network; probabilistic formulation; recognition performance; speaker-independent feature extractor; speaker-independent speech recognition; training

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on

Conference_Location

Minneapolis, MN, USA

ISSN

1520-6149

Print_ISBN

0-7803-7402-9

Type

conf

DOI

10.1109/ICASSP.1993.319178

Filename

319178