Regional Information Center for Science and Technology - A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture

DocumentCode :

3428820

Title :

A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture

Author :

Variani, Ehsan ; McDermott, Erik ; Heigold, Georg

Author_Institution :

Johns Hopkins Univ., Baltimore, MD, USA

Year :

2015

Date :

19-24 April 2015

Firstpage :

4270

Lastpage :

4274

Abstract :

This article proposes and evaluates a Gaussian Mixture Model (GMM) represented as the last layer of a Deep Neural Network (DNN) architecture and jointly optimized with all previous layers using Asynchronous Stochastic Gradient Descent (ASGD). The resulting “Deep GMM” architecture was investigated with special attention to the following issues: (1) The extent to which joint optimization improves over separate optimization of the DNN-based feature extraction layers and the GMM layer; (2) The extent to which depth (measured in number of layers, for a matched total number of parameters) helps a deep generative model based on the GMM layer, compared to a vanilla DNN model; (3) Head-to-head performance of Deep GMM architectures vs. equivalent DNN architectures of comparable depth, using the same optimization criterion (frame-level Cross Entropy (CE)) and optimization method (ASGD); (4) Expanded possibilities for modeling offered by the Deep GMM generative model. The proposed Deep GMMs were found to yield Word Error Rates (WERs) competitive with state-of-the-art DNN systems, at the cost of pre-training using standard DNNs to initialize the Deep GMM feature extraction layers. An extension to Deep Subspace GMMs is described, resulting in additional gains.
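As a rough illustration of what a "GMM layer" on top of DNN-derived features might compute, the sketch below scores a feature vector against one small GMM per class and normalizes the scores into class posteriors, from which a frame-level cross-entropy loss follows. It is not the authors' implementation: the diagonal covariances, the explicit class priors, the normalization over classes, and all function names (`gmm_layer_log_posteriors`, `cross_entropy`, and so on) are assumptions made for this example. The feature vector stands in for the output of the preceding DNN layers, and no gradient computation or ASGD training is shown.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed form)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - mi) ** 2 / v
        for xi, mi, v in zip(x, mean, var)
    )


def gmm_layer_log_posteriors(features, classes, log_priors):
    """Hypothetical GMM output layer.

    features:   feature vector, standing in for the last hidden layer output
    classes:    one list per class of (weight, mean, variance) components
    log_priors: log prior of each class (an assumption of this sketch)
    Returns the log posterior of each class given the features.
    """
    scores = []
    for components, log_prior in zip(classes, log_priors):
        # Class log-likelihood: log of the weighted sum of component densities.
        log_lik = logsumexp(
            [math.log(w) + diag_gaussian_logpdf(features, m, v)
             for w, m, v in components]
        )
        scores.append(log_prior + log_lik)
    norm = logsumexp(scores)  # normalize over classes
    return [s - norm for s in scores]


def cross_entropy(log_posteriors, target):
    """Frame-level cross entropy for a single labeled frame."""
    return -log_posteriors[target]
```

Because the loss is a differentiable function of the features as well as of the GMM weights, means, and variances, a gradient-based optimizer could in principle update the feature extraction layers and the GMM parameters together, which is the kind of joint optimization the abstract refers to.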

Keywords :

Gaussian processes; feature extraction; gradient methods; mixture models; neural nets; Gaussian mixture model layer; asynchronous stochastic gradient descent; deep neural network architecture; discriminative features; feature extraction layers; optimization criterion; word error rates; Joints; Neural networks; Optimization; Speech; Speech recognition; Training; Deep neural networks; classification

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location :

South Brisbane, QLD

Type :

conf

DOI :

10.1109/ICASSP.2015.7178776

Filename :

7178776

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3428820