• DocumentCode
    1690121
  • Title
    Automatic localization of a language-independent sub-network on deep neural networks trained by multi-lingual speech
  • Author
    Matsuda, Shigeki ; Lu, Xugang ; Kashioka, Hideki
  • Author_Institution
    Spoken Language Communication Laboratory, National Institute of Information and Communications Technology, Kyoto, Japan
  • fYear
    2013
  • Firstpage
    7359
  • Lastpage
    7362
  • Abstract
    Deep neural networks (DNNs) have been successfully applied to automatic speech recognition (ASR). However, no study has investigated the possibility of building a language-independent sub-network of a DNN that can serve as the basis for training on any new language by simply plugging in the sub-network. In this paper, we propose a novel technique to split a DNN into language-independent and language-dependent sub-networks using multi-lingual speech training data. Our basic assumption is that, in a DNN for speech processing, language-independent feature processing is done in stages near the input layer, while language-dependent processing is performed in stages near the output layer. Based on this assumption, we propose a technique to simultaneously optimize multiple sub-networks in a DNN trained with multi-lingual speech data. The boundaries between language-dependent and language-independent processing in the individual sub-networks are determined automatically. We test our technique in phoneme classification experiments. The results demonstrate that a language-independent sub-network extracted by our technique can be used as a universal network for speech processing of additional new languages.
  • Keywords
    learning (artificial intelligence); neural nets; pattern classification; speech recognition; ASR; DNN; automatic localization; automatic speech recognition; deep neural network training; language-dependent subnetwork feature processing; language-independent subnetwork feature processing; multilingual speech training data; phoneme classification experiment; speech processing; Acoustics; Neural networks; Neurons; Speech; Speech processing; Speech recognition; Training; Deep Neural Network; Restricted Boltzmann Machine; Speech Recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639092
  • Filename
    6639092