• DocumentCode
    284580
  • Title

    Expanding the vocabulary of a connectionist recognizer trained on the DARPA Resource Management corpus

  • Author

    Lucke, H. ; Fallside, F.

  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    1
  • fYear
    1992
  • fDate
    23-26 Mar 1992
  • Firstpage
    605
  • Abstract
    It is shown how the compositional representation (CR), previously used for lexical access from sub-word recognizers with a relatively small vocabulary, can be extended to much larger vocabularies without further training. This is demonstrated on the DARPA Resource Management database where, using sub-word units as input, words are presented distributively over a fixed number of units and classified using a simple network. Initially, the architecture is trained on 147 words, achieving an accuracy of 91.2%. Then, leaving the recognizer unchanged, it is shown how additional output units can be added to the network to increase the vocabulary to the complete set of 975 phonetically distinct words. On this extended vocabulary the performance dropped to 66%, but this drop is smaller than would be expected from the increase in perplexity. Further improvement would be achieved by improving the performance on the original data set.
  • Keywords
    learning (artificial intelligence); neural nets; speech recognition equipment; vocabulary; DARPA Resource Management corpus; accuracy; compositional representation; connectionist recognizer; performance; perplexity; subword units; training; vocabulary expansion; Chromium; Data structures; Databases; Impedance; Neural networks; Recurrent neural networks; Resource management; Speech processing; Speech recognition; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-0532-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.1992.225836
  • Filename
    225836