• DocumentCode
    336736
  • Title
    Using a large vocabulary continuous speech recognizer for a constrained domain with limited training
  • Author
    Siu, Manhung ; Jonas, Michael ; Gish, Herbert
  • Author_Institution
    BBN Technol./GTE Internetworking, Cambridge, MA, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    105
  • Abstract
    How to train a speech recognizer with a limited amount of training data is of interest to many researchers. We describe how we use BBN's Byblos large vocabulary continuous speech recognition (LVCSR) system for the military air-traffic-control domain, where we have less than an hour of training data. We investigate three ways to deal with the limited training data: (1) reconfigure the LVCSR system to use fewer parameters, (2) incorporate out-of-domain data, and (3) use pragmatic information, such as speaker identity and controller function, to improve recognition performance. We compare the LVCSR performance to that of the tied-mixture recognizer that is designed for a limited vocabulary. We show that the reconfigured LVCSR system outperforms the tied-mixture system by 10% in absolute word error rate. When enough data is available per speaker, vocal tract length normalization and supervised adaptation techniques can further improve performance by 6%, even for this domain with limited training. We also show that the use of out-of-domain data and pragmatic information, if available, can each further improve performance by 1-3%.
  • Keywords
    air traffic control; learning (artificial intelligence); military communication; military computing; speech recognition; BBN; Byblos LVCSR system; absolute word error rate; constrained domain; controller function; large vocabulary continuous speech recognizer; limited training; military air-traffic-control domain; out-of-domain data; parameters; pragmatic information; recognition performance; speaker identity; supervised adaptation techniques; tied-mixture recognizer; training data; vocal tract length normalization; Air traffic control; Control systems; Engines; Error analysis; Interleaved codes; Loudspeakers; Speech recognition; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1999.758073
  • Filename
    758073