• DocumentCode
    381283
  • Title
    Automatic selection of transcribed training material
  • Author
    Kamm, Teresa M.; Meyer, Gerard G. L.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    417
  • Lastpage
    420
  • Abstract
    Conventional wisdom says that incorporating more training data is the surest way to reduce the error rate of a speech recognition system. This, in turn, guarantees that speech recognition systems are expensive to train, because of the high cost of annotating training data. We propose an iterative training algorithm that seeks to improve the error rate of a speech recognizer without incurring additional transcription cost, by selecting a subset of the already available transcribed training data. We apply the proposed algorithm to an alpha-digit recognition problem and reduce the error rate from 10.3% to 9.4% on a particular test set.
  • Keywords
    error statistics; iterative methods; learning (artificial intelligence); speech recognition; automatic training paradigm; error rate; iterative training algorithm; speech recognition; transcribed training material; Automatic speech recognition; Costs; Data mining; Error analysis; Iterative algorithms; Natural languages; Speech processing; Speech recognition; System testing; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
  • Print_ISBN
    0-7803-7343-X
  • Type
    conf
  • DOI
    10.1109/ASRU.2001.1034673
  • Filename
    1034673