• DocumentCode
    312123
  • Title

    Estimation of language models for new spoken language applications

  • Author

    Issar, Sunil

  • Author_Institution
    Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    869
  • Abstract
    Spoken language interfaces can provide natural communication for many database retrieval tasks. The CMU ATIS system provides an example of accessing airline information using spoken natural language queries. However, a lot of training data is needed to develop a spoken language application. For example, one needs training data to generate a language model that can be used by the recognizer to reduce the search space. The author addresses some issues arising from the small amount of training data available for a new spoken language application. The author is working on a spoken language interface to access information from a library catalogue. The catalogue contains around 13,000 titles, 6,000 authors and 19,000 subjects. There are more than 20,000 words in the dictionary. The user can seek information about books, authors, subjects, publishers, etc. For example, "I'd like to see books dealing with science fiction by Clarke." The author describes some language modelling experiments for this task. The author briefly describes a speech interface for a library catalogue. The author also reviews class-based language models and describes their limitations. Finally, the author presents an approach to building statistical language models for new spoken language applications. This is important because a lot of training data is normally needed to generate a language model. However, it is not practical to have or collect a large corpus of data for each new spoken language application.
  • Keywords
    library automation; natural language interfaces; query processing; speech recognition; CMU ATIS system; airline information access; class-based language models; database retrieval tasks; dictionary; language model estimation; library catalogue; natural communication; search space; speech recognizer; spoken language applications; spoken language interfaces; spoken natural language queries; statistical language models; training data; Application software; Books; Databases; Dictionaries; Libraries; Mutual funds; Natural languages; Speech; Telephony; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607739
  • Filename
    607739