• DocumentCode
    1625665
  • Title
    Active learning of EHVS parser for Persian language understanding
  • Author
    Tajgardoon, M.A. ; Jabbari, Fattaneh ; Sameti, Hossein ; Bahaadini, S.

  • Author_Institution
    Inf. Technol. Dept., Modiran Vehicle Manuf. (MVM), Tehran, Iran
  • fYear
    2012
  • Firstpage
    827
  • Lastpage
    832
  • Abstract
    One of the main elements of a spoken dialogue system is the Spoken Language Understanding (SLU) unit. Hidden Vector State (HVS) is a popular statistical method applied to the SLU component, and Extended Hidden Vector State (EHVS) is an enhanced version of HVS. Although both parsers need only abstract data annotation, labeling the data is still quite time-consuming and difficult. We therefore present a novel active learning method for the EHVS parser that reduces the human labeling effort. The active learner uses pattern classification to select informative data based on four different uncertainty measures. Experiments are carried out on a Persian dataset, the University Information Kiosk corpus. The experimental results show that the active EHVS improves performance by 15.46% with the entropy-probability uncertainty measure, which demonstrates the effectiveness and feasibility of the proposed approach.
  • Keywords
    entropy; grammars; interactive systems; learning (artificial intelligence); natural language processing; pattern classification; statistical analysis; EHVS parser; Persian dataset; Persian language understanding; SLU unit; University Information Kiosk corpus; abstract data annotation; active EHVS; active learning method; entropy-probability uncertainty measure; extended hidden vector state; informative data; pattern classification; spoken dialogue system; spoken language understanding; statistical methods; uncertainty measures; Entropy; Measurement uncertainty; Semantics; Support vector machines; Training; Uncertainty; Vectors; Active EHVS; EHVS; Spoken language understanding; Uncertainty measure;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Telecommunications (IST), 2012 Sixth International Symposium on
  • Conference_Location
    Tehran
  • Print_ISBN
    978-1-4673-2072-6
  • Type
    conf
  • DOI
    10.1109/ISTEL.2012.6483100
  • Filename
    6483100