• DocumentCode
    2700838
  • Title
    Automatic Detection of Sentence and Clause Units using Local Syntactic Dependency
  • Author
    Kawahara, Toshio ; Saikou, M. ; Takanashi, Koki
  • Author_Institution
    Acad. Center for Comput. & Media Studies, Kyoto Univ., Japan
  • Volume
    4
  • fYear
    2007
  • fDate
    15-20 April 2007
  • Abstract
    For robust detection of sentence and clause units in spontaneous speech such as lectures and meetings, we propose a novel cascaded chunking strategy which incorporates syntactic and semantic information. Application of general syntactic parsing is difficult for spontaneous speech having ill-formed sentences and disfluencies, especially for erroneous transcripts generated by ASR systems. Therefore, we focus on the local syntactic dependency of adjacent words and phrases, and train binary classifiers based on SVM (support vector machines) for this purpose. An experimental evaluation using spontaneous talks of the CSJ (Corpus of Spontaneous Japanese) demonstrates that the proposed dependency analysis can be robustly performed and is effective for clause/sentence unit detection in ASR outputs.
  • Keywords
    natural languages; speech recognition; support vector machines; Corpus of Spontaneous Japanese; SVM; cascaded chunking strategy; clause units; local syntactic dependency; semantic information; sentence automatic detection; sentence robust detection; Automatic speech recognition; Broadcast technology; Broadcasting; Humans; Performance analysis; Performance evaluation; Robustness; Speech analysis; Support vector machine classification; chunking; clause unit; dependency analysis; sentence unit; spontaneous speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
  • Conference_Location
    Honolulu, HI
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0727-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2007.367179
  • Filename
    4218053