• DocumentCode
    3125064
  • Title

    Automatic pitch accent detection using auto-context with acoustic features

  • Author

    Junhong Zhao ; Wei-Qiang Zhang ; Hua Yuan ; Jia Liu ; Shanhong Xia

  • Author_Institution
    State Key Lab. on Transducing Technol., Inst. of Electron., Beijing, China
  • fYear
    2012
  • fDate
    5-8 Dec. 2012
  • Firstpage
    247
  • Lastpage
    251
  • Abstract
    In the field of prosody event detection, many local acoustic features have been proposed to represent the prosodic characteristics of a speech unit. However, context information, which captures possible regularities among neighboring prosody events, has not been used effectively. The main difficulty in exploiting prosodic context is capturing long-distance sequential dependencies. To address this problem, we introduce a new learning approach: auto-context. In this algorithm, a classifier is first trained on local acoustic features; the discriminative probabilities it produces are selected as context information for the next iteration. A new classifier is then trained on the selected context information together with the local acoustic features. By repeatedly using the updated probabilities as context information for the next iteration, the algorithm improves its recognition ability iteratively until convergence. The merit of this method is that it selects context information flexibly, retaining reliable context information and discarding unreliable information. Experimental results show that the proposed method improves pitch accent detection accuracy by about 1% absolute.
  • Keywords
    acoustic signal detection; iterative methods; learning (artificial intelligence); probability; speech recognition; abandoning unreliable ones; auto-context; automatic pitch accent detection; boost recognition ability; discriminative probability; iterative process; learning approach; local acoustic features; long-distance sequential dependency; neighboring prosody events; prosodic context; prosody characteristics; prosody event detection field; reliable context information; speech unit; Accuracy; Acoustics; Context; Feature extraction; Labeling; Speech; Training; Pitch accent detection; acoustic; auto-context; prosody; support vector machines (SVMs);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
  • Conference_Location
    Kowloon
  • Print_ISBN
    978-1-4673-2506-6
  • Electronic_ISBN
    978-1-4673-2505-9
  • Type

    conf

  • DOI
    10.1109/ISCSLP.2012.6423523
  • Filename
    6423523
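
The iterative procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single sequence of speech units with binary accent labels, swaps in a small hand-written logistic regression for the SVM named in the keywords, and uses a fixed set of neighbor offsets in place of the paper's context selection step. All function names, the `offsets` parameter, and the padding value 0.5 are hypothetical choices made for illustration.

```python
import numpy as np


def _add_bias(X):
    # Append a constant column so the model learns an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def train_logreg(X, y, lr=0.1, epochs=200):
    # Tiny logistic regression fitted by gradient descent.
    # Assumption: a stand-in for the paper's classifier, not the original SVM.
    Xb = _add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)
    return w


def predict_proba(w, X):
    # Probability of the positive (accented) class for every unit.
    return 1.0 / (1.0 + np.exp(-(_add_bias(X) @ w)))


def context_features(probs, offsets):
    # For unit i, collect the previous-iteration probability of unit i + offset.
    # Positions outside the sequence are padded with 0.5 (an assumption).
    n = len(probs)
    cols = []
    for off in offsets:
        col = np.full(n, 0.5)
        idx = np.arange(n) + off
        valid = (idx >= 0) & (idx < n)
        col[valid] = probs[idx[valid]]
        cols.append(col)
    return np.column_stack(cols)


def auto_context(X, y, offsets=(-2, -1, 1, 2), n_iters=3):
    # Iteration 0: classifier trained on local acoustic features only.
    w = train_logreg(X, y)
    probs = predict_proba(w, X)
    models = [w]
    # Later iterations: local features plus neighbors' probabilities
    # produced by the previous classifier.
    for _ in range(n_iters):
        X_aug = np.hstack([X, context_features(probs, offsets)])
        w = train_logreg(X_aug, y)
        probs = predict_proba(w, X_aug)
        models.append(w)
    return models, probs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))           # synthetic "acoustic" features
    y = (X[:, 0] + 0.5 * np.roll(X[:, 0], 1) > 0).astype(float)
    models, probs = auto_context(X, y)
    print(len(models), probs.shape)
```

The sketch runs a fixed number of iterations rather than testing for convergence, and it reuses training-set probabilities as context. A practical system would need held-out predictions and a stopping criterion, neither of which the abstract specifies.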