• DocumentCode
    1756384
  • Title

    Quantifying Spatiotemporal Properties of Vocal Fold Dynamics Based on a Multiscale Analysis of Phonovibrograms

  • Author

    Unger, Jonas ; Hecker, D.J. ; Kunduk, Melda ; Schuster, Martin ; Schick, B. ; Lohscheller, J.

  • Author_Institution
    Dept. of Comput. Sci., Trier Univ. of Appl. Sci., Trier, Germany
  • Volume
    61
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    2422
  • Lastpage
    2433
  • Abstract
    In order to objectively assess the laryngeal vibratory behavior, endoscopic high-speed cameras capture several thousand frames per second of the vocal folds during phonation. However, judging all inherent clinically relevant features is a challenging task and requires well-founded expert knowledge. In this study, an automated wavelet-based analysis of laryngeal high-speed videos based on phonovibrograms is presented. The phonovibrogram is an image representation of the spatiotemporal pattern of vocal fold vibration and constitutes the basis for a computer-based analysis of laryngeal dynamics. The features extracted from the wavelet transform are shown to be closely related to a basic set of video-based measurements categorized by the European Laryngological Society for a subjective assessment of pathologic voices. The wavelet-based analysis further offers information about irregularity and lateral asymmetry and asynchrony. It is demonstrated in healthy and pathologic subjects as well as for a surgical group that was examined before and after the removal of a vocal fold polyp. The features were found to not only classify glottal closure characteristics but also quantify the impact of pathologies on the vibratory behavior. The interpretability and the discriminative power of the proposed feature set show promising relevance for a computer-assisted diagnosis and classification of voice disorders.
  • Keywords
    endoscopes; feature extraction; image representation; medical disorders; medical image processing; spatiotemporal phenomena; speech; vibrations; video cameras; wavelet transforms; European Laryngological Society; asynchrony; automated wavelet-based analysis; computer-based analysis; endoscopic high-speed cameras; feature extraction; glottal closure characteristics; image representation; irregularity; laryngeal high-speed videos; laryngeal vibratory behavior; lateral asymmetry; multiscale analysis; pathologic voices; phonation; phonovibrograms; spatiotemporal properties; vocal fold dynamics; vocal fold polyp; voice disorders; wavelet transform; Correlation; Eigenvalues and eigenfunctions; Feature extraction; Vibrations; Videos; Wavelet analysis; Wavelet transforms; Computer-aided analysis; high speed videoendoscopy (HSV); multiscale product (MSP); vocal fold dynamics; wavelet analysis;
  • fLanguage
    English
  • Journal_Title
    Biomedical Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9294
  • Type
    jour
  • DOI
    10.1109/TBME.2014.2318774
  • Filename
    6804672