• DocumentCode
    1660993
  • Title

    Characterization of infant cries using spectral and prosodic features

  • Author

    Vempada, Ramu Reddy ; Kumar, B. Siva Ayyappa ; Rao, K. Sreenivasa

  • Author_Institution
    Sch. of Inf. Technol., Indian Inst. of Technol. Kharagpur, Kharagpur, India
  • fYear
    2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In this paper, spectral and prosodic features are explored for the recognition of infant cries. The types of infant cry considered in this work are wet-diaper, hunger and pain. Mel-frequency cepstral coefficients (MFCC) are used to represent the spectral information, and short-time frame energies (STE) and pause duration are used to represent the prosodic information. Support Vector Machines (SVM) are used to capture the information in the spectral and prosodic features that discriminates among the above-mentioned cries. SVM models are developed separately using spectral and prosodic features. For carrying out these studies, the infant cry database collected under the Telemedicine project at IIT-KGP has been used. The recognition performance of the developed SVM models using spectral and prosodic features is observed to be 61.11% and 57.41%, respectively. We also examined the recognition performance obtained by combining the spectral and prosodic information at the feature and score levels. The recognition performance using feature-level and score-level fusion is observed to be 74.07% and 80.56%, respectively.
  • Keywords
    speech recognition; support vector machines; IIT-KGP; MFCC; STE; SVM models; Telemedicine project; hunger cry; infant cry characterization; infant cry database; infant cry recognition; mel-frequency cepstral coefficients; pain cry; pause duration; prosodic features; prosodic information; short-time frame energies; spectral features; spectral information; wet-diaper cry; Mel frequency cepstral coefficient; Pain; Pediatrics; Speech; Vectors; Support Vector Machine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (NCC), 2012 National Conference on
  • Conference_Location
    Kharagpur
  • Print_ISBN
    978-1-4673-0815-1
  • Type

    conf

  • DOI
    10.1109/NCC.2012.6176851
  • Filename
    6176851