• DocumentCode
    1950547
  • Title
    Bag-of-words representation for non-intrusive speech quality assessment
  • Author
    Qiaohong Li ; Weisi Lin ; Yuming Fang ; Daniel Thalmann
  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2015
  • fDate
    12-15 July 2015
  • Firstpage
    616
  • Lastpage
    619
  • Abstract
    Research on non-intrusive speech quality assessment (SQA) aims to develop a computational model that simulates the human perception of speech signals accurately and automatically, without any prior information about the clean reference speech signals. In this paper, we propose to learn a non-intrusive SQA metric based on a bag-of-words (BoW) representation of speech signals. In particular, the proposed method treats the whole speech utterance as a text document and extracts perceptual linear prediction (PLP) features of local segments as words. The speech utterance is then represented as a histogram of codewords, with each entry giving the probability that a codeword appears in the utterance. After the BoW representation of the speech signal is obtained, support vector regression (SVR) is used to learn the metric for quality evaluation. Experimental results demonstrate that the proposed non-intrusive SQA metric, BoW, achieves better performance than relevant state-of-the-art metrics.
  • Keywords
    regression analysis; speech processing; support vector machines; text analysis; BoW representation; PLP feature; SQA; SVR; bag-of-words representation; computational model; human perception; nonintrusive speech quality assessment; perceptual linear prediction feature; quality evaluation; reference clean speech signal; speech utterance; support vector regression; text document; Databases; Feature extraction; Histograms; Measurement; Quality assessment; Speech; Speech coding; bag of words; codebook construction; non-intrusive quality assessment; speech quality; support vector regression;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
  • Conference_Location
    Chengdu
  • Type
    conf
  • DOI
    10.1109/ChinaSIP.2015.7230477
  • Filename
    7230477
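
The histogram step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes hard nearest-codeword assignment with Euclidean distance, uses toy 2-D vectors in place of real PLP features, and takes the codebook as given. The function name `bow_histogram` and all data values are hypothetical. PLP extraction, codebook construction, and the SVR stage are outside the scope of the sketch.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def bow_histogram(frames, codebook):
    """Return the normalized codeword histogram for one utterance.

    frames   -- list of per-segment feature vectors (stand-ins for PLP features)
    codebook -- list of codeword vectors (assumed to be learned beforehand)
    """
    if not frames:
        raise ValueError("utterance has no frames")
    counts = [0] * len(codebook)
    for frame in frames:
        # Hard assignment (an assumption): each frame votes for its nearest codeword.
        nearest = min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))
        counts[nearest] += 1
    total = len(frames)
    # Each entry is the fraction of frames assigned to that codeword.
    return [c / total for c in counts]


# Toy data: 2-D vectors instead of real PLP features (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
hist = bow_histogram(frames, codebook)
print(hist)
```

The resulting vector has one entry per codeword and sums to 1, so utterances of different lengths map to representations of the same fixed size, which is what allows a regressor such as SVR to be trained on them.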