• DocumentCode
    1425224
  • Title

    A Low-Complexity Spectro-Temporal Distortion Measure for Audio Processing Applications

  • Author

    Taal, Cees H. ; Hendriks, Richard C. ; Heusdens, Richard

  • Author_Institution
    Sound and Image Processing Lab, Royal Institute of Technology (KTH), Stockholm, Sweden
  • Volume
    20
  • Issue
    5
  • fYear
    2012
  • fDate
    7/1/2012 12:00:00 AM
  • Firstpage
    1553
  • Lastpage
    1564
  • Abstract
    Perceptual models exploiting auditory masking are frequently used in audio and speech processing applications such as coding and watermarking. In most cases, these models take into account only spectral masking in short-time frames. As a consequence, undesired audible artifacts may be introduced in the temporal domain (e.g., pre-echoes). In this article, we present a new low-complexity spectro-temporal distortion measure. The model facilitates the computation of analytic expressions for masking thresholds, whereas advanced spectro-temporal models typically need computationally demanding adaptive procedures to estimate these thresholds. We show that the proposed method gives masking predictions similar to those of an advanced spectro-temporal model at only a fraction of its computational cost. The proposed method is also compared with a spectral-only model by means of a listening test. From this test it can be concluded that, for non-stationary frames, the spectral model underestimates the audibility of introduced errors and therefore overestimates the masking curve. As a consequence, the system of interest incorrectly assumes that errors are masked in a particular frame, which leads to audible artifacts. This is not the case with the proposed method, which correctly detects the errors made in the temporal structure of the signal.
  • Keywords
    audio coding; audio watermarking; error analysis; speech coding; advanced spectrotemporal model; analytic expression; audible artifacts; audio processing application; auditory masking; computational power; computationally demanding adaptive procedure; error audibility; error detection; low complexity spectrotemporal distortion measure; masking curve; nonstationary frame; signal temporal structure; spectral masking; speech processing; speech watermarking; temporal domain; Approximation methods; Computational modeling; Distortion measurement; Masking threshold; Mathematical model; Noise; Audio coding; auditory modeling; perceptual model
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Audio, Speech, and Language Processing
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2012.2184753
  • Filename
    6133329