Title :
Auditory distortion measure for speech coding
Author :
Wang, Shihua ; Sekey, Andrew ; Gersho, Allen
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., Santa Barbara, CA, USA
Abstract :
A novel perceptually motivated objective measure for estimating the subjective quality of coded speech is presented. It takes into account auditory frequency warping (Bark transformation), critical-band integration, amplitude sensitivity variations with frequency, and conversion from loudness level to loudness. For each 10 ms segment of an utterance, a weighted spectral vector is computed via 15 critical-band filters. The overall distortion, called Bark spectral distortion (BSD), is the average squared Euclidean distance between the spectral vectors of the original and coded utterances. In tests with speech distorted by a modulated noise reference unit or coded at rates of 2.4-64 kb/s, the measure predicted mean opinion score (MOS) ratings notably better than segmental SNR did. The standard error in estimating MOS with the new measure was 0.2-0.3.
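The processing chain named in the abstract can be sketched roughly as follows. This is an illustrative approximation, not the authors' implementation. The abstract gives only the 10 ms segment length, the 15 bands, and the averaged squared Euclidean distance. Everything else here is an assumption: the 8 kHz sampling rate, the Hann window, the Traunmuller Hz-to-Bark formula, rectangular bands equally spaced in Bark, and a 0.3 power law standing in for the loudness-level-to-loudness conversion. The frequency-dependent amplitude sensitivity weighting is reduced to an optional, hypothetical `band_weights` argument.

```python
import numpy as np


def hz_to_bark(f_hz):
    # Traunmuller's approximation of the Bark scale (assumed here;
    # the abstract does not state which warping formula is used).
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_loudness_vectors(signal, fs=8000, frame_ms=10, n_bands=15,
                          band_weights=None):
    """Return one n_bands-dimensional vector per 10 ms segment."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(fs * frame_ms / 1000)
    n_frames = signal.size // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)

    # Short-time power spectrum of each segment.
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2

    # Critical-band integration: sum FFT bins falling in bands that are
    # equally wide on the Bark axis (rectangular bands are an assumption).
    bark = hz_to_bark(np.fft.rfftfreq(frame_len, d=1.0 / fs))
    edges = np.linspace(bark[0], bark[-1], n_bands + 1)
    band_idx = np.clip(np.digitize(bark, edges) - 1, 0, n_bands - 1)
    bands = np.zeros((n_frames, n_bands))
    for b in range(n_bands):
        bands[:, b] = power[:, band_idx == b].sum(axis=1)

    # Hypothetical stand-in for frequency-dependent amplitude sensitivity.
    if band_weights is not None:
        bands = bands * np.asarray(band_weights, dtype=float)

    # Loudness doubles per 10 dB, i.e. roughly power ** 0.3 (assumed form).
    return bands ** 0.3


def bark_spectral_distortion(original, coded, **kwargs):
    """Average squared Euclidean distance between per-segment vectors."""
    a = bark_loudness_vectors(original, **kwargs)
    b = bark_loudness_vectors(coded, **kwargs)
    n = min(len(a), len(b))
    return float(np.mean(np.sum((a[:n] - b[:n]) ** 2, axis=1)))
```

With these assumptions, identical inputs give a distortion of zero, and the value grows as the coded signal departs from the original. Mapping the result to a MOS estimate would require fitting against listening-test data, which is not shown here.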
Keywords :
encoding; hearing; speech analysis and processing; speech intelligibility; 10 ms; 2.4 to 64 kbit/s; Bark spectral distortion; Bark transformation; amplitude sensitivity variations; auditory distortion measure; auditory frequency warping; average squared Euclidean distance; bearing; coded utterance; critical band filters; critical-band integration; loudness level; measure predicted mean opinion score; modulated noise reference unit; speech coding; speech intelligibility; standard error; subjective quality; weighted spectral vector; Distortion measurement; Euclidean distance; Filters; Frequency conversion; Modulation coding; Noise measurement; Signal to noise ratio; Speech coding; Speech enhancement; Testing;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on
Conference_Location :
Toronto, Ont.
Print_ISBN :
0-7803-0003-3
DOI :
10.1109/ICASSP.1991.150384