A dynamic programming approach to audio segmentation and speech/music discrimination

Author

Goodwin, Michael M. ; Laroche, Jean

Author_Institution

Creative Advanced Technology Center, Scotts Valley, CA, USA

Volume

4

fYear

2004

fDate

17-21 May 2004

Abstract

We consider the problem of segmenting an audio signal into characteristic regions based on feature-set similarities. In the proposed approach, a feature-space representation of the signal is generated; sequences of these feature-space samples are then aggregated into clusters corresponding to distinct signal regions. The algorithm consists of using linear discriminant analysis (LDA) to condition the feature space and dynamic programming (DP) to identify data clusters. We consider the design of the dynamic program cost functions; we are able to derive effective cost functions without relying on significant prior information about the structure of the expected data clusters. We demonstrate the application of the LDA-DP segmentation algorithm to speech/music discrimination. Experimental results are given and discussed.
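
As a rough illustration of the dynamic-programming step described in the abstract, the following Python sketch partitions a sequence of feature vectors into a fixed number of contiguous segments. The cost used here (within-segment squared distance to the segment mean) is an assumption chosen for simplicity and is not the cost function derived in the paper. The LDA conditioning step is omitted, the number of segments is assumed known, and the names `segment_cost_table` and `dp_segment` are hypothetical.

```python
def segment_cost_table(features):
    """Build prefix sums and return a cost(start, end) function.

    Assumed cost: sum of squared distances from each frame in
    features[start:end] to the mean of that span.
    """
    dim = len(features[0])
    prefix = [[0.0] * dim]   # running per-dimension sums
    prefix_sq = [0.0]        # running sum of squared norms
    for frame in features:
        prefix.append([p + x for p, x in zip(prefix[-1], frame)])
        prefix_sq.append(prefix_sq[-1] + sum(x * x for x in frame))

    def cost(start, end):
        n = end - start
        sums = [b - a for a, b in zip(prefix[start], prefix[end])]
        return (prefix_sq[end] - prefix_sq[start]) - sum(s * s for s in sums) / n

    return cost


def dp_segment(features, num_segments):
    """Return the start index of each segment in the minimum-cost partition."""
    n = len(features)
    cost = segment_cost_table(features)
    inf = float("inf")
    # best[k][t]: minimum cost of splitting the first t frames into k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                candidate = best[k - 1][s] + cost(s, t)
                if candidate < best[k][t]:
                    best[k][t] = candidate
                    back[k][t] = s
    # Follow the back-pointers from the end of the signal to the start.
    starts = []
    t = n
    for k in range(num_segments, 0, -1):
        t = back[k][t]
        starts.append(t)
    return starts[::-1]


# Toy feature sequence: three frames near (0, 0) followed by four near (5, 5).
frames = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
          [5.0, 5.1], [5.1, 5.0], [5.0, 4.9], [5.1, 5.1]]
boundaries = dp_segment(frames, 2)
```

In this toy case the recovered segment starts fall at frames 0 and 3, where the feature values jump. The triple loop is cubic in the worst case, so a practical system would likely restrict the candidate boundaries or the maximum segment length.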

Keywords

audio signal processing; dynamic programming; music; speech; speech processing; audio segmentation; audio signal segmentation; data clusters; dynamic program cost functions; feature-space representation; linear discriminant analysis; signal representation; speech/music discrimination; Clustering algorithms; Cost function; Covariance matrix; Fingerprint recognition; Multiple signal classification; Robustness; Signal generators

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '04)

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1326825

Filename

1326825