DocumentCode :
1414298
Title :
Sound Indexing Using Morphological Description
Author :
Peeters, Geoffroy ; Deruty, Emmanuel
Author_Institution :
Sound Anal./Synthesis Team, IRCAM, Paris, France
Volume :
18
Issue :
3
fYear :
2010
fDate :
3/1/2010 12:00:00 AM
Firstpage :
675
Lastpage :
687
Abstract :
Sound sample indexing usually deals with recognizing the source or cause that produced the sound. For abstract sounds, sound effects, and unnatural or synthetic sounds, this cause is usually unknown or unrecognizable. An efficient description of these sounds has been proposed by Schaeffer under the name morphological description. Part of this description consists of describing a sound by matching the temporal evolution of its acoustic properties to a set of profiles. In this paper, we consider three morphological descriptions: dynamic profiles (ascending, descending, ascending/descending, stable, impulsive), melodic profiles (up, down, stable, up/down, down/up), and complex-iterative sound description (non-iterative, iterative, grain, repetition). We study the automatic indexing of a sound into these profiles. Because this automatic indexing is difficult using standard audio features, we propose new audio features to perform this task. The dynamic profiles are estimated by modeling the loudness of a sound over time with a second-order B-spline model and deriving features from this model. The melodic profiles are estimated by tracking over time the perceptual filter that has the maximum excitation. A function is derived from this track and is then modeled using a second-order B-spline model. The features are again derived from the B-spline model. The description of complex-iterative sounds is obtained by estimating the amount of repetition and the period of the repetition. These are obtained by computing an audio similarity function derived from a Mel-frequency cepstral coefficient (MFCC) similarity matrix. The proposed audio features are then tested for automatic classification. We consider three classification tasks corresponding to the three profiles. In each case, the results are compared with those obtained using standard audio features.
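The abstract does not give the exact procedure for the complex-iterative description, so the following is only a minimal sketch of one way the idea could be illustrated. It assumes cosine similarity between frames, averages each diagonal of the similarity matrix to obtain a similarity-versus-lag function, and takes the highest non-zero lag as the repetition period. The function names, the toy one-hot frames standing in for MFCC vectors, and the peak-picking rule are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def similarity_matrix(frames):
    """Frame-by-frame similarity matrix; frames stand in for MFCC vectors (assumption)."""
    n = len(frames)
    return [[cosine(frames[i], frames[j]) for j in range(n)] for i in range(n)]


def estimate_repetition(frames, max_lag=None):
    """Return (period_in_frames, amount_of_repetition).

    Hypothetical helper: the similarity-versus-lag function is the mean of each
    diagonal of the similarity matrix, and the period is the lag >= 1 with the
    highest mean similarity (the smallest such lag in case of ties).
    """
    n = len(frames)
    if max_lag is None:
        max_lag = n // 2
    sim = similarity_matrix(frames)
    curve = [
        sum(sim[i][i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag + 1)
    ]
    period = max(range(1, max_lag + 1), key=lambda lag: curve[lag])
    return period, curve[period]


# Toy input: four distinct "frames" repeated six times, i.e. a period of 4 frames.
pattern = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
iterative_frames = pattern * 6
print(estimate_repetition(iterative_frames))
```

On real audio, the frames would be MFCC vectors computed by an audio library, and the amount of repetition would fall somewhere between the non-iterative and strictly periodic extremes rather than at exactly 0 or 1.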
Keywords :
audio signal processing; splines (mathematics); complex-iterative sound description; dynamic profiles; frequency cepstral coefficients; melodic profiles; morphological description; second-order B-spline model; sound indexing; Automatic testing; Filters; Instruments; Machine assisted indexing; Mel frequency cepstral coefficient; Search engines; Spline; Timbre; Trademarks; Usability; Audio features; audio similarity; automatic indexing; loudness; sound description;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2038809
Filename :
5410058