Title of article :
Modulation-Scale Analysis for Content Identification.
Author/Authors :
S. Sukittanon (author), L. E. Atlas (author), and J. W. Pitton (author)
Issue Information :
Journal, serial issue number 2, year 2004
Abstract :
For nonstationary signal classification, e.g., speech or
music, features are traditionally extracted from a time-shifted, yet
short data window. For many applications, these short-term features
do not efficiently capture or represent longer-term signal
variation. Partially motivated by human audition, we overcome the
deficiencies of short-term features by employing modulation-scale
analysis for long-term feature analysis. Our analysis, which uses
time-frequency theory integrated with psychoacoustic results on
modulation frequency perception, not only contains short-term information
about the signals, but also provides long-term information
representing patterns of time variation. This paper describes
these features and their normalization. We demonstrate the effectiveness
of our long-term features over conventional short-term
features in content-based audio identification. A simulated study
using a large data set, including nearly 10 000 songs and requiring
over a billion pairwise audio comparisons, shows that modulation-scale
features improve content identification accuracy substantially,
especially when time and frequency distortions are imposed.
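
As a rough illustration of the two-stage idea in the abstract (short-term spectral analysis, followed by a second analysis of how each subband varies over time), the sketch below computes a joint acoustic/modulation frequency magnitude. It is a simplified stand-in and not the authors' method: it uses a plain Fourier transform along time in place of the modulation-scale analysis, omits the psychoacoustic grounding and the feature normalization the paper describes, and the function name `modulation_spectrum` and its parameters `frame_len` and `hop` are assumptions made for illustration.

```python
import numpy as np


def modulation_spectrum(x, frame_len=256, hop=64):
    """Joint acoustic/modulation frequency magnitude (simplified sketch).

    Assumption: a plain FFT along time replaces the modulation-scale
    analysis of the paper, and no feature normalization is applied.
    Returns an array of shape (n_frames // 2 + 1, frame_len // 2 + 1),
    indexed as [modulation bin, acoustic bin].
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop

    # Stage 1: short-term features, one magnitude spectrum per shifted window.
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.abs(np.fft.rfft(frames, axis=1))

    # Stage 2: long-term features, a transform along time of each subband
    # envelope. The per-band mean is removed so the constant (0 Hz)
    # component does not dominate.
    env = spec - spec.mean(axis=0, keepdims=True)
    return np.abs(np.fft.rfft(env, axis=0))


if __name__ == "__main__":
    fs = 8000                                  # sample rate in Hz (assumed)
    t = np.arange(2 * fs) / fs                 # two seconds of samples
    # A 1 kHz tone whose amplitude varies at 4 Hz.
    x = (1 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

    mod = modulation_spectrum(x)
    m, k = np.unravel_index(np.argmax(mod), mod.shape)
    frame_rate = fs / 64                       # frames per second at hop=64
    n_frames = 1 + (len(x) - 256) // 64
    print("acoustic frequency (Hz):", k * fs / 256)
    print("modulation frequency (Hz):", m * frame_rate / n_frames)
```

For the amplitude-modulated tone in the usage example, the strongest entry should fall near the carrier frequency on the acoustic axis and near the 4 Hz envelope rate on the modulation axis, which is the kind of long-term time-variation pattern a single short window cannot represent.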
Keywords :
Audio fingerprinting, audio identification, audio retrieval, auditory classification, content identification, feature extraction, long-term features, modulation features, short-term features, 2-D features, feature normalization, modulation spectrum, pattern recognition, modulation scale
Journal title :
IEEE TRANSACTIONS ON SIGNAL PROCESSING