DocumentCode
792304
Title
Musical genre classification of audio signals
Author
Tzanetakis, George ; Cook, Perry
Author_Institution
Dept. of Comput. Sci., Princeton Univ., NJ, USA
Volume
10
Issue
5
fYear
2002
fDate
7/1/2002 12:00:00 AM
Firstpage
293
Lastpage
302
Abstract
Musical genres are categorical labels created by humans to characterize pieces of music. A musical genre is characterized by the common characteristics shared by its members. These characteristics are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. Genre hierarchies are commonly used to structure the large collections of music available on the Web. Currently, musical genre annotation is performed manually. Automatic musical genre classification can assist or replace the human user in this process and would be a valuable addition to music information retrieval systems. In addition, automatic musical genre classification provides a framework for developing and evaluating features for any type of content-based analysis of musical signals. In this paper, the automatic classification of audio signals into a hierarchy of musical genres is explored. More specifically, three feature sets for representing timbral texture, rhythmic content, and pitch content are proposed. The performance and relative importance of the proposed features are investigated by training statistical pattern recognition classifiers using real-world audio collections. Both whole-file and real-time frame-based classification schemes are described. Using the proposed feature sets, a classification accuracy of 61% for ten musical genres is achieved. This result is comparable to results reported for human musical genre classification.
Keywords
audio signal processing; feature extraction; information retrieval; music; pattern recognition; signal classification; statistical analysis; World Wide Web; audio signals; automatic musical genre classification; content-based analysis; feature sets; genre hierarchies; harmonic content; human musical genre classification; instrumentation; music information retrieval systems; musical genre annotation; musical signals; pitch content; real-time frame-based classification; rhythmic content; rhythmic structure; statistical pattern recognition classifiers training; timbral texture; whole file classification; Computer science; Cultural differences; Feature extraction; Humans; Instruments; Multiple signal classification; Music information retrieval; Pattern recognition; Signal analysis; Wavelet analysis;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
ieee
ISSN
1063-6676
Type
jour
DOI
10.1109/TSA.2002.800560
Filename
1021072