DocumentCode :
1756649
Title :
Combining Spectral and Temporal Representations for Multipitch Estimation of Polyphonic Music
Author :
Li Su ; Yi-Hsuan Yang
Author_Institution :
Research Center for Information Technology Innovation, Taipei, Taiwan
Volume :
23
Issue :
10
fYear :
2015
fDate :
Oct. 2015
Firstpage :
1600
Lastpage :
1612
Abstract :
Due to the difficulty of creating pitch-labeled training data that cover the rich diversity found in music signals, unsupervised feature-based approaches derived from signal processing and feature design remain critical for multipitch estimation (MPE) of polyphonic music. While a large number of feature representations have been proposed in the literature, an effective means of combining different domains of features for MPE is still needed. In this paper, we propose a novel approach, referred to as combined frequency and periodicity (CFP), that detects pitches according to the agreement of a harmonic series in the frequency domain and a subharmonic series in the lag (quefrency) domain. This approach nicely aggregates the complementary advantages of the two feature domains in different frequency ranges, and improves the robustness of the pitch detection function to the interference of the overtones of simultaneous pitches. We report a comprehensive evaluation that compares CFP against three state-of-the-art approaches using three MPE datasets and four symphonies. The evaluation is characteristic of the coverage and complexity of music (in terms of instrument type and degree of polyphony). In addition, we also evaluate the performance of the MPE approaches when a number of audio degradations are applied. Results show that the proposed unsupervised method performs consistently well across the types of Western polyphonic music considered, and is robust to audio degradations such as high-pass filtering and MP3 compression.
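The agreement idea in the abstract can be illustrated with a minimal sketch. This is not the authors' CFP implementation. It assumes a single audio frame and substitutes a plain autocorrelation for the generalized cepstrum named in the keywords. The function name `toy_cfp_salience` and its parameters are hypothetical. A candidate pitch scores high only when the spectrum has energy at f0 and the lag-domain feature has energy at lag 1/f0.

```python
import numpy as np

def toy_cfp_salience(frame, fs, f0_grid):
    """Toy pitch salience: product of a frequency-domain feature at f0
    and a lag-domain feature at lag 1/f0 (assumption: autocorrelation
    stands in for the generalized cepstrum)."""
    n = len(frame)
    windowed = frame * np.hanning(n)
    n_fft = 2 * n  # zero-pad so the autocorrelation is linear, not circular

    # Frequency-domain feature: magnitude spectrum.
    spectrum = np.abs(np.fft.rfft(windowed, n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    # Lag-domain feature: autocorrelation via the inverse FFT of the
    # power spectrum, half-wave rectified to keep only positive evidence.
    acf = np.fft.irfft(spectrum ** 2, n_fft)[:n]
    acf = np.maximum(acf, 0.0)
    lags = np.arange(n) / fs  # lag axis in seconds

    # Read both features at each candidate pitch and multiply them.
    spec_at_f0 = np.interp(f0_grid, freqs, spectrum)
    acf_at_f0 = np.interp(1.0 / f0_grid, lags, acf)
    return spec_at_f0 * acf_at_f0
```

In this toy version, overtones of a note appear as peaks in the spectrum but not at the corresponding short lags, while subharmonics appear in the lag domain but not in the spectrum. The product therefore suppresses both kinds of spurious candidate, which is the complementary behavior the abstract attributes to combining the two domains.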
Keywords :
music; signal processing; MP3 compression; audio degradations; combined frequency and periodicity; frequency domain; high-pass filtering; multipitch estimation; pitch detection function; pitch-labeled training data; polyphonic music; subharmonic series; Frequency-domain analysis; IEEE transactions; Instruments; Multiple signal classification; Robustness; Speech; Speech processing; Automatic music transcription; generalized cepstrum; unsupervised approach
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2015.2442411
Filename :
7118691