Title :
Piano music transcription modeling note temporal evolution
Author :
Cogliati, Andrea ; Duan, Zhiyao
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Rochester, Rochester, NY, USA
Abstract :
Automatic music transcription (AMT) is the process of converting an acoustic musical signal into a symbolic musical representation such as a MIDI piano roll, which contains the pitches, onsets, and offsets of the notes and, possibly, their dynamics and source (i.e., instrument). Existing AMT algorithms commonly identify pitches and their saliences in each frame and then form notes in a post-processing stage that applies a combination of thresholding, pruning, and smoothing operations. Very few existing methods consider the temporal evolution of notes over multiple frames during the pitch identification stage. In this work, we propose a note-based spectrogram factorization method that uses the entire temporal evolution of piano notes as a template dictionary. The method first uses an artificial neural network to detect note onsets from the audio spectral flux. Next, it estimates the notes present in each audio segment between two successive onsets with a greedy search algorithm. Finally, the spectrogram of each segment is factorized using a discrete combination of note templates consisting of full spectrograms of individual piano notes sampled at different dynamic levels. We also propose a new psychoacoustically informed measure for spectrogram similarity.
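The abstract names audio spectral flux as the input to the onset detector. As a rough illustration only, the sketch below computes one common form of spectral flux (the half-wave rectified frame-to-frame increase in magnitude spectrum) with NumPy. The function name spectral_flux, the Hann window, and the frame_len and hop values are assumptions made for this example and are not taken from the paper; the neural network that the authors apply to the flux is not reproduced here.

```python
import numpy as np


def spectral_flux(signal, frame_len=1024, hop=512):
    """Half-wave rectified spectral flux (illustrative, not the paper's code).

    Returns one value per transition between consecutive frames, so the
    result has n_frames - 1 entries. frame_len and hop are assumed values.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # Magnitude spectrum of every frame.
    mags = np.abs(np.fft.rfft(frames, axis=1))

    # Keep only increases in magnitude between consecutive frames,
    # then sum them over all frequency bins.
    increase = np.maximum(np.diff(mags, axis=0), 0.0)
    return increase.sum(axis=1)
```

Peaks in the returned curve mark candidate onsets; in the method described above, a trained network rather than a fixed threshold decides which peaks are note onsets.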
Keywords :
acoustic signal processing; audio signal processing; greedy algorithms; musical acoustics; neural nets; search problems; signal representation; smoothing methods; AMT; MIDI piano roll; acoustic musical signal; artificial neural network; audio segment; audio spectral flux; automatic music transcription; greedy search algorithm; note spectrograms; note templates; note temporal evolution; note-based spectrogram factorization method; piano music transcription modeling; piano notes; pitch identification stage; post-processing stage; smoothing operations; spectrogram similarity; symbolic musical representation; template dictionary; Acoustics; Dictionaries; Estimation; Heuristic algorithms; Multiple signal classification; Neural networks; Spectrogram; Automatic music transcription; multi-pitch estimation; onset detection; spectrogram factorization;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178005