Author_Institution :
Tagtraum Ind. Inc., Raleigh, NC, USA
Abstract :
In view of rapidly growing digital music collections and ubiquitous music consumption, the development of technologies for identifying, browsing, and managing audio content has become a major strand of research. In this context, audio identification (ID) systems, which identify audio recordings from short query audio clips, have become commercially relevant. In this paper, we take a closer look at a widely used audio ID system originally developed by Haitsma and Kalker and propose several modifications that yield significant improvements in retrieval speed and storage requirements. As the main contribution, we introduce a measure that establishes a connection between the temporal correlation of hash values (used for indexing) and their ability to survive in the presence of noise and signal distortions. Based on this measure, we improve the overall performance of the audio ID system by means of four strategies. First, we change the way fingerprints (audio features) are generated to increase their reliability. Second, by prioritizing more reliable hash values when searching for reference entries, we speed up retrieval by a factor of almost seven. Third, by enlarging the query fingerprint, we increase our chances of identifying reliable hash values. Fourth, by indexing only the most reliable hashes, thus applying a sub-sampling strategy, we lower the server-side storage requirements by a factor of ten.
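The interplay between hash generation, temporal correlation, and sub-sampling can be illustrated with a minimal Python sketch. It is not the authors' implementation. The bit derivation follows the band-energy differencing commonly attributed to the Haitsma and Kalker scheme. The stability score is a hypothetical stand-in for the reliability measure, which the abstract does not define: it assumes that a hash sharing many bits with its predecessor is more likely to survive distortion. All function names and the `keep_fraction` parameter are illustrative.

```python
from typing import List, Sequence, Tuple


def sub_fingerprints(energies: Sequence[Sequence[float]]) -> List[int]:
    """Derive one hash per frame transition from band energies.

    energies[n][m] is the energy of band m in frame n. Each bit is the sign
    of the band-to-band energy difference, differenced again along time.
    """
    hashes = []
    for n in range(1, len(energies)):
        prev, cur = energies[n - 1], energies[n]
        value = 0
        for m in range(len(cur) - 1):
            diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            value = (value << 1) | (1 if diff > 0 else 0)
        hashes.append(value)
    return hashes


def temporal_stability(hashes: Sequence[int], n_bits: int) -> List[float]:
    """Hypothetical reliability proxy (an assumption, not the paper's measure).

    Scores each hash by the fraction of bits it shares with the preceding
    hash. The first hash has no predecessor and is scored 0.0.
    """
    scores = [0.0]
    for a, b in zip(hashes, hashes[1:]):
        differing = bin(a ^ b).count("1")
        scores.append(1.0 - differing / n_bits)
    return scores


def select_for_index(
    hashes: Sequence[int], scores: Sequence[float], keep_fraction: float = 0.1
) -> List[Tuple[int, int]]:
    """Sub-sample: keep only the highest-scoring hashes as (frame, hash) pairs."""
    keep = max(1, round(len(hashes) * keep_fraction))
    ranked = sorted(range(len(hashes)), key=lambda i: scores[i], reverse=True)
    return sorted((i, hashes[i]) for i in ranked[:keep])
```

With `keep_fraction=0.1`, only about one tenth of the hashes would be stored, which mirrors the factor-of-ten storage reduction in spirit. The same scores could in principle also order the lookups on the query side, so that the hashes assumed to be most reliable are tried first.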
Keywords :
audio signal processing; content-based retrieval; music; sampling methods; audio ID system; audio content browsing; audio content identification; audio content management; audio features; audio recordings; digital music collections; hash values; index-based audio identification; query fingerprint; retrieval speed; server side storage requirements; short query audio clips; storage requirements; sub-sampling strategy; ubiquitous music consumption; Bit error rate; Databases; Distortion measurement; Monitoring; Robustness; Standards; Audio identification; fingerprint; indexing