A robust speech/music discriminator for switched audio coding

Author

Guillaume Fuchs

Author_Institution

Fraunhofer Institut fü

fYear

2015

Firstpage

569

Lastpage

573

Abstract

Switching between speech coding and generic audio coding schemes has recently proven very efficient for coding a wide range of audio material at low bit rates. However, this approach relies heavily on a robust classification of the input signal. The aim of the paper is to design a reliable speech and music discriminator (SMD) for such an application. Particular attention was paid to achieving a good tradeoff between accuracy, reactivity, and stability of the decision while keeping delay and complexity reasonably low. To this end, short-term and long-term features are separated and fed to two different classifiers. The two classifier outputs are then combined into a final decision using hysteresis. Objective measures show that a more reliable switching decision is achievable. The SMD was successfully implemented in MPEG Unified Speech and Audio Coding (USAC), where it allows the codec to deliver unprecedented audio quality.
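
The combination step described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the weighted average, the thresholds `upper` and `lower`, the `weight` parameter, and the function name `combine_with_hysteresis` are all assumptions chosen for illustration. The sketch only shows how hysteresis can stabilize a speech/music decision built from two per-frame classifier scores.

```python
def combine_with_hysteresis(short_scores, long_scores,
                            weight=0.5, upper=0.6, lower=0.4,
                            initial="music"):
    """Merge two per-frame speech scores in [0, 1] into a stable decision.

    All parameters are hypothetical; the abstract does not state how the
    classifier outputs are weighted or which thresholds are used.
    """
    state = initial
    decisions = []
    for s, l in zip(short_scores, long_scores):
        # Assumed fusion rule: weighted average of the two classifier outputs.
        p = weight * s + (1.0 - weight) * l
        # Hysteresis: switch only when the score clearly crosses the band,
        # so values between `lower` and `upper` keep the previous decision.
        if state == "music" and p > upper:
            state = "speech"
        elif state == "speech" and p < lower:
            state = "music"
        decisions.append(state)
    return decisions


scores = [0.2, 0.55, 0.7, 0.5, 0.45, 0.3]
result = combine_with_hysteresis(scores, scores)
```

With identical score streams, the decision moves to speech only once the score exceeds `upper` and returns to music only once it falls below `lower`. Scores inside the band leave the decision unchanged, which is the stability property the abstract attributes to the hysteresis.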

Keywords

"Speech","Speech coding","Switches","Audio coding","Delays","Feature extraction"

Publisher

IEEE

Conference_Title

2015 23rd European Signal Processing Conference (EUSIPCO)

Electronic_ISBN

2076-1465

Type

conf

DOI

10.1109/EUSIPCO.2015.7362447

Filename

7362447

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3715899