Regional Information Center for Science and Technology - Advances in speech and audio processing and coding

Abstract:

This plenary session covers advances in speech processing research, with an emphasis on speech and audio coding methods. We will discuss the fundamental principles, techniques, and algorithms used in current coding applications, including a summary of codecs for telecommunication standards. The session will start with a discussion of basic speech representation methods, the performance measures used to evaluate coded speech, and the role of the standards. Brief algorithm descriptions include ADPCM, sub-band coding, adaptive transform coding, sinusoidal transform coding (STC), linear predictive coding (LPC), and analysis-by-synthesis LPC (sparse excitation, code-excited LPC, and ACELP). The presentation will feature audio and computer demonstrations of recent speech coding standards, including voice-over-IP algorithms. The plenary session will also cover wideband audio standards such as the MPEG audio layers and related formats (e.g., MP3, AAC). Recent algorithms will also be described, including Variable-Rate Multimode Wideband (VMR-WB), Speex, G.722.1, Ogg Vorbis (2012), iLBC, CELT, SILK, Opus (2013), and Qualcomm wideband 5G codecs. At the end of the session, we will briefly cover recent applications that use voice features to detect speech pathologies, and discuss how long-term speech parameters can be used as predictors of other conditions such as tremors and Alzheimer's disease.