Regional Information Center for Science and Technology - A framework for parametric singing voice analysis/synthesis

DocumentCode :

2802323

Title :

A framework for parametric singing voice analysis/synthesis

Author :

Kim, Youngmoo E.

Author_Institution :

Media Lab., MIT, Cambridge, MA, USA

Year :

2003

Date :

19-22 Oct. 2003

Firstpage :

123

Lastpage :

126

Abstract :

The singing voice is the most variable and flexible of musical instruments. All voices are capable of producing the common phonemes necessary for language understanding and communication, yet each voice possesses distinctive qualities that are seemingly independent of phonemes and words. The unique acoustic qualities of an individual singer's voice arise from a combination of innate physical factors (e.g., vocal tract and vocal fold physiology) and time-varying characteristics of performance (e.g., pronunciation and musical expression). This research introduces a framework for singing voice analysis/synthesis that takes both physical and expressive factors into account by estimating source-filter voice model parameters (representing the physiology) and modeling the dynamic behavior of these features over time using a hidden Markov model (to represent aspects of expression). Historically, source and filter model features have been calculated independently, but here they are estimated jointly for better modeling of source-filter dependencies common in singing. Additionally, the vocal tract filter is estimated on a warped frequency scale, which more accurately reflects the frequency sensitivity of human perception. This framework has many possible applications, including singing voice analysis/synthesis and singer identification.
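
The abstract does not say which frequency warping the paper uses. As a rough illustration only, the sketch below assumes the warping produced by a first-order all-pass filter, a common way to approximate a perceptual frequency scale. The function name `warp_frequency` and the coefficient `lam` are hypothetical and are not taken from the paper.

```python
import math


def warp_frequency(omega: float, lam: float) -> float:
    """Map a normalized angular frequency through a first-order all-pass warp.

    omega: frequency in radians per sample, in the range [0, pi].
    lam:   all-pass coefficient with -1 < lam < 1 (assumed, not from the paper).
           lam = 0 leaves the axis unchanged; lam > 0 stretches low
           frequencies, giving them more resolution.
    """
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between -1 and 1")
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))


if __name__ == "__main__":
    # Print how a few evenly spaced frequencies move under an example lam.
    example_lam = 0.5  # illustrative value only
    for k in range(5):
        w = k * math.pi / 4
        print(f"{w:.4f} -> {warp_frequency(w, example_lam):.4f}")
```

With a positive `lam`, the endpoints 0 and pi stay fixed while frequencies in between are pushed upward on the warped axis, so a filter fitted on that axis spends more of its resolution on the low-frequency region.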

Keywords :

audio signal processing; hidden Markov models; parameter estimation; speaker recognition; speech; hidden Markov model; innate physical factors; musical expression; pronunciation; singer identification; singing voice analysis; singing voice synthesis; source-filter voice model parameter estimation; time-varying characteristics; vocal fold physiology; vocal tract physiology; Filters; Frequency estimation; Hidden Markov models; Humans; Instruments; Lips; Physiology; Shape; Speech analysis; Tongue

Language :

English

Publisher :

IEEE

Conference_Title :

2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Print_ISBN :

0-7803-7850-4

Type :

conf

DOI :

10.1109/ASPAA.2003.1285835

Filename :

1285835

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2802323